Memory Intelligence Agent (MIA) uses a Manager-Planner-Executor architecture to move AI from passive record-keeping to active strategy evolution.
Memory Intelligence Agent (MIA) is a memory framework jointly developed by East China Normal University and the Shanghai Institute of Intelligence, designed to address the memory bloat and reasoning bottlenecks common in deep research agents (DRAs). The framework turns an agent's memory system from plain data storage into experience learning: through a three-part Manager-Planner-Executor architecture, the agent can evolve autonomously on complex tasks and reason more precisely.
Core Architecture and How It Works
MIA replaces the traditional approach of piling up long context with a structured division of labor:
- Manager: a non-parametric memory system that stores and compresses historical search trajectories, eliminating redundant information to keep the memory store from bloating.
- Planner: a parametric memory agent that formulates a search strategy for each question and refines it on the fly through continual test-time learning during inference.
- Executor: a precise execution tool that searches for and analyzes information according to the Planner's blueprint, running tasks in a ReAct-style loop (think → act → observe).
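The three-way split above can be sketched in code. This is a minimal illustration under stated assumptions: the class and method names below are invented for the sketch, not the paper's actual API, and the "compression" and "retrieval" are toy stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class Manager:
    """Non-parametric memory: stores compressed, deduplicated trajectories."""
    store: list = field(default_factory=list)

    def add(self, trajectory: str) -> None:
        summary = trajectory[:200]       # stand-in for real compression
        if summary not in self.store:    # drop exact duplicates
            self.store.append(summary)

    def retrieve(self, query: str, k: int = 3) -> list:
        # Toy relevance score: count words shared with the query.
        scored = sorted(self.store,
                        key=lambda m: -len(set(query.split()) & set(m.split())))
        return scored[:k]

@dataclass
class Planner:
    """Parametric memory agent: turns question + past cases into a plan."""
    def plan(self, question: str, experiences: list) -> list:
        steps = [f"search: {question}"]
        steps += [f"reuse workflow from: {e}" for e in experiences]
        steps.append("synthesize answer")
        return steps

@dataclass
class Executor:
    """Follows the plan step by step (ReAct-style in the real system)."""
    def run(self, steps: list) -> str:
        return "\n".join(f"done: {s}" for s in steps)
```

The point of the split is that each component has one job: the Manager never plans, and the Planner never touches tools directly.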
Memory Evolution and Collaboration
MIA goes beyond storing information, emphasizing how memory evolves and how the components collaborate:
- Bidirectional memory conversion: a two-way loop between non-parametric memory (explicitly stored examples and workflows) and parametric memory (the model's learned weights) keeps memory dynamically up to date.
- Alternating reinforcement learning (alternating RL): RL breaks the deadlock of static memory, ensuring the Manager, Planner, and Executor work in concert as a system capable of self-reflection and judgment.
- Self-correction: when execution fails or stalls, the Executor reports back, the Planner reflects and updates the plan, and a retry happens only when necessary, improving resource utilization.
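The self-correction loop described above can be sketched as follows. `plan_fn` and `execute_fn` are hypothetical stand-ins for the Planner and Executor interfaces; the real system's reflection step is learned, not a simple string round-trip.

```python
def solve_with_reflection(question, plan_fn, execute_fn, max_retries=2):
    """Self-correction loop sketch: re-plan only when execution fails.

    plan_fn(question, feedback) -> plan
    execute_fn(plan) -> (ok, result)
    Both callables are illustrative, not the paper's interface.
    """
    feedback = None
    for _ in range(max_retries + 1):
        plan = plan_fn(question, feedback)
        ok, result = execute_fn(plan)
        if ok:
            return result        # success: no wasted retries
        feedback = result        # failure report feeds the next reflection
    return result                # give up after max_retries re-plans
```

Retrying only on failure, with the failure report as input to the next plan, is what distinguishes this from blind retry loops.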
Technical Breakthroughs and Performance
Experiments show MIA achieving significant gains across multiple benchmarks, demonstrating its effectiveness on complex tasks:
- Performance gains: MIA sets a new state of the art (SOTA) among memory agents across multiple benchmarks, improving average accuracy by about 5.5%, with gains of up to 9.1% on complex tasks such as multi-hop questions.
- Small models punching above their weight: with MIA, a 7B-parameter Qwen-2.5-VL model matches or even surpasses some closed-source large models, showing that the framework can unlock the potential of small models.
- Cross-domain capability: MIA performs strongly on both text and multimodal tasks and keeps improving through test-time learning, addressing the cluttered memory and reasoning failures that plague traditional agents.
Critique of Existing Approaches
The team identifies fundamental bottlenecks in current deep research agents:
- Memory bloat: agents get trapped in vast amounts of long text, scattering their attention and driving up maintenance costs.
- No strategy learning: most existing agents only remember what the result was, while entirely ignoring how the result was achieved.
- Ineffective planning: in traditional architectures, the planner relies on ineffective memory retrieval and incomplete context prompts, leaving the executor to do research unprepared. MIA aims to end this passive record-keeper mode and turn the agent into an active strategist.
I’m noticing some really big shifts in how AI models start to handle memory. @ECNUER and others introduced Memory Intelligence Agent (MIA), which highlights the importance of storing the whole problem-solving journey – how to perform tasks.
— Ksenia_TuringPost (@TheTuringPost) April 9, 2026
It turns memory into something closer…
1. When a new question comes in, the system searches its memory for similar past cases, prioritizing proven successes while also sampling rarer ones to avoid being too narrow. It keeps both successful and failed examples.
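This retrieval step might look like the sketch below, assuming a simple list-of-dicts memory. The `success` and `uses` fields, the ratio, and the scoring are all assumptions for illustration, not MIA's actual retrieval policy.

```python
import random

def retrieve_experiences(memory, k=4, explore_ratio=0.25, seed=0):
    """Mostly exploit proven successes, but reserve a slot or two for
    rarer or failed cases so retrieval does not become too narrow.

    memory: list of {"case": ..., "success": bool, "uses": int} dicts
    (field names are assumptions for this sketch).
    """
    rng = random.Random(seed)
    # Proven successes first, most-used at the top.
    successes = sorted((m for m in memory if m["success"]),
                       key=lambda m: -m["uses"])
    # "Rare" pool: failures and never-used entries.
    rare = [m for m in memory if not m["success"] or m["uses"] == 0]
    n_explore = max(1, int(k * explore_ratio)) if rare else 0
    picked = successes[:k - n_explore]
    picked += rng.sample(rare, min(n_explore, len(rare)))
    return picked
```

Keeping failed examples in the pool is deliberate: a failure retrieved at plan time tells the Planner what not to repeat.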
2. Then the Planner takes the question and retrieved experiences and creates a step-by-step plan (like a chain-of-thought, but structured as a strategy).
It keeps improving during use, while solving tasks, through reinforcement learning.
3. Finally, the Executor follows this plan, interacting with tools (like search), collecting information, and trying to solve the task. This happens in a ReAct-style loop: think → act → observe → repeat.
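The think → act → observe loop might look like this in outline. `think` and `act` are hypothetical callbacks standing in for the model and its tools; MIA's real loop is driven by an LLM, not plain functions.

```python
def react_loop(question, think, act, max_steps=5):
    """ReAct-style loop sketch: think -> act -> observe -> repeat.

    think(question, history) returns either an action string or a
    ("answer", text) tuple to stop; act(action) returns an observation.
    Interfaces are illustrative only.
    """
    history = []
    for _ in range(max_steps):
        decision = think(question, history)
        if isinstance(decision, tuple) and decision[0] == "answer":
            return decision[1]           # model decided it is done
        observation = act(decision)      # e.g. run a search tool
        history.append((decision, observation))
    return None                          # no answer within the step budget
```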
Importantly, there’s a feedback loop. After execution, the Executor…
Another important part of MIA is compression of the experience: long trajectories become structured summaries of workflows, images become captions, and redundant memory is replaced or merged.
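A toy sketch of that compression step (MIA presumably uses learned summarization; this just truncates and merges duplicates to show the shape of the operation, with `max_len` as an invented knob):

```python
def compress_memory(trajectories, max_len=80):
    """Turn long trajectories into short summaries and merge duplicates.

    Returns a dict mapping each summary to how many raw entries
    it absorbed. Truncation at a word boundary stands in for real
    learned summarization.
    """
    seen = {}
    for t in trajectories:
        if len(t) <= max_len:
            summary = t
        else:
            summary = t[:max_len].rsplit(" ", 1)[0] + "…"
        seen[summary] = seen.get(summary, 0) + 1   # merge redundant entries
    return seen
```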
MIA uses two types of memory at the same time:
- Non-parametric memory → explicit storage (retrieved examples, workflows)
- Parametric memory → knowledge inside the model (learned weights)
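The two-way conversion between these stores could be sketched as below. `train_step` and `model_generate` are hypothetical stand-ins for an RL/fine-tuning update and model inference; the real transfer is a learned process, not a list append.

```python
def consolidate(non_parametric_store, train_step):
    """Non-parametric -> parametric: replay stored workflows as
    training examples (train_step is a stand-in for an RL/SFT update)."""
    for example in non_parametric_store:
        train_step(example)

def externalize(model_generate, question, non_parametric_store):
    """Parametric -> non-parametric: write a successful model-generated
    workflow back into explicit storage for later retrieval."""
    workflow = model_generate(question)
    if workflow not in non_parametric_store:
        non_parametric_store.append(workflow)
    return workflow
```

Running both directions in alternation is what gives the memory its update loop: explicit examples shape the weights, and the weights produce new examples worth keeping.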
Results clearly show the advantages of this workflow:
• MIA becomes new SOTA among memory agents: +5.5 avg gain, ~53.6 accuracy
• Achieves up to +9.1 on complex tasks like multi-hop/in-house
• Small 7B model matches / beats larger closed models
• MIA is also strong on…
Paper: https://t.co/NjZJ2yU691
Code: https://t.co/RfLERnpVxs
Model: https://t.co/3LzkjAgS20
