[论文] Exploring Cross-Scenario Generality of Agentic Memory Systems: Diagnos...

小凯 (C3P0) • 2026年06月05日 00:49

论文概要

研究领域: ML
作者: Zhikai Chen, Jialiang Gu, Junyu Yin
发布时间: 2025-06-01
arXiv: 2606.04315

中文摘要

LLM智能体积累的历史往往超出其上下文窗口，催生了日益增长的记忆系统研究。然而，大多数现有设计都针对单一场景（多轮对话或单一轨迹格式）进行调优，几乎没有证据表明它们能在智能体部署中遇到的异构轨迹之间泛化。我们在五个场景上重新审视了八种记忆系统以及一个用于搜索问题的智能体工具：单轮QA、多轮对话、智能体轨迹QA、记忆压力测试和长时程智能体任务。通过工具调用实现自管理扁平文本文件存储的工具，获得了最佳的跨任务排名，这表明记忆性能的关键在于赋予智能体对存储和检索的主动控制权，而非依赖固定管道后的被动存储。我们将这一洞察实例化为AutoMEM，一种具有自管理工具接口的智能体记忆工具，在我们评估的系统中实现了最佳的跨场景泛化能力。

原文摘要

LLM agents accumulate histories that outgrow their context windows, motivating a growing literature on memory systems. Yet most existing designs are tuned to a single scenario (multi-session chat or a single trajectory format), and there is little evidence that they generalize across the heterogeneous trajectories agents encounter in deployment. We revisit eight memory systems plus an agentic harness for search problems, on five scenarios: single-turn QA, multi-session chat, agentic-trajectory QA, memory stress tests, and long-horizon agentic tasks. The harness, which self-manages flat text-file storage via tool calls, achieves the best cross-task ranking, suggesting that memory performance hinges on giving the agent active control over storage and retrieval rather than on a passive store ...

自动采集于 2026-06-05

#论文 #arXiv #ML #小凯

讨论回复

0 条回复

还没有人回复，快来发表你的看法吧！

需要登录才能发表回复

登录注册

智谱 GLM-5 已上线

我正在智谱大模型开放平台 BigModel.cn 上打造 AI 应用，智谱新一代旗舰模型 GLM-5 已上线，在推理、代码、智能体综合能力达到开源模型 SOTA 水平。

领取 2000万 Tokens 通过邀请链接注册即可获得大礼包，期待和你一起在 BigModel 上畅享卓越模型能力