Paper Overview
Research area: NLP Authors: Ziyu Guo, Rain Liu, Xinyan Chen, Pheng-Ann Heng Published: 2026-05-14 arXiv: 2605.15198
Abstract (Translated from Chinese)
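Existing frameworks for LLM-based agent architecture each describe systems from a single perspective: industry guides focus on execution topology (how data flows), while cognitive science surveys focus on cognitive function (what the agent does). Neither axis alone can distinguish architecturally distinct systems: the same orchestrator-worker topology can implement plan-and-execute, hierarchical delegation, or adversarial verification, three patterns that differ fundamentally in failure modes and design trade-offs. This paper proposes a two-dimensional taxonomy that combines (1) a cognitive function axis with seven categories (context engineering, memory, reasoning, action, reflection, collaboration, governance) and (2) an execution topology axis with six structural archetypes (chain, routing, parallel, orchestration, loop, hierarchy). The resulting 7×6 matrix identifies 27 named patterns, 13 of which are newly named. We demonstrate orthogonality through systematic cross-axis analysis, define eight representative patterns in detail, and validate descriptive coverage in four real-world domains. Cross-domain analysis yields five heuristics for pattern selection, clarifying the relationship between environmental constraints and architectural choices. The framework provides a principled, framework-neutral, and model-agnostic vocabulary for AI agent architecture design.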
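To make the two-axis idea concrete, here is a minimal Python sketch that models each axis as an enum and each matrix cell as a (function, topology) pair. The class names, member names, and the helper `cells_for_topology` are illustrative assumptions, not identifiers from the paper. The abstract does not say which cells the 27 named patterns occupy, so the sketch leaves the cells unlabeled.

```python
from enum import Enum
from itertools import product


class CognitiveFunction(Enum):
    """The seven cognitive-function categories listed in the abstract."""
    CONTEXT_ENGINEERING = "context engineering"
    MEMORY = "memory"
    REASONING = "reasoning"
    ACTION = "action"
    REFLECTION = "reflection"
    COLLABORATION = "collaboration"
    GOVERNANCE = "governance"


class ExecutionTopology(Enum):
    """The six structural archetypes listed in the abstract."""
    CHAIN = "chain"
    ROUTING = "routing"
    PARALLEL = "parallel"
    ORCHESTRATION = "orchestration"
    LOOP = "loop"
    HIERARCHY = "hierarchy"


# Every cell of the 7x6 matrix is a (function, topology) pair.
CELLS = list(product(CognitiveFunction, ExecutionTopology))


def cells_for_topology(topology):
    """Hypothetical helper: all cells that share one execution topology."""
    return [cell for cell in CELLS if cell[1] is topology]


if __name__ == "__main__":
    print(len(CELLS))  # 7 * 6 = 42 cells
    # One topology spans seven cells, one per cognitive function, which is
    # why topology alone cannot tell architecturally distinct systems apart.
    for function, topology in cells_for_topology(ExecutionTopology.ORCHESTRATION):
        print(f"{function.value} x {topology.value}")
```

Keying a pattern on the pair rather than on either axis alone mirrors the abstract's argument: systems that share an orchestrator-worker topology can still land in different cells.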
Original Abstract
Visual reasoning, often interleaved with intermediate visual states, has emerged as a promising direction in the field. A straightforward approach is to directly generate images via unified models during reasoning, but this is computationally expensive and architecturally non-trivial. Recent alternatives include agentic reasoning through code or tool calls, and latent reasoning with learnable hidden embeddings. However, agentic methods incur context-switching latency from external execution, while latent methods lack task generalization and are difficult to train with autoregressive parallelization. To combine their strengths while mitigating their limitations, we propose ATLAS, a framework in which a single discrete 'word', termed as a functional token, serves both as an agentic operation...
--- *Automatically collected on 2026-05-15*
#论文 #arXiv #NLP #小凯