
Architecting the Agent Mind

✨步子哥 (steper) · December 31, 2025, 08:19
Context Engineering: Sessions & Memory
Based on "Context Engineering: Sessions, Memory" (Nov 2025)


Explore the engineering principles behind GenAI context management. Navigate the trade-offs between ephemeral Sessions and persistent Memory to build robust, hallucination-resistant agents.

Sessions

Short-term interaction history. Fast, expensive, and limited by context windows.

💾

Memory

Long-term storage. Structured, scalable, and essential for personalization.

Sessions & Context Windows

Sessions capture the immediate "Short-term Memory" of an agent. The critical engineering challenge here is managing the Context Window. As conversations grow, retaining the full history becomes prohibitively expensive and prone to the "Lost in the Middle" phenomenon.

Interact below to compare strategies for handling long context conversations.

Context Strategy

Engineering Note

Retaining the full history causes token spend to grow roughly quadratically over a conversation, since every new turn re-sends everything that came before. Optimization is mandatory for production systems.

[Chart: Token Usage vs. Conversation Turns, comparing Token Cost and Context Quality across strategies]
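For readers who want something concrete, here is a minimal Python sketch of three common session-context strategies. It is illustrative only (the report does not prescribe an implementation), and `summarize_text` is a hypothetical stand-in for an LLM summarization call.

```python
# Minimal sketch of three session-context strategies (illustrative only).
from typing import Callable

Turn = dict  # e.g. {"role": "user", "content": "..."}

def full_history(turns: list[Turn]) -> list[Turn]:
    """Send everything: highest fidelity, cost grows with every turn."""
    return turns

def sliding_window(turns: list[Turn], keep_last: int = 10) -> list[Turn]:
    """Keep only the most recent turns: bounded cost, old context is lost."""
    return turns[-keep_last:]

def summarize_plus_window(
    turns: list[Turn],
    keep_last: int = 6,
    summarize_text: Callable[[str], str] = lambda t: t[:200],  # placeholder for an LLM call
) -> list[Turn]:
    """Compress older turns into one summary turn, keep recent turns verbatim."""
    old, recent = turns[:-keep_last], turns[-keep_last:]
    if not old:
        return recent
    summary = summarize_text("\n".join(t["content"] for t in old))
    return [{"role": "system", "content": f"Summary of earlier turns: {summary}"}] + recent
```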

💾 The Memory Bank

To move beyond transient sessions, agents require Memory. The report identifies several distinct types of memory modeled after human cognition. Each requires a specific storage architecture.

Select a memory type to reveal its architectural requirements.

Episodic Memory

🕰️

Past experiences and events.

Semantic Memory

📚

Facts, concepts, and world knowledge.

Procedural Memory

🛠️

Implicit knowledge of "how to do" things.

🕰️

Episodic Memory

Experience Based
Definition

Stores sequences of events and interactions. It allows the agent to recall what happened in previous turns or sessions, providing continuity.

Storage Architecture

Vector Database + Time-Decay Weighting

Often retrieved via semantic similarity search biased by recency.
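A minimal sketch of what recency-biased retrieval can look like, assuming a generic vector store; the exponential half-life weighting and the `embed_sim` callback are illustrative choices, not an API prescribed by the report.

```python
# Illustrative recency-biased retrieval over episodic memories (assumed schema).
import time

def decayed_score(similarity: float, stored_at: float,
                  now: float | None = None, half_life_days: float = 14.0) -> float:
    """Combine semantic similarity with an exponential time-decay weight."""
    now = now or time.time()
    age_days = (now - stored_at) / 86_400
    return similarity * 0.5 ** (age_days / half_life_days)  # 1.0 when fresh, halves per half-life

def retrieve_episodes(query_vec, memories, embed_sim, top_k=5):
    """memories: iterable of (vector, stored_at_timestamp, payload) tuples."""
    scored = [
        (decayed_score(embed_sim(query_vec, vec), ts), payload)
        for vec, ts, payload in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [payload for _, payload in scored[:top_k]]
```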

Example in Action

"User: 'Remember when we discussed Python decorators?'
Agent: 'Yes, in our session last Tuesday, we covered the @property decorator...'"

Memory Generation Pipeline

Memories don't just appear; they must be manufactured. The report outlines a pipeline to transform raw unstructured text into structured, queryable data.

📄

1. Raw Input

User prompts & conversation history.

⛏️

2. Extraction

Identify entities, intent, and facts.

🔄

3. Consolidation

De-duplicate & merge with existing graphs.

🗄️

4. Storage

Vector Embeddings & Knowledge Graphs.

Select any stage above to see the specific engineering tasks involved at that step.
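The pipeline can also be sketched end to end in code. This is an illustrative outline only; `extract_facts` and `embed` are hypothetical stand-ins for the extraction and embedding models.

```python
# Illustrative memory-generation pipeline: raw input -> extraction -> consolidation -> storage.
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    facts: dict[str, str] = field(default_factory=dict)        # key -> canonical fact
    vectors: dict[str, list[float]] = field(default_factory=dict)

def run_pipeline(raw_turns: list[str], store: MemoryStore, extract_facts, embed) -> MemoryStore:
    # 1. Raw input: user prompts & conversation history.
    text = "\n".join(raw_turns)
    # 2. Extraction: identify entities, intent, and facts.
    candidates = extract_facts(text)            # e.g. {"user.language": "Python"}
    # 3. Consolidation: de-duplicate and merge with what is already stored.
    for key, value in candidates.items():
        if store.facts.get(key) != value:       # keep only new or changed facts
            store.facts[key] = value
            # 4. Storage: vector embeddings (a knowledge-graph edge could be written here too).
            store.vectors[key] = embed(f"{key}: {value}")
    return store
```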

Context Engineering

An interactive synthesis of the paper "Context Engineering: Sessions, Memory" by Kimberly Milam and Antonio Gulli (Google, Nov 2025). This application demonstrates the architectural patterns for building stateful GenAI agents.

Topics: Sessions · Memory · Cognitive Architecture
Key References
  • Google Vertex AI Documentation
  • LangChain / LangGraph Memory Concepts
  • Generative AI Agent Architecture

Discussion Replies

1 reply
✨步子哥 (steper) #1
12-31 08:23

1. Does the "scaling law" for procedural memory actually hold?

Conclusion: there is no simple "exponential" growth; what emerges instead is an S-curve and a signal-to-noise trade-off.

The report does not support the view that "more memory = exponentially greater capability". On the contrary, by making the case for Context Engineering, it exposes the risks of adding memory indiscriminately.

From "piling up volume" to the signal-to-noise constraint:
The report points out that agents face the "Lost in the Middle" phenomenon. When procedural memory (the "how to do it" knowledge, such as few-shot examples and tool-call templates) simply grows linearly, retrieval accuracy and the model's effective attention can actually degrade.
Engineering reality: feed an agent 10,000 mediocre tool-usage examples and it will not be 100 times more capable than with 100 examples; retrieval distraction may instead paralyze its decision-making.

Threshold effect:
The value of procedural memory lies in coverage. Once the memory bank covers roughly 80% of high-frequency scenarios, the agent's performance changes qualitatively (from "unusable" to "usable and reliable"). Beyond that point the gains diminish at the margin, unless a Reflection mechanism is introduced, i.e. the Consolidation process the report describes, which turns failed attempts into corrected procedural memories.
The real "exponential" lever:
The real inflection point lies not in the quantity of memory but in its degree of structure. The report emphasizes converting unstructured text into a Knowledge Graph or well-tuned vectors. Only when procedural memory is highly structured (for example, natural-language instructions compiled into precise API chains, as sketched below) can the agent display the efficient, "intuition-like" reasoning that complex tasks demand.
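To make that "structuring" point concrete, a procedural memory entry might be stored as an explicit API chain rather than prose. The schema and tool names below are hypothetical, not taken from the report.

```python
# Illustrative schema for a structured procedural memory entry (assumed, not from the report).
from dataclasses import dataclass

@dataclass
class ApiStep:
    tool: str                 # name of the tool / endpoint to call
    args_template: dict       # arguments, with {placeholders} bound at run time

@dataclass
class Procedure:
    name: str
    trigger: str              # short description used for retrieval matching
    steps: list[ApiStep]

# "How to file an expense report" as an API chain instead of free text:
file_expense = Procedure(
    name="file_expense_report",
    trigger="user wants to submit an expense / reimbursement",
    steps=[
        ApiStep("receipts.ocr", {"image_url": "{receipt_url}"}),
        ApiStep("finance.create_report", {"amount": "{ocr.total}", "category": "{ocr.category}"}),
        ApiStep("finance.submit", {"report_id": "{create_report.id}", "approver": "{user.manager}"}),
    ],
)
```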

2. With superhuman procedural memory, where is the optimal human-agent division of labor?

Once an agent possesses superhuman procedural memory (it remembers every API's usage, every historical best practice, and every pitfall ever hit), the human is no longer the "operator" but the "definer" and the "gardener".

Based on the cognitive architectures and memory lifecycle in the report, the optimal boundary should be drawn as follows:

A. Boundary 1: move up from "execution" to "intent definition"

Agent (execution layer): owns the invocation and composition of every deterministic workflow. Backed by procedural memory, it can retrieve, within milliseconds, a complex SQL template for a specific database or the debugging path for a specific error.
Human (intent layer): defines "what counts as success" and resolves ambiguity. However vast its memory, the agent lacks value judgment. The human injects the constraints of the task at hand through the prompt (for example, safety first versus speed first), something procedural memory cannot decide dynamically.

B. Boundary 2: shift from "authoring" to "Memory Gardening"

This is the most forward-looking division of labor implied by the report:

Agent (material production): generates large volumes of interaction data, attempted paths, and intermediate results inside Sessions.
Human (consolidation and pruning): steps into the "Extraction & Consolidation" stage the report describes, in three ways (a sketch follows this list):
Review: decide which ad-hoc flashes of insight deserve to be solidified into long-term procedural memory.
Pruning: remove outdated or inefficient procedural memories so the agent does not lock into path dependence.
Alignment: when the agent retrieves a "shortcut" from memory that violates policy, the human vetoes it and corrects the memory's weight.
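One possible shape for this gardening pass, with an assumed entry schema (the report does not prescribe field names):

```python
# Illustrative memory-gardening pass over procedural memory entries (assumed schema).
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    id: str
    content: str
    last_used_days: int      # days since the procedure last succeeded
    success_rate: float      # fraction of runs that succeeded
    weight: float = 1.0      # retrieval weight
    reviewed: bool = False   # human confirmed it is worth keeping long term
    vetoed: bool = False     # human flagged it as a policy-violating shortcut

def garden(entries: list[MemoryEntry],
           stale_after_days: int = 90, min_success: float = 0.5) -> list[MemoryEntry]:
    kept = []
    for e in entries:
        if e.vetoed:                      # Alignment: vetoed shortcuts are removed outright
            continue
        if e.last_used_days > stale_after_days or e.success_rate < min_success:
            continue                      # Pruning: drop stale or consistently failing procedures
        if not e.reviewed:                # Review: unreviewed entries are kept but down-weighted
            e.weight *= 0.5
        kept.append(e)
    return kept
```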

C. Boundary 3: handling the "long tail" and "black swans"

Agent: handles the routine ~95% of tasks in the middle of the distribution. Its procedural memory is a faithful replay of past experience.

Human: handles the remaining ~5% of entirely novel, heterogeneous tasks. When a scenario is completely absent from the memory bank, the agent degrades into random trial-and-error or refuses to act; at that point the human must take control (human-in-the-loop) and, through a single new demonstration, create the new procedural memory born of that moment (a fallback sketch follows).
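A hedged sketch of that escalation logic, assuming a similarity-scored retrieval result and hypothetical `ask_human_for_demo` / `store_procedure` callbacks:

```python
# Illustrative human-in-the-loop fallback when procedural memory has no good match.
CONFIDENCE_FLOOR = 0.75   # below this retrieval score, do not trust procedural memory

def act(task, retrieve, execute, ask_human_for_demo, store_procedure):
    """All callbacks are hypothetical: retrieve(task) -> (procedure, score)."""
    procedure, score = retrieve(task)
    if score >= CONFIDENCE_FLOOR:
        return execute(procedure, task)   # routine case: replay the known procedure
    demo = ask_human_for_demo(task)       # novel scenario: escalate, human demonstrates once
    store_procedure(task, demo)           # consolidate the demonstration into procedural memory
    return execute(demo, task)
```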

In one sentence:
In the era of superhuman procedural memory, humans no longer teach the agent how to do things (How); instead they audit whether the agent's memory is correct (Truth) and define what should be done (What). We shift from being "full-stack engineers" to becoming "AI memory architects".