Vibe-Trading Memory 机制：架构与实现原理

✨步子哥 · 2026-05-12T14:45:57+00:00

## 1. 总览：双层记忆模型项目中的「Memory」不是单一组件，而是 **两条清晰分离的链路**： | 层级 | 类型 | 生命周期 | 主要职责 | |------|------|----------|----------| | **工作区记忆** | `WorkspaceMemory` | 单次 `Age

✨步子哥 (steper) • 2026年05月12日 14:45

1. 总览：双层记忆模型

项目中的「Memory」不是单一组件，而是 两条清晰分离的链路：

层级	类型	生命周期	主要职责
工作区记忆	`WorkspaceMemory`	单次 `AgentLoop.run()` 内（可随 `AgentLoop` 实例复用，但每轮 `run()` 会重置部分状态）	`run_dir`、工具调用计数；为上下文压缩提供可注入的「状态摘要」
持久记忆	`PersistentMemory`	跨进程、跨会话，落盘于用户目录	索引 + 多条目 Markdown 文件；系统提示中的「冻结快照」+ 每轮用户消息上的「自动召回」

二者在模块上的边界在 WorkspaceMemory 的模块文档中写得很明确：单次运行内的共享状态由 WorkspaceMemory 承担；跨会话由 src.memory.persistent.PersistentMemory 承担。

"""Workspace memory: shared state across tool calls within a single run.

Lightweight runtime state — survives within one AgentLoop.run() invocation only.
Cross-session persistence is handled by src.memory.persistent.PersistentMemory.
"""

2. 工作区记忆（WorkspaceMemory）

2.1 数据结构与职责

WorkspaceMemory 是一个轻量 @dataclass：

run_dir：当前这次 agent 运行的根目录（绝对路径字符串），供工具解析相对路径。
counters：按工具名聚合的调用次数（Dict[str, int]）。

核心 API：

increment(key)：每次工具执行结束后由 AgentLoop 调用，用于统计。
to_summary()：生成给模型看的短文本（run_dir + counters），注释中明确说明其用途之一是在上下文压缩后仍能让模型知道「在干什么」。

@dataclass
class WorkspaceMemory:
    """Shared workspace state between tools during a single agent run.
    ...
    """
    run_dir: Optional[str] = None
    counters: Dict[str, int] = field(default_factory=dict)

    def increment(self, key: str) -> int:
        ...
    def to_summary(self) -> str:
        """Generate a state summary for the LLM.
        ...
        This summary survives context compression and helps the LLM
        remember what it was working on.
        ...
        """

2.2 在 AgentLoop 中的生命周期

run_dir 的确定（在构造消息列表之前）
- 若 memory.run_dir 已存在且路径在磁盘上存在，则沿用。
- 否则通过 RunStateStore.create_run_dir 新建，并写回 self.memory.run_dir。
工具参数中的 run_dir 归一化
_normalize_tool_run_dir 使用 memory.run_dir 作为基准，把相对路径（如 "."、子目录名）解析为绝对路径，保证工具在一致根目录下读写。
计数更新
每次工具结果落盘后调用 _update_memory(tool_name) → self.memory.increment(tool_name)。
与 _auto_compact 的配合
压缩对话时保留原始 system 消息对象，但在压缩产生的「交接摘要」用户消息末尾追加 Current agent state:\n{state_summary}，其中 state_summary = self.memory.to_summary()。这样即使长对话被折叠，工作区状态仍有一条独立注入路径。

注意：初始 system 里的 {memory_summary} 是构造 build_messages 那一刻的快照；中间迭代的工具计数不会自动写回该条 system，除非触发 _auto_compact（会在压缩块中再次注入 to_summary()）。

相关代码位置：agent/src/agent/loop.py（run_dir 与 ContextBuilder 创建、_auto_compact 内 state_summary、_update_memory）。

3. 持久记忆（PersistentMemory）

3.1 设计目标与存储布局

零外部依赖：不依赖向量库或数据库，纯文件系统。
默认目录：~/.vibe-trading/memory/（常量 MEMORY_BASE）。
文件角色：
- MEMORY.md：人类可读的索引（列表形式，带链接与一行描述）。
- 多个 {memory_type}_{slug}.md：具体记忆条目，带 YAML-like frontmatter + 正文。

"""PersistentMemory: file-based cross-session memory, zero external dependencies.

Storage layout:
    ~/.vibe-trading/memory/
    +-- MEMORY.md          # Index (< 200 lines)
    +-- user_prefs.md      # Individual memory entries with YAML frontmatter

3.2 内存中的「快照」语义

PersistentMemory 在 __init__ 时调用 _load_snapshot()：若存在 MEMORY.md，读取全文后只保留前 MAX_INDEX_LINES（200）行作为 _snapshot，且之后不会在 add/remove 时更新该快照。

设计意图（源码注释）：

会话开始时把索引注入 system prompt，有利于 prompt 缓存稳定。
磁盘会即时反映 save/forget，但当前会话内的 snapshot 字符串不变；下一次新建 PersistentMemory() 会重新加载。

class PersistentMemory:
    """File-based persistent memory that survives across sessions.

    Design:
        - Frozen snapshot injected into system prompt at session start (preserves prompt cache).
        - Disk writes via add()/remove() update files immediately but do NOT change the snapshot.
        - Next session picks up the updated state.

3.3 条目模型与扫描

MemoryEntry 为不可变数据类，聚合路径、标题、描述、memory_type、正文（截断至 MAX_ENTRY_CHARS = 8000）、mtime。

_scan_entries() 遍历目录下所有 *.md，跳过 MEMORY.md，用与技能文件共用的 parse_frontmatter（agent/src/agent/frontmatter.py）解析 frontmatter，得到结构化元数据 + body。

3.4 写入：`add`

文件名 slug：name.lower().strip() 后经正则清洗：保留字母数字、_、- 及 CJK 范围（避免纯 CJK 标题被压成相同 slug 导致静默覆盖）。
文件名：{memory_type}_{slug}.md（slug 最长 60）。
Frontmatter 字段：name、description（默认用 title）、type。
写文件后调用 _update_index：在 MEMORY.md 中按标题查找并替换行，或追加新行；整体仍受 MAX_INDEX_LINES 截断。

3.5 删除：`remove`

按精确标题 entry.title == name 匹配，删除对应文件并 _rebuild_index() 全量重建索引。

3.6 检索：`find_relevant`

非向量检索，基于 _tokenize 的集合交集打分：

ASCII：长度 ≥ 3 的 [a-zA-Z0-9]+ 小写 token（下划线不作为 token 的一部分，从而 mcp_wiring_test 可被 “mcp wiring” 这类查询命中）。
CJK：单字纳入 token 集合。
对每条目：meta_tokens = tokenize(title + description)，body_tokens = tokenize(body)。
得分 = |query ∩ meta| * METADATA_WEIGHT(2.0) + |query ∩ body| * 1.0。
排序：得分降序，同分按 modified_at 降序（较新优先）。
默认最多 MAX_RESULTS = 5 条（ContextBuilder 里自动召回用 3）。

def _tokenize(text: str) -> set[str]:
    """Split text into searchable tokens.

    ASCII words >= 3 chars + CJK individual characters. Underscores are
    treated as word boundaries so snake_case titles (e.g. ``mcp_wiring_test``)
    match natural-language queries (``"mcp wiring"``) as well as verbatim
    lookups.
    """
    ascii_tokens = set(re.findall(r"[a-zA-Z0-9]{3,}", text.lower()))
    cjk_tokens = set(re.findall(r"[\u4e00-\u9fff\u3400-\u4dbf]", text))
    return ascii_tokens | cjk_tokens

4. 与 LLM 上下文的集成（ContextBuilder）

4.1 System Prompt 中的两块「记忆」占位

_SYSTEM_PROMPT 模板中有：

{memory_summary}：来自 WorkspaceMemory.to_summary()（当前 run 的目录与工具计数）。
{memory_section}：仅当 persistent_memory.snapshot 非空时，拼接 ## Persistent Memory (cross-session) + 冻结索引全文。

        memory_section = ""
        if self._persistent_memory and self._persistent_memory.snapshot:
            memory_section = _MEMORY_SECTION.format(
                snapshot=self._persistent_memory.snapshot,
            )

        return _SYSTEM_PROMPT.format(
            ...
            memory_summary=self.memory.to_summary(),
            memory_section=memory_section,
            current_datetime=now.strftime("%A, %B %d, %Y %H:%M (local)"),
        )

系统提示中还包含对模型的行为指引：在适当时机使用 remember 工具保存偏好与洞见。

4.2 用户消息侧的「自动召回」

build_messages 在追加本轮用户消息前，若存在 persistent_memory，则对用户原文做 find_relevant(user_message, max_results=3)，若有结果，将摘要包在 <recalled-memories>...</recalled-memories> 中并置于用户消息前缀。

设计意图：

保持 system prompt 稳定（利于缓存）；
按查询动态注入相关记忆，避免把全部正文塞进 system。

        enriched = user_message
        if self._persistent_memory:
            try:
                recalls = self._persistent_memory.find_relevant(user_message, max_results=3)
                if recalls:
                    lines = [f"- **{r.title}** ({r.memory_type}): {r.body[:500]}" for r in recalls]
                    recall_block = "\n".join(lines)
                    enriched = (
                        f"<recalled-memories>\n{recall_block}\n</recalled-memories>\n\n"
                        f"{user_message}"
                    )
            except Exception as exc:
                logger.debug("Auto-recall failed: %s", exc)

异常被吞掉并打 debug 日志，避免召回失败阻断主流程。

5. 工具层：`remember` 与注册表依赖注入

5.1 RememberTool

工具名：remember。
action：save | recall | forget。
save 需要 title + content，可选 memory_type（user / feedback / project / reference）。
recall 需要 query。
forget 需要 title（与持久层按标题删除一致）。
is_readonly = False：参与写盘，在 AgentLoop 的批处理里会走串行路径。

构造器可注入 PersistentMemory；若未注入则自行 PersistentMemory()（默认目录）。实现见 agent/src/tools/remember_tool.py。

5.2 `build_registry` 的单例共享

build_registry(persistent_memory=pm) 时，对 RememberTool 特殊处理：用同一个 pm 实例注册，保证「上下文里的 PersistentMemory」与「工具写盘的 PersistentMemory」是同一对象。

            if cls is RememberTool and persistent_memory is not None:
                registry.register(cls(memory=persistent_memory))

5.3 服务端与 CLI 的装配

agent/src/session/service.py _run_with_agent：pm = PersistentMemory()，AgentLoop(..., persistent_memory=pm) 且 build_registry(persistent_memory=pm, ...)。
agent/cli.py：同样模式，并可设置 agent.memory.run_dir 覆盖。

6. 端到端数据流（概念架构）

7. 设计权衡与行为要点

快照冻结 vs 即时写盘
同一轮会话中，新保存的记忆会出现在磁盘和下一次 find_relevant（因扫描目录）中，但不会进入当前已发出的 system 里的 memory_section。若希望模型在同一会话后半段立刻在 system 里看到新索引，当前实现不会自动满足——这是为 prompt 前缀稳定 / 缓存 做的取舍。
自动召回 vs 显式 recall
每轮用户消息都会尝试关键词召回（最多 3 条、正文最多 500 字符），与工具 recall（最多 5 条、正文最多 2000 字符）形成「轻量自动 + 深度显式」组合。
WorkspaceMemory 与 system 中 memory_summary 的时效
memory_summary 仅在 build_system_prompt 调用时从当前 WorkspaceMemory 读取。单次 run() 内若未触发 _auto_compact，工具计数在内存中持续增长，但 system 首条消息中的文字不会自动刷新；压缩路径会把手写摘要与 to_summary() 一并塞进新的 user 消息，作为补偿。
索引长度上限
MEMORY.md 读写均限制 200 行，在记忆条目极多时需要依赖「自动召回扫文件」而非完整索引进 prompt。
检索模型局限
基于 token 交集，无语义相似度；短英文词（少于 3 个字母）不参与 ASCII token，可能影响部分查询。

8. 测试与质量护栏

agent/tests/test_persistent_memory.py：覆盖 add、索引更新、_tokenize、snake_case 召回、remove、_rebuild_index、snapshot 在新实例间持久等。
agent/tests/test_remember_tool.py：覆盖 save/recall/forget 的 JSON 契约与 PersistentMemory 集成。

9. 小结

问题	结论
Memory 分几层？	工作区（单次 run） + 持久（跨会话文件）
持久存在哪？	默认 `~/.vibe-trading/memory/`，`MEMORY.md` + 多条目 md
如何进模型上下文？	System：`memory_summary` + 可选冻结索引；User：`<recalled-memories>` 自动召回
模型如何写入？	`remember` 工具 → 与 `AgentLoop` 共享的 `PersistentMemory` 实例
检索怎么做？	正则分词 + 元数据加权交集评分，非向量

整体上，这是一套工程上简单、可审计、无外部服务的记忆方案，并通过「冻结索引 + 动态召回」在 缓存友好性 与 相关性 之间做了明确分割。

讨论回复

加载中...

正在加载回复...

需要登录才能发表回复

登录注册

智谱 GLM-5 已上线

我正在智谱大模型开放平台 BigModel.cn 上打造 AI 应用，智谱新一代旗舰模型 GLM-5 已上线，在推理、代码、智能体综合能力达到开源模型 SOTA 水平。

领取 2000万 Tokens 通过邀请链接注册即可获得大礼包，期待和你一起在 BigModel 上畅享卓越模型能力