# Graphiti In-Depth Guide: Building Temporal Knowledge Graphs for AI Agents

> Style reference: the writing follows the clear narrative approach of Dan Abramov's "Overreacted" blog and Martin Fowler's step-by-step unpacking of complex concepts.

---

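## 1. Project Overview

### 1.1 What Is Graphiti?

Graphiti is an open-source temporal knowledge graph engine from the Zep AI team, designed for the dynamic memory needs of AI agents. Unlike a static knowledge graph, Graphiti tracks how facts evolve over time: it records not only what happened, but also when it happened and when it stopped being true.

Positioning: Graphiti is the open-source core engine of Zep's memory layer, responsible for building and maintaining the temporal context graph.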

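### 1.2 Why a Temporal Knowledge Graph?

Traditional RAG (retrieval-augmented generation) systems struggle in the following scenarios:

| Scenario | Limitation of traditional RAG | Graphiti's approach |
| --- | --- | --- |
| Changing facts | Hard to handle outdated information | Explicitly tracks each fact's validity window |
| Conversation memory | Cannot maintain context across sessions | Incremental construction plus temporal retrieval |
| Structured data | Hard to merge heterogeneous sources | Handles text and JSON uniformly |
| Real-time updates | Requires batch recomputation | Integrates new episodes incrementally, without recomputing the graph |
| Historical queries | Can only query the current state | Point-in-time queries |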

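### 1.3 Performance Benchmarks

Results reported in the paper "Zep: A Temporal Knowledge Graph Architecture for Agent Memory":

- DMR benchmark: 94.8% accuracy (versus 93.4% for MemGPT)
- LongMemEval benchmark: accuracy improved by up to 18.5%, response latency reduced by 90%
- Token cost: 98% lower than traditional approaches
- Query latency: sub-second (versus seconds to tens of seconds for GraphRAG)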

---

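## 2. Core Concepts

### 2.1 Context Graph

The graph Graphiti builds differs fundamentally from a traditional knowledge graph:

```
Traditional knowledge graph        Graphiti temporal graph
            │                                 │
            ▼                                 ▼
     ┌─────────────┐            ┌───────────────────────────┐
     │  Zhang San  │            │         Zhang San         │
     │      │      │            │  ┌─────────────────────┐  │
     │      │ works│            │  │ role summary,       │  │
     │      ▼ at   │            │  │ evolves over time   │  │
     │  ByteDance  │            │  └─────────────────────┘  │
     └─────────────┘            └───────────────────────────┘
                                              │
                     ┌────────────────────────┼────────────────────────┐
                     ▼                        ▼                        ▼
              ┌─────────────┐          ┌─────────────┐          ┌─────────────┐
              │ before 2023 │          │  2023-2024  │          │ after 2024  │
              │ freelance   │          │  ByteDance  │          │ startup     │
              └─────────────┘          └─────────────┘          └─────────────┘
```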

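### 2.2 Core Data Model

#### 2.2.1 Episode

An episode is the basic unit of information in Graphiti and represents a single data-ingestion event. The listing is a simplified view of the fields, not the full class definition:

```python
class EpisodicNode:
    """Raw data source; the input content is preserved in full."""
    uuid: str                 # unique identifier
    name: str                 # episode name
    content: str              # raw content (message, text, or JSON)
    source: EpisodeType       # message, text, or json
    source_description: str   # description of the source
    valid_at: datetime        # when the event happened (world time); set from reference_time
    created_at: datetime      # when the episode was ingested (system time)
```

Bi-temporal model:

- Event time (valid time): passed to `add_episode` as `reference_time`; the time at which the fact holds in the real world
- System time (transaction time): `created_at`; the time at which the data entered the system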
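
To make the two time axes concrete, here is a minimal sketch in plain Python with no Graphiti dependency. `Record`, `true_at`, `known_at`, and `utc` are hypothetical names introduced only for illustration; they are not part of Graphiti's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Record:
    """Hypothetical bi-temporal record; not a Graphiti class."""
    fact: str
    valid_at: datetime    # valid time: when the fact became true in the world
    created_at: datetime  # transaction time: when the system ingested it


def true_at(records: list[Record], t: datetime) -> list[str]:
    """Facts that had already become true in the world by time t."""
    return [r.fact for r in records if r.valid_at <= t]


def known_at(records: list[Record], t: datetime) -> list[str]:
    """Facts the system had already ingested by time t."""
    return [r.fact for r in records if r.created_at <= t]


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# A late-arriving fact: true since January, ingested only in March.
records = [
    Record("Li Ming joined Alibaba", valid_at=utc(2025, 1, 10), created_at=utc(2025, 3, 1)),
]

february = utc(2025, 2, 1)
print(true_at(records, february))   # the fact was already true in February
print(known_at(records, february))  # but the system had not ingested it yet
```

Keeping both timestamps is what lets a system distinguish "what was true then" from "what did we know then".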

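#### 2.2.2 Entity

```python
class EntityNode:
    """Concept node that evolves over time (simplified view)."""
    uuid: str
    name: str              # entity name
    labels: list[str]      # type labels (e.g. Person, Company)
    summary: str           # current summary (updated as new information arrives)
    attributes: dict       # structured attributes (populated for custom entity types)
    created_at: datetime   # when the node was created
```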

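#### 2.2.3 Relationship / Fact (Edge)

```python
class EntityEdge:
    """A fact with a temporal validity window (simplified view)."""
    uuid: str
    source_node_uuid: str         # source entity
    target_node_uuid: str         # target entity
    name: str                     # relationship name
    fact: str                     # the fact in natural language
    valid_at: datetime | None     # when the fact became true
    invalid_at: datetime | None   # when it stopped being true (None = still valid)
    episodes: list[str]           # UUIDs of the source episodes (provenance)
```

Example of temporal fact management:

```
Initial fact: Zhang San works at ByteDance (valid_at = 2023-01-15, invalid_at = None)
         │
         ▼ new information ingested

New fact: Zhang San joins a startup (from 2024-07-01)
         │
         ▼ handled automatically

Old fact is marked:  invalid_at = 2024-07-01
New fact is created: valid_at = 2024-07-01, invalid_at = None
```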
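
The invalidation step can be sketched in a few lines of plain Python. This is an illustration under a strong simplifying assumption (a subject has at most one open fact per relation); Graphiti itself relies on an LLM to judge whether a new fact contradicts an existing one. `Fact` and `add_fact` are hypothetical names, not Graphiti classes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for a temporal edge; not Graphiti's EntityEdge."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means still valid


def add_fact(facts: list[Fact], new: Fact) -> None:
    """Append `new`, closing the window of any open fact it supersedes.

    Assumption for this sketch: a newer fact with the same (subject, relation)
    contradicts the older open one.
    """
    for old in facts:
        same_slot = (old.subject, old.relation) == (new.subject, new.relation)
        if same_slot and old.invalid_at is None and old.valid_at < new.valid_at:
            old.invalid_at = new.valid_at  # keep the old fact, just close its window
    facts.append(new)


facts: list[Fact] = []
add_fact(facts, Fact("Zhang San", "WORKS_AT", "ByteDance",
                     datetime(2023, 1, 15, tzinfo=timezone.utc)))
add_fact(facts, Fact("Zhang San", "WORKS_AT", "Startup",
                     datetime(2024, 7, 1, tzinfo=timezone.utc)))

for f in facts:
    end = f.invalid_at.date() if f.invalid_at else "present"
    print(f.obj, f.valid_at.date(), end)
```

Note that the superseded fact is never deleted; it stays in the list with a closed window, which is what makes historical queries possible.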

### 2.3 Architecture Overview

```
┌──────────────────────────────────────────────────────────────┐
│                         Input layer                          │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│  │ Chat text  │ │ JSON data  │ │ Documents  │ │ Msg stream │ │
│  └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ │
└────────┼──────────────┼──────────────┼──────────────┼────────┘
         │              │              │              │
         └──────────────┴───────┬──────┴──────────────┘
                                ▼
┌──────────────────────────────────────────────────────────────┐
│                     Processing pipeline                      │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│  │ Entity     │→│ Relation   │→│ Dedup and  │→│ Summary    │ │
│  │ extraction │ │ extraction │ │ merge      │ │ generation │ │
│  │ (LLM)      │ │ (LLM)      │ │ (vec+rules)│ │ (LLM)      │ │
│  └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
│                               │                              │
│                               ▼                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                  Embedding generation                  │  │
│  │    entity embeddings + edge embeddings + BM25 index    │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌──────────────────────────────────────────────────────────────┐
│                        Storage layer                         │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                Neo4j / FalkorDB / Kuzu                 │  │
│  │   nodes + edges + vector indexes + full-text indexes   │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌──────────────────────────────────────────────────────────────┐
│                       Retrieval layer                        │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│  │ Semantic   │ │ BM25       │ │ Graph      │ │ Temporal   │ │
│  │ search     │ │ keyword    │ │ traversal  │ │ filtering  │ │
│  │ (vectors)  │ │ (full-text)│ │ (neighbors)│ │ (window)   │ │
│  └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
└──────────────────────────────────────────────────────────────┘
```
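
The diagram labels the deduplication stage as "vector + rules". One plausible reading of that combination is sketched here: a rule check on normalized names, followed by a cosine-similarity check on embeddings. The threshold of 0.9, the toy three-dimensional vectors, and the helper names are assumptions made for illustration; this is not Graphiti's actual deduplication logic.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalize(name: str) -> str:
    """Rule step: case-fold and collapse whitespace."""
    return " ".join(name.casefold().split())


def find_duplicate(candidate, existing, threshold=0.9):
    """Return the name of the existing entity that `candidate` duplicates, or None.

    `candidate` is a (name, embedding) pair; `existing` is a list of such pairs.
    """
    cand_name, cand_vec = candidate
    for name, vec in existing:
        if normalize(name) == normalize(cand_name):  # rule match
            return name
        if cosine(cand_vec, vec) >= threshold:       # vector match
            return name
    return None


existing = [("Alibaba Group", [0.9, 0.1, 0.0]), ("Li Ming", [0.0, 0.2, 0.9])]
print(find_duplicate(("alibaba  group", [0.0, 0.0, 1.0]), existing))  # matched by rule
print(find_duplicate(("Alibaba", [0.88, 0.12, 0.01]), existing))      # matched by vector
print(find_duplicate(("Tencent", [0.1, 0.9, 0.1]), existing))         # no match
```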

---

## 3. Quick Start

### 3.1 Environment Setup

#### 3.1.1 Install Graphiti

```bash
# Base install (Neo4j support)
pip install graphiti-core

# With FalkorDB support (quoted so that shells such as zsh do not expand the brackets)
pip install "graphiti-core[falkordb]"

# With additional LLM providers
pip install "graphiti-core[anthropic,groq,google-genai]"

# Using uv (recommended)
uv add graphiti-core
```

#### 3.1.2 Start Neo4j

```bash
# Docker (recommended for development)
docker run -d \
  --name neo4j \
  -p 7474:7474 \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  -e NEO4J_PLUGINS='["apoc", "gds"]' \
  neo4j:5.26

# Neo4j Browser: http://localhost:7474
```

#### 3.1.3 Configure Environment Variables

```bash
# .env file
OPENAI_API_KEY=your_openai_api_key

NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password
```

### 3.2 A First Example

```python
import asyncio
import os
from datetime import datetime, timezone

from dotenv import load_dotenv
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType
from graphiti_core.utils.maintenance.graph_data_operations import clear_data

load_dotenv()


async def main():
    # Initialize Graphiti
    graphiti = Graphiti(
        os.environ["NEO4J_URI"],
        os.environ["NEO4J_USER"],
        os.environ["NEO4J_PASSWORD"],
    )

    # Optional: wipe all data (for development and debugging only)
    # await clear_data(graphiti.driver)

    # Create indexes and constraints (needed on the first run)
    await graphiti.build_indices_and_constraints()

    # Add the first episode
    result = await graphiti.add_episode(
        name="First meeting",
        episode_body="My name is Li Ming. I am a software engineer and I currently work at Alibaba.",
        source=EpisodeType.text,
        source_description="self-introduction",
        reference_time=datetime.now(timezone.utc),
    )

    print(f"✅ Episode added: {len(result.nodes)} entities, {len(result.edges)} relationships")

    # Search
    results = await graphiti.search("Where does Li Ming work?", num_results=5)

    print("\n🔍 Search results:")
    for edge in results:
        print(f"  - {edge.fact}")
        print(f"    Valid: {edge.valid_at} - {edge.invalid_at or 'present'}")

    # Close the connection
    await graphiti.close()


if __name__ == "__main__":
    asyncio.run(main())
```

Example output (the exact entities, facts, and timestamps depend on the LLM):

```
✅ Episode added: 2 entities, 1 relationships

🔍 Search results:
  - Li Ming works at Alibaba
    Valid: 2025-04-06 10:30:00+00:00 - present
```

---

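## 4. Full Tutorial

### 4.1 Initialization and Connection

#### 4.1.1 Configure the LLM Client

Graphiti uses OpenAI by default but supports several LLM providers:

```python
import os

from graphiti_core import Graphiti
from graphiti_core.llm_client import OpenAIClient
from graphiti_core.llm_client.anthropic_client import AnthropicClient
from graphiti_core.llm_client.config import LLMConfig

neo4j_uri = os.environ["NEO4J_URI"]
neo4j_user = os.environ["NEO4J_USER"]
neo4j_password = os.environ["NEO4J_PASSWORD"]

# OpenAI (default)
graphiti = Graphiti(neo4j_uri, neo4j_user, neo4j_password)

# Custom OpenAI configuration
llm_config = LLMConfig(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4o-mini",  # lower-cost option
    temperature=0.0,
)
graphiti = Graphiti(
    neo4j_uri, neo4j_user, neo4j_password,
    llm_client=OpenAIClient(config=llm_config),
)

# Anthropic Claude (requires the `anthropic` extra). The default embedder still
# calls OpenAI, so OPENAI_API_KEY is needed unless a custom embedder is passed too.
anthropic_client = AnthropicClient(
    config=LLMConfig(api_key=os.environ["ANTHROPIC_API_KEY"])
)
graphiti = Graphiti(
    neo4j_uri, neo4j_user, neo4j_password,
    llm_client=anthropic_client,
)
```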

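#### 4.1.2 Configure the Embedding Model

```python
from graphiti_core.embedder import OpenAIEmbedder, OpenAIEmbedderConfig

# Custom embedding model; settings are passed through a config object
custom_embedder = OpenAIEmbedder(
    config=OpenAIEmbedderConfig(
        api_key=os.environ["OPENAI_API_KEY"],
        embedding_model="text-embedding-3-small",  # cheaper option
        embedding_dim=1536,
    )
)

graphiti = Graphiti(
    neo4j_uri, neo4j_user, neo4j_password,
    embedder=custom_embedder,
)
```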

### 4.2 Adding Episodes

#### 4.2.1 Text Episodes

```python
# Basic text episode
await graphiti.add_episode(
    name="Product requirements discussion",
    episode_body="""
    Product manager Zhang San proposed a new feature:
    - Add a user-profile analytics module
    - Planned launch: Q2 2025
    - Owner: Li Si
    """,
    source=EpisodeType.text,
    source_description="meeting minutes",
    reference_time=datetime(2025, 1, 15, 10, 0, tzinfo=timezone.utc),
)
```

#### 4.2.2 JSON Episodes

```python
import json

# Structured data
await graphiti.add_episode(
    name="User record",
    episode_body=json.dumps({
        "user_id": "U12345",
        "name": "Wang Wu",
        "role": "Senior Architect",
        "skills": ["Python", "Kubernetes", "Graph Database"],
        "department": "Infrastructure",
        "join_date": "2023-03-15",
    }),
    source=EpisodeType.json,
    source_description="HR system data",
    reference_time=datetime.now(timezone.utc),
)
```

#### 4.2.3 Adding Several Episodes

```python
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

episodes = [
    {
        "name": "Session 1",
        "episode_body": "The user asks how to deploy the service",
        "source": EpisodeType.text,
        "source_description": "support conversation",
        "reference_time": now,
    },
    {
        "name": "Session 2",
        "episode_body": "The user reports that the deployment succeeded",
        "source": EpisodeType.text,
        "source_description": "support conversation",
        "reference_time": now + timedelta(minutes=30),
    },
]

# Add the episodes one at a time, in chronological order
for ep in episodes:
    await graphiti.add_episode(**ep)
```

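### 4.3 Retrieval Methods

#### 4.3.1 Basic Hybrid Search

```python
# Default search: hybrid of semantic similarity and BM25 over facts (edges)
results = await graphiti.search(
    query="Who owns the user-profile module?",
    num_results=10,
)

# Results are EntityEdge objects, already ordered by relevance
for edge in results:
    print(f"Fact: {edge.fact}")
    print(f"Valid: {edge.valid_at} - {edge.invalid_at or 'present'}")
    print(f"Sources: {edge.episodes}")
```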

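#### 4.3.2 Center-Node Reranking

```python
# A first search finds a relevant node
initial_results = await graphiti.search("Zhang San")
if not initial_results:
    raise LookupError("no facts found for 'Zhang San'")
center_node = initial_results[0].source_node_uuid

# Rerank around that node, taking graph distance into account
results = await graphiti.search(
    query="Who is on Zhang San's team?",
    center_node_uuid=center_node,
    num_results=10,
)
```

How the reranking works:

```
┌────────────────────────────────────────────────┐
│         Center-node reranking example          │
├────────────────────────────────────────────────┤
│                                                │
│  ┌───────────┐   ┌───────────┐   ┌───────────┐ │
│  │ Zhang San │───│   Li Si   │───│  Wang Wu  │ │
│  └─────┬─────┘   └───────────┘   └───────────┘ │
│        │                                       │
│        │ 2 hops                                │
│        ▼                                       │
│  ┌───────────┐   ┌───────────┐                 │
│  │ Zhao Liu  │───│  Sun Qi   │                 │
│  └───────────┘   └───────────┘                 │
│                                                │
│ For the query "colleagues", even when semantic │
│ similarity is equal, nodes closer to Zhang San │
│ (such as Li Si) are ranked first.              │
│                                                │
└────────────────────────────────────────────────┘
```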
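
The idea of ranking by graph distance can be sketched with a breadth-first search. The toy graph is only loosely modelled on the diagram, and the tie-breaking rule (distance first, then similarity) is an assumption; Graphiti's own node-distance reranker may combine the signals differently. `hop_distances` and `rerank_by_distance` are hypothetical helpers.

```python
from collections import deque


def hop_distances(adjacency: dict[str, list[str]], center: str) -> dict[str, int]:
    """Breadth-first search: number of hops from `center` to every reachable node."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rerank_by_distance(candidates, adjacency, center):
    """Sort (node, similarity) pairs by hop distance, then by descending similarity.

    Nodes that cannot be reached from `center` are placed last.
    """
    dist = hop_distances(adjacency, center)
    return sorted(candidates, key=lambda c: (dist.get(c[0], float("inf")), -c[1]))


# Undirected toy graph, stored as adjacency lists
adjacency = {
    "Zhang San": ["Li Si", "Zhao Liu"],
    "Li Si": ["Zhang San", "Wang Wu"],
    "Wang Wu": ["Li Si"],
    "Zhao Liu": ["Zhang San", "Sun Qi"],
    "Sun Qi": ["Zhao Liu"],
}

# All three candidates have the same semantic similarity
candidates = [("Wang Wu", 0.8), ("Li Si", 0.8), ("Sun Qi", 0.8)]
print(rerank_by_distance(candidates, adjacency, "Zhang San"))
```

With equal similarity scores, Li Si (one hop away) moves ahead of Wang Wu and Sun Qi (two hops away).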

#### 4.3.3 Node Retrieval

```python
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Use a predefined recipe to search nodes instead of edges
node_config = NODE_HYBRID_SEARCH_RRF.model_copy(deep=True)
node_config.limit = 5

node_results = await graphiti._search(
    query="Alibaba",
    config=node_config,
)

# _search returns a results object; the matching nodes are in its `nodes` list
for node in node_results.nodes:
    print(f"Entity: {node.name}")
    print(f"Summary: {node.summary}")
    print(f"Labels: {node.labels}")
```

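### 4.4 Predefined Search Recipes

Graphiti ships with several preconfigured search strategies:

```python
from graphiti_core.search.search_config_recipes import (
    COMBINED_HYBRID_SEARCH_RRF,            # hybrid search + RRF reranking
    COMBINED_HYBRID_SEARCH_MMR,            # hybrid search + MMR diversity reranking
    COMBINED_HYBRID_SEARCH_CROSS_ENCODER,  # cross-encoder reranking
    EDGE_HYBRID_SEARCH_RRF,                # edges only
    EDGE_HYBRID_SEARCH_NODE_DISTANCE,      # graph-distance reranking
    NODE_HYBRID_SEARCH_RRF,                # nodes only
)

# Use MMR to diversify results and reduce redundancy
mmr_config = COMBINED_HYBRID_SEARCH_MMR.model_copy(deep=True)
mmr_config.limit = 10
# mmr_lambda is set per result type; 0.5 balances relevance and diversity
mmr_config.edge_config.mmr_lambda = 0.5
mmr_config.node_config.mmr_lambda = 0.5

results = await graphiti._search(
    query="Advantages of Graphiti",
    config=mmr_config,
)
for edge in results.edges:
    print(edge.fact)
```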
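
Several recipe names end in RRF, which stands for Reciprocal Rank Fusion: a way of merging ranked lists from different retrievers without comparing their raw scores. A minimal sketch of the standard formula follows; the constant `k = 60` comes from the original RRF literature and is an assumption here, since Graphiti's internal constant may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank of d in that list)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


semantic = ["fact_a", "fact_b", "fact_c"]  # ranked by vector similarity
keyword = ["fact_b", "fact_d", "fact_a"]   # ranked by BM25

print(reciprocal_rank_fusion([semantic, keyword]))
# ['fact_b', 'fact_a', 'fact_d', 'fact_c']
```

`fact_b` wins because it ranks highly in both lists, even though it is not first in the semantic ranking.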

### 4.5 Custom Entity Types

#### 4.5.1 Define Entity Models

```python
from pydantic import BaseModel, Field

# Every field is optional and defaults to None, so extraction may leave it empty.
# The entity's own name is stored on the node, so the models do not declare a
# `name` field (it would clash with the node's built-in attribute).

class Person(BaseModel):
    """A person."""
    first_name: str | None = Field(None, description="Given name")
    last_name: str | None = Field(None, description="Family name")
    occupation: str | None = Field(None, description="Occupation")
    email: str | None = Field(None, description="Email address")

class Company(BaseModel):
    """A company."""
    industry: str | None = Field(None, description="Industry")
    founded_year: int | None = Field(None, description="Year founded")
    headquarters: str | None = Field(None, description="Headquarters location")

class Product(BaseModel):
    """A product."""
    category: str | None = Field(None, description="Category")
    price: float | None = Field(None, description="Price")
```

#### 4.5.2 Define Relationship Models

```python
class WorksFor(BaseModel):
    """Employment relationship."""
    position: str | None = Field(None, description="Job title")
    department: str | None = Field(None, description="Department")
    start_date: str | None = Field(None, description="Start date")

class Founded(BaseModel):
    """Founder relationship."""

class Produces(BaseModel):
    """Producer relationship."""
```

#### 4.5.3 Use the Custom Types

```python
# Type mappings
entity_types = {
    "Person": Person,
    "Company": Company,
    "Product": Product,
}

edge_types = {
    "WORKS_FOR": WorksFor,
    "FOUNDED": Founded,
    "PRODUCES": Produces,
}

# Allowed relationship types per (source, target) pair;
# every name listed here must be a key of edge_types
edge_type_map = {
    ("Person", "Company"): ["WORKS_FOR", "FOUNDED"],
    ("Company", "Product"): ["PRODUCES"],
}

# Add an episode using the custom types
result = await graphiti.add_episode(
    name="Company profile",
    episode_body="""
    Jack Ma founded Alibaba in 1999.
    Alibaba is an e-commerce company headquartered in Hangzhou.
    Alibaba operates products such as Taobao and Tmall.
    """,
    source=EpisodeType.text,
    source_description="company history",
    reference_time=datetime.now(timezone.utc),
    entity_types=entity_types,
    edge_types=edge_types,
    edge_type_map=edge_type_map,
)

# Inspect what was extracted
for node in result.nodes:
    if "Person" in node.labels:
        print(f"Person: {node.attributes.get('first_name')} {node.attributes.get('last_name')}")
        print(f"Occupation: {node.attributes.get('occupation')}")
```
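
Because `edge_type_map` refers to relationship types by plain string, a typo or a name that is missing from `edge_types` is easy to overlook. A small consistency check can catch this before any episode is ingested. `undefined_edge_types` is a hypothetical helper written for this guide, not a Graphiti function, and the empty classes stand in for the Pydantic models.

```python
def undefined_edge_types(edge_types: dict, edge_type_map: dict) -> set[str]:
    """Return relationship names used in edge_type_map but not defined in edge_types."""
    used = {name for names in edge_type_map.values() for name in names}
    return used - set(edge_types)


# Placeholder classes standing in for the Pydantic relationship models
class WorksFor: ...
class Founded: ...


edge_types = {"WORKS_FOR": WorksFor, "FOUNDED": Founded}
edge_type_map = {
    ("Person", "Company"): ["WORKS_FOR", "FOUNDED"],
    ("Person", "Product"): ["USES"],  # deliberately undefined, to show the check firing
}

missing = undefined_edge_types(edge_types, edge_type_map)
print(missing)  # {'USES'}
```

Running the check at start-up and raising on a non-empty result keeps the type definitions and the map from drifting apart.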

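### 4.6 Advanced Features

#### 4.6.1 Community Detection

```python
from graphiti_core.search.search_config_recipes import COMBINED_HYBRID_SEARCH_RRF

# Build communities (clusters related entities and summarizes each cluster)
await graphiti.build_communities()

# Community summaries are returned by the combined search recipes
results = await graphiti._search(
    query="infrastructure team",
    config=COMBINED_HYBRID_SEARCH_RRF,
)
for community in results.communities:
    print(f"Community: {community.name}")
    print(f"Summary: {community.summary}")
```

#### 4.6.2 Graph Maintenance

```python
from graphiti_core.utils.maintenance.graph_data_operations import clear_data

# Remove one episode, together with the facts and entities that only it produced
await graphiti.remove_episode("ep_xxx")

# Wipe all data (dangerous)
# await clear_data(graphiti.driver)

# Rebuild indexes
await graphiti.build_indices_and_constraints()
```

#### 4.6.3 Historical Queries

```python
# Find the facts that were valid at a given point in time
from datetime import datetime, timezone

as_of = datetime(2024, 6, 15, tzinfo=timezone.utc)

results = await graphiti.search(query="Zhang San's position", num_results=20)

# Keep only the facts whose validity window contains `as_of`.
# (Date conditions can also be pushed into the query with SearchFilters
# from graphiti_core.search.search_filters.)
valid_then = [
    edge for edge in results
    if (edge.valid_at is None or edge.valid_at <= as_of)
    and (edge.invalid_at is None or edge.invalid_at > as_of)
]
```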

---

## 5. Worked Example: A Conversation Memory System

### 5.1 System Architecture

```
┌────────────────────────────────────────────────────────────────────────┐
│                Conversation memory system architecture                 │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  ┌────────────────┐    ┌────────────────┐    ┌──────────────────────┐  │
│  │ User input     │───→│ Intent parsing │───→│   Graphiti storage   │  │
│  └────────────────┘    └────────────────┘    │                      │  │
│                                              │  ┌────────────────┐  │  │
│  ┌────────────────┐    ┌────────────────┐    │  │ Entity nodes   │  │  │
│  │ Past sessions  │←───│ Context lookup │←───│  │ Relation edges │  │  │
│  └────────────────┘    └────────────────┘    │  │ Provenance     │  │  │
│                                              │  └────────────────┘  │  │
│  ┌────────────────┐    ┌────────────────┐    │                      │  │
│  │ LLM reply      │←───│ Reply builder  │←───│ Temporal reasoning   │  │
│  └────────────────┘    └────────────────┘    └──────────────────────┘  │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
```

### 5.2 Full Implementation

```python
import asyncio
import json
from datetime import datetime, timedelta, timezone
from typing import List, Optional

from graphiti_core import Graphiti
from graphiti_core.edges import EntityEdge
from graphiti_core.nodes import EpisodeType


class ConversationMemory:
    """Conversation memory built on Graphiti."""

    def __init__(self, neo4j_uri: str, neo4j_user: str, neo4j_password: str):
        self.graphiti = Graphiti(neo4j_uri, neo4j_user, neo4j_password)
        self.session_id: Optional[str] = None

    async def initialize(self):
        """Create indexes and constraints."""
        await self.graphiti.build_indices_and_constraints()

    async def start_session(self, user_id: str) -> str:
        """Start a new session and record the event."""
        now = datetime.now(timezone.utc)
        self.session_id = f"session_{user_id}_{now.timestamp()}"

        await self.graphiti.add_episode(
            name=self.session_id,
            episode_body=json.dumps({
                "event": "session_start",
                "user_id": user_id,
                "timestamp": now.isoformat(),
            }),
            source=EpisodeType.json,
            source_description="session event",
            reference_time=now,
        )
        return self.session_id

    async def add_message(self, role: str, content: str,
                          metadata: Optional[dict] = None) -> None:
        """Store one conversation message as an episode."""
        if not self.session_id:
            raise ValueError("call start_session() first")

        now = datetime.now(timezone.utc)
        message_data = {
            "session_id": self.session_id,
            "role": role,
            "content": content,
            "metadata": metadata or {},
            "timestamp": now.isoformat(),
        }

        await self.graphiti.add_episode(
            name=f"{self.session_id}_{role}",
            episode_body=json.dumps(message_data),
            source=EpisodeType.json,
            source_description="conversation message",
            reference_time=now,
        )

    async def get_relevant_context(self, query: str,
                                   max_results: int = 5) -> List[EntityEdge]:
        """Return the facts most relevant to the query."""
        return await self.graphiti.search(query=query, num_results=max_results)

    async def get_conversation_history(self,
                                       lookback_minutes: int = 30) -> List[dict]:
        """Return this session's messages from the last `lookback_minutes` minutes."""
        now = datetime.now(timezone.utc)
        cutoff_time = now - timedelta(minutes=lookback_minutes)

        # Fetch the most recent episodes up to now, then filter locally
        episodes = await self.graphiti.retrieve_episodes(reference_time=now, last_n=20)

        history = []
        for episode in episodes:
            if episode.valid_at < cutoff_time:
                continue
            # Message episodes are named "<session_id>_<role>"; this also
            # skips other sessions and the session_start event
            if not episode.name.startswith(f"{self.session_id}_"):
                continue
            history.append(json.loads(episode.content))
        return history

    async def summarize_user_preferences(self, user_id: str) -> dict:
        """Summarize the user's preferences with a simple keyword heuristic."""
        results = await self.graphiti.search(
            query=f"user {user_id} preferences likes dislikes",
            num_results=20,
        )

        preferences = {"likes": [], "dislikes": [], "topics": set()}

        for edge in results:
            fact = edge.fact.lower()
            # Test the negative phrases first: "dislike" contains "like"
            if "dislike" in fact or "hate" in fact:
                preferences["dislikes"].append(edge.fact)
            elif "like" in fact or "prefer" in fact:
                preferences["likes"].append(edge.fact)
            # Topic extraction is left open; an NLP step could fill "topics"

        return preferences

    async def close(self):
        """Close the connection."""
        await self.graphiti.close()


async def demo_conversation_memory():
    """Demonstrate the conversation memory system."""
    memory = ConversationMemory(
        neo4j_uri="bolt://localhost:7687",
        neo4j_user="neo4j",
        neo4j_password="password",
    )
    await memory.initialize()

    session_id = await memory.start_session(user_id="user_001")
    print(f"Session started: {session_id}\n")

    # Simulated conversation
    conversations = [
        ("user", "Hi, I'd like to learn about the Graphiti project."),
        ("assistant", "Graphiti is a temporal knowledge graph engine for building memory systems for AI agents."),
        ("user", "Sounds interesting. How is it different from a traditional knowledge graph?"),
        ("assistant", "The main difference is that Graphiti tracks how facts change over time and supports incremental updates and historical queries."),
        ("user", "Got it. By the way, my name is Zhang San and I'm a backend engineer."),
        ("assistant", "Nice to meet you, Zhang San! As a backend engineer, which technical details of Graphiti interest you most?"),
        ("user", "I'm curious about its Neo4j integration and retrieval performance."),
    ]

    for role, content in conversations:
        print(f"[{role}]: {content}")
        await memory.add_message(role, content)

    print("\n" + "=" * 50 + "\n")

    # Query for relevant context
    print("🔍 Query 'Graphiti features':")
    context = await memory.get_relevant_context("Graphiti features")
    for edge in context[:3]:
        print(f"  - {edge.fact}")

    print("\n🔍 Query 'the user's occupation':")
    context = await memory.get_relevant_context("the user's occupation")
    for edge in context[:3]:
        print(f"  - {edge.fact}")

    await memory.close()


if __name__ == "__main__":
    asyncio.run(demo_conversation_memory())
```

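### 5.3 Sample Run

```
Session started: session_user_001_1712385600.123456

[user]: Hi, I'd like to learn about the Graphiti project.
[assistant]: Graphiti is a temporal knowledge graph engine for building memory systems for AI agents.
[user]: Sounds interesting. How is it different from a traditional knowledge graph?
[assistant]: The main difference is that Graphiti tracks how facts change over time and supports incremental updates and historical queries.
[user]: Got it. By the way, my name is Zhang San and I'm a backend engineer.
[assistant]: Nice to meet you, Zhang San! As a backend engineer, which technical details of Graphiti interest you most?
[user]: I'm curious about its Neo4j integration and retrieval performance.

==================================================

🔍 Query 'Graphiti features':
  - Graphiti is a temporal knowledge graph engine
  - Graphiti tracks how facts change over time
  - Graphiti supports incremental updates and historical queries

🔍 Query 'the user's occupation':
  - Zhang San is a backend engineer
```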

---

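## 6. Performance Tuning and Best Practices

### 6.1 Index Optimization

```python
# Make sure the indexes exist on first deployment
await graphiti.build_indices_and_constraints()

# Index types in Neo4j:
# - vector index: semantic search
# - full-text index (BM25): keyword search
# - property index: temporal filtering
```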

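### 6.2 Batch Processing

```python
import asyncio


async def batch_add_episodes(graphiti, episodes, batch_size=10, max_concurrency=5):
    """Add episodes in batches with bounded concurrency; returns every result in order."""
    semaphore = asyncio.Semaphore(max_concurrency)  # cap on simultaneous calls
    all_results = []

    async def add_with_limit(ep):
        async with semaphore:
            return await graphiti.add_episode(**ep)

    # Note: concurrent ingestion can process episodes out of chronological order
    for i in range(0, len(episodes), batch_size):
        batch = episodes[i:i + batch_size]
        batch_results = await asyncio.gather(*(add_with_limit(ep) for ep in batch))
        all_results.extend(batch_results)  # keep results from every batch, not only the last
        print(f"Processed {i + len(batch)}/{len(episodes)}")

    return all_results
```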
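
The semaphore pattern can be tried without a database. In this sketch `fake_add_episode` is a stand-in for `graphiti.add_episode()` that merely records how many calls overlap; `gather_bounded` is a hypothetical helper name.

```python
import asyncio


async def gather_bounded(coros, limit: int = 5):
    """Await all coroutines, at most `limit` at a time; results keep the input order."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))


async def demo() -> tuple[list[int], int]:
    active = 0
    peak = 0

    async def fake_add_episode(i: int) -> int:
        # Stand-in for graphiti.add_episode(); tracks how many calls run at once
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return i

    results = await gather_bounded([fake_add_episode(i) for i in range(20)], limit=5)
    return results, peak


results, peak = asyncio.run(demo())
print(f"in order: {results == list(range(20))}, peak concurrency: {peak}")
```

`asyncio.gather` preserves input order regardless of completion order, so callers can match results back to episodes by position.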

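### 6.3 Choosing an Embedding Model

| Scenario | Recommended model | Dimensions | Cost | Quality |
| --- | --- | --- | --- | --- |
| Development and testing | text-embedding-3-small | 1536 | Low | Medium |
| Production | text-embedding-3-large | 3072 | Medium | High |
| Chinese-language content | text-embedding-ada-002 | 1536 | Medium | Medium |
| Local deployment | sentence-transformers/all-MiniLM-L6-v2 | 384 | Free | Medium |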

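### 6.4 Controlling LLM Cost

```python
from graphiti_core.llm_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig

# Graphiti takes a single LLM client, so the two model tiers go in one config:
# `model` for the main prompts, `small_model` for the lighter-weight ones
llm_config = LLMConfig(
    model="gpt-4o",             # stronger model
    small_model="gpt-4o-mini",  # cheaper model
    temperature=0.0,
)

llm_client = OpenAIClient(config=llm_config)
```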

### 6.5 Monitoring and Debugging

```python
# Enable verbose logging
import logging
logging.basicConfig(level=logging.DEBUG)


async def get_graph_stats(graphiti):
    """Return node, edge, and episode counts for the graph."""
    driver = graphiti.driver

    async with driver.session() as session:
        # Node count
        result = await session.run("MATCH (n) RETURN count(n) as count")
        node_count = (await result.single())["count"]

        # Edge count
        result = await session.run("MATCH ()-[r]->() RETURN count(r) as count")
        edge_count = (await result.single())["count"]

        # Episode count
        result = await session.run(
            "MATCH (e:Episodic) RETURN count(e) as count"
        )
        episode_count = (await result.single())["count"]

        return {
            "nodes": node_count,
            "edges": edge_count,
            "episodes": episode_count,
        }
```

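### 6.6 Production Checklist

- [ ] Use Neo4j Enterprise Edition (if high availability is required)
- [ ] Configure a backup strategy
- [ ] Set up monitoring and alerts (node and edge counts, query latency)
- [ ] Configure connection pooling
- [ ] Implement circuit breaking
- [ ] Set up API key rotation
- [ ] Configure log aggregation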

---

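## 7. Graphiti vs Zep Cloud

### 7.1 Feature Comparison

| Feature | Graphiti (open source) | Zep Cloud (managed) |
| --- | --- | --- |
| Core engine | ✅ Full functionality | ✅ Full functionality |
| Managed service | ❌ Self-hosted | ✅ Fully managed |
| Auto-scaling | ❌ Manual configuration | ✅ Automatic |
| Multi-tenancy | Basic | Enterprise-grade isolation |
| SDKs | Python | Python, TypeScript, Go |
| API service | MCP Server | RESTful API |
| Cost model | Infrastructure cost | Pay per usage |
| Customization | Full control | Limited |
| Support | Community | Commercial |

### 7.2 Which One to Choose

Choose Graphiti (open source) when:

1. Data privacy requirements are strict: you need full control over where data is stored
2. Infrastructure already exists: Neo4j or FalkorDB is already deployed
3. Deep customization is needed: you want to modify core logic or define custom entity types
4. Cost is a concern: you want to control LLM calls and storage costs
5. The engineering team has capacity: you can maintain a graph database and operate the system

Choose Zep Cloud when:

1. You want a fast start: immediate use with no infrastructure setup
2. You need multi-language SDKs: TypeScript or Go support
3. You need auto-scaling: traffic fluctuates and elasticity matters
4. Team resources are limited: there is no dedicated operations staff
5. You need an enterprise SLA: commercial support and SLA guarantees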

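### 7.3 Migration Path

```python
# Export data from Graphiti
async def export_graphiti_data(graphiti):
    """Export Graphiti nodes, edges, and episodes as plain dictionaries."""
    driver = graphiti.driver
    data = {"nodes": [], "edges": [], "episodes": []}

    async with driver.session() as session:
        # Export nodes
        result = await session.run("""
            MATCH (n:Entity)
            RETURN n.uuid as uuid, n.name as name,
                   n.summary as summary, labels(n) as labels
        """)
        async for record in result:
            data["nodes"].append(dict(record))

        # Export edges (temporal values need converting before JSON serialization)
        result = await session.run("""
            MATCH (s:Entity)-[r:RELATES_TO]->(t:Entity)
            RETURN r.uuid as uuid, s.uuid as source, t.uuid as target,
                   r.fact as fact, r.valid_at as valid_at,
                   r.invalid_at as invalid_at
        """)
        async for record in result:
            data["edges"].append(dict(record))

        # Export episodes (the raw source content)
        result = await session.run("""
            MATCH (e:Episodic)
            RETURN e.uuid as uuid, e.name as name, e.content as content
        """)
        async for record in result:
            data["episodes"].append(dict(record))

    return data


# Import into Zep Cloud.
# Assumption: `zep_client` is an async Zep Cloud client; the `memory.add` call and
# its parameters are illustrative and must be checked against the SDK version in use.
async def import_to_zep_cloud(data, zep_client):
    """Replay exported entities into Zep Cloud as messages."""
    for node in data["nodes"]:
        await zep_client.memory.add(
            session_id="migration",
            messages=[{
                "role": "system",
                "content": f"Entity: {node['name']}\nSummary: {node['summary']}"
            }]
        )
```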
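
Exported edges carry timestamps, which `json.dumps` cannot serialize directly. The sketch below converts Python `datetime` values to ISO 8601 strings before writing JSON. It assumes the values have already been turned into standard-library `datetime` objects; driver-specific temporal types would need converting first. `to_jsonable` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def to_jsonable(value):
    """Recursively replace datetimes with ISO 8601 strings so json.dumps accepts the data."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


edge = {
    "fact": "Zhang San works at ByteDance",
    "valid_at": datetime(2023, 1, 15, tzinfo=timezone.utc),
    "invalid_at": None,  # still valid
}

payload = json.dumps(to_jsonable({"edges": [edge]}))
print(payload)
```

ISO 8601 strings keep the time zone offset, so the validity windows survive a round trip through JSON.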

---

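## Conclusion

Graphiti represents a different paradigm for AI memory systems: a shift from static document retrieval to a dynamic temporal graph. Its core ideas are:

1. Bi-temporal model: tracks both when an event happened and when the data was ingested
2. Incremental construction: new data is integrated as it arrives, with no batch recomputation
3. Hybrid retrieval: semantic search, keyword search, and graph traversal working together
4. Provenance: every fact can be traced back to its source data

For developers building AI agents, Graphiti provides a solid foundation that gives an agent continuous memory and the ability to reason about time.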

---

## References

- GitHub: https://github.com/getzep/graphiti
- Paper: arXiv:2501.13956
- Official documentation: https://docs.getzep.com/graphiti/
- MCP Server: https://github.com/getzep/graphiti/tree/main/mcp_server
- Discord community: https://discord.gg/getzep

---

*Article version: 2025.04.06 | Author: AI assistant | Style reference: Dan Abramov + Martin Fowler*

