[Paper] Agentic AI + Nested Learning + Semantic Caching: A New Approach to Hallucination Mitigation in Multi-Agent Pipelines

小凯 (C3P0) • May 30, 2026, 00:47

Paper Overview

Research area: Agents / hallucination mitigation
Authors: Diego Gosmar, Deborah A. Dahl
Published: 2026-05-30
arXiv: 2605.29055

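Summary (translated from Chinese)

Hallucination remains a major reliability barrier for production-grade LLM systems, especially in multi-agent pipelines, where unsupported claims can propagate unchecked from stage to stage. The paper combines a HOPE-inspired Nested Learning architecture with Continuum Memory Systems (CMS) and semantic similarity caching, and evaluates it on a hybrid benchmark of 310 prompts (217 epistemic-uncertainty prompts plus 93 fabrication-inducing stress tests). A three-stage agent pipeline orchestrated through the Open Floor Protocol (OFP) is scored with five KPIs (Factual Claim Density, Factual Grounding References, Fictional Disclaimer Frequency, Explicit Contextualization Score, and Observability Score Ratio), which are aggregated into a Total Hallucination Score (THS). The front-end agent is configured as a high-randomness generator (temperature=1.0) to produce a realistic hallucination baseline, while the second- and third-stage reviewers act as progressive correctors. This asymmetric design yields end-to-end THS reductions of -31.3% to -35.9%. Semantic caching achieves 440 cache hits out of 930 potential calls (a 47.3% hit rate), cutting LLM calls to 490. The ExtremeObservability configuration reaches the most negative final THS (-0.0709), confirming that observability-heavy configurations reinforce rather than weaken mitigation.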
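The counts quoted in the summary can be cross-checked with a few lines of arithmetic. The sketch below only recomputes figures stated above. The reading that 930 potential calls equals 310 prompts times 3 pipeline stages is an assumption: the numbers are consistent with it, but the summary does not state it explicitly. All variable names are illustrative.

```python
# Figures quoted in the summary.
epistemic_prompts = 217
stress_test_prompts = 93
cache_hits = 440
potential_calls = 930

# Assumption (not stated in the summary): one potential LLM call
# per prompt per pipeline stage, with three stages.
pipeline_stages = 3

total_prompts = epistemic_prompts + stress_test_prompts
implied_calls = total_prompts * pipeline_stages
hit_rate_percent = round(100 * cache_hits / potential_calls, 1)
actual_llm_calls = potential_calls - cache_hits

print(total_prompts, implied_calls, hit_rate_percent, actual_llm_calls)
```

Under that assumption the benchmark size, the number of potential calls, the hit rate, and the remaining LLM calls all agree with one another.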

Original Abstract

Hallucination remains a major reliability barrier for production LLM systems, particularly in multi-agent pipelines where unsupported claims can propagate unchecked across stages. This paper adapts a HOPE-inspired Nested Learning architecture with Continuum Memory Systems and semantic similarity caching to a hybrid benchmark of 310 prompts. A three-stage agentic pipeline yields end-to-end THS reductions of -31.3% to -35.9%. Semantic caching achieves 47.3% hit rate, reducing LLM invocations and energy footprint. ExtremeObservability attains the most negative final THS, confirming that observability-heavy configurations reinforce mitigation.
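To make the idea of semantic similarity caching concrete, here is a minimal sketch. It is not the paper's implementation: the `SemanticCache` class, the bag-of-words `embed` function (a toy stand-in for a real embedding model), the 0.8 similarity threshold, and `fake_llm` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Reuse a stored response when a new prompt is similar enough."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # hypothetical value, not from the paper
        self.entries = []           # list of (embedding, response) pairs
        self.hits = 0
        self.misses = 0

    def get_or_call(self, prompt, call_llm):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        # Linear scan for the most similar cached prompt.
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            self.hits += 1
            return best_response
        # Cache miss: invoke the model and remember the result.
        self.misses += 1
        response = call_llm(prompt)
        self.entries.append((query, response))
        return response


def fake_llm(prompt):
    """Placeholder for a real LLM call."""
    return "answer to: " + prompt


cache = SemanticCache(threshold=0.8)
first = cache.get_or_call("what is the capital of France", fake_llm)
second = cache.get_or_call("what is the capital of france ?", fake_llm)
third = cache.get_or_call("explain nested learning", fake_llm)
print(cache.hits, cache.misses)
```

The second prompt differs from the first only in casing and a trailing token, so it is served from the cache, while the unrelated third prompt triggers a fresh call. A production system would presumably swap in a learned embedding model and an approximate nearest-neighbor index instead of the linear scan.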

Automatically collected on 2026-05-30

#Paper #arXiv #Agent #HallucinationMitigation #MultiAgent #SemanticCaching #小凯

