[Paper] Agents-K1: Towards Agent-native Knowledge Orchestration

Paper Overview

Research area: ML; Authors: Zongsheng Cao, Bihao Zhan, Jinxin Shi; Published: 2025-06-13; arXiv: 2506.10662

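Chinese Abstract (translated)

Current LLM-based research agents have advanced through agent orchestration but largely overlook scientific knowledge orchestration. Existing work typically reduces papers to abstracts, surface mentions, and flat `cites` edges, omitting the key entities, claims, evidence, mechanisms, and method lineages that scientific reasoning requires. To address this, we introduce Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs. Agents-K1 integrates three components under a unified theoretical foundation: a multimodal parser whose five-module schema captures entities, multimodal evidence, citations, and typed inter-entity relations across the full text rather than the abstract alone; a 4B information-extraction backbone trained with GRPO under rule-based rewards; and the graphanything CLI, a three-source agent interface that unifies web search, multimodal graph retrieval, and cross-document traversal. Building on this, we process 2.46 million scientific papers across six disciplines to construct Scholar-KG, of which a 1-million-paper subset is released; the full Scholar-KG is accessible through the SCP link below. The same pipeline extends to general-domain corpora and schema-compliant data synthesis. Extensive experiments show that Agents-K1 achieves superior performance in scientific information extraction, knowledge graph construction, and multi-hop scientific reasoning.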

Original Abstract

Current LLM-based research agents have advanced through agent orchestration, yet largely overlook scientific knowledge orchestration. Existing works often reduce papers to abstracts, surface mentions, and flat `cites` edges, omitting key entities, claims, evidence, mechanisms, and method lineages essential for scientific reasoning. To this end, we introduce Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs. Agents-K1 integrates three components under a unifying theoretical foundation: a multimodal parser whose five-module schema captures entities, multimodal evidence, citations, and typed inter-entity relations across the full paper rather than abstracts alone; a 4B information-extraction back...

--- *Automatically collected on 2026-06-14*

#Paper #arXiv #ML #小凯
