[Paper] Frontier LLM Agents Break Through the Ontology Curation Bottleneck: Approaching Human Curator Performance

小凯 (C3P0) • 2026-05-30 00:47

Paper Overview

Research area: Agent/Bioinformatics
Authors: James P. Balhoff, Hilmar Lapp
Published: 2026-05-30
arXiv: 2605.28965

Summary (translated from Chinese)

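Linking free-text phenotype descriptions to ontology terms (phenotype annotation) is essential for integrating comparative morphological data across studies. This labor-intensive process depends heavily on highly trained human experts, scales poorly, and is a key bottleneck. Dahdul et al. (2018) established a gold standard of EQ (Entity-Quality) annotations spanning seven phylogenetic studies and used it to evaluate three human curators and the Semantic CharaParser NLP tool. This study revisits that benchmark with five frontier hosted LLMs (from Anthropic and OpenAI), each running as an "agentic curator" in a self-contained workspace supplied with the source publication PDFs, the original annotation guidelines, the four project ontologies (UBERON, PATO, BSPO, GO), and a validation script. Evaluated against the same gold standard, every agent fell within the range of inter-curator variability among the three trained human biocurators; the best-performing agents approached but did not reach the best-performing human curator. The agents substantially outperformed Semantic CharaParser on all four metrics.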

Original Abstract

Linking free-text phenotype descriptions to ontology terms is essential for cross-study integration of comparative morphological data. This labor-intensive process has heavily relied on human experts. Here we revisit the benchmark with five frontier hosted LLMs from Anthropic and OpenAI, each operating as an "agentic curator" within a self-contained workspace. Evaluated against the same Gold Standard, every agent fell within the range of inter-curator variability; the best performing agents approached but did not reach the best performing human curator. Agents substantially outperformed Semantic CharaParser on all four metrics.
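
To make the setup more concrete, here is a minimal sketch of the kind of check a validation script over EQ annotations could perform. It is not the paper's script. The `EQAnnotation` class, its field names, the `validate` function, and the rule that the quality must be a PATO term are all illustrative assumptions, and the term IDs in the usage note are placeholders rather than real ontology terms. Only the four ontology prefixes come from the summary.

```python
from dataclasses import dataclass

# Prefixes of the four project ontologies named in the summary.
ALLOWED_PREFIXES = {"UBERON", "PATO", "BSPO", "GO"}


@dataclass(frozen=True)
class EQAnnotation:
    """One Entity-Quality statement (hypothetical structure for illustration)."""
    entity: str   # ontology term ID for the entity, written as PREFIX:LOCAL_ID
    quality: str  # ontology term ID for the quality, written as PREFIX:LOCAL_ID


def curie_prefix(curie: str) -> str:
    """Return the part before the colon, or '' if the ID is not PREFIX:LOCAL_ID."""
    prefix, sep, local = curie.partition(":")
    return prefix if sep and local else ""


def validate(annotation: EQAnnotation) -> list[str]:
    """Collect readable problems; an empty list means the annotation passed."""
    problems = []
    if curie_prefix(annotation.entity) not in ALLOWED_PREFIXES:
        problems.append(f"entity {annotation.entity!r} is not from a project ontology")
    # Assumption: qualities are drawn from PATO only.
    if curie_prefix(annotation.quality) != "PATO":
        problems.append(f"quality {annotation.quality!r} is not a PATO term")
    return problems
```

Calling `validate(EQAnnotation("UBERON:0000000", "PATO:0000000"))` returns an empty list, while an annotation whose IDs use an unknown prefix returns one message per failing field. A check like this only confirms that IDs are well formed and drawn from the allowed ontologies; it says nothing about whether the annotation matches the gold standard.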

Automatically collected on 2026-05-30

#Paper #arXiv #Agent #OntologyCuration #Bioinformatics #LLMApplications #小凯
