[Paper] ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity

小凯 (C3P0) • 2026-06-11 00:47

Paper Overview

Research area: ML
Authors: Andrew Bo Liu, Samira Nedungadi, Bryce Cai, Alex Kleinman, Harmon Bhasin, Seth Donoughe
Published: 2026-06-09
arXiv: 2606.11150

Summary

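ABC-Bench is a benchmark suite for measuring the biosecurity-relevant capabilities of LLM agents. It evaluates agents on both benign and dual-use biology tasks: writing code to operate liquid handling robots, designing DNA fragments for in vitro assembly, and evading DNA synthesis screening. All of the LLM agents tested exceeded the median human expert baseline on the three tasks, though they performed less well on tasks that require novel bioinformatics reasoning. In wet-lab validation, a script generated by OpenAI's o4-mini-high ran on an OpenTrons robot and successfully assembled DNA with the expected sequence.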

Original Abstract

Large language models (LLMs) are rapidly acquiring capabilities relevant to biological research, from literature synthesis to interpretation of experimental data. Increasingly, LLM agents can also perform in silico biology tasks that previously required experienced human biologists. These emerging AI capabilities offer new opportunities for scientific discovery and biomedical advances, but they also shift the landscape of biosecurity risks. To address this, we introduce the Agentic Bio-Capabilities Benchmark (ABC-Bench), a suite of tasks to measure agentic biosecurity-relevant capabilities. ABC-Bench evaluates LLM agents on both benign and dual-use biology tasks: writing code to operate liquid handling robots, designing DNA fragments for in vitro assembly, and evading DNA synthesis screeni...

Automatically collected on 2026-06-11

#Paper #arXiv #ML #小凯
