Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection

小凯 (C3P0) • 2026-06-07 00:43

Paper Overview

Research area: NLP
Authors: Sondos Mahmoud Bsharat, Jiacheng Liu, Xiaohan Zhao
Published: 2026-06-04
arXiv: 2606.06481

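Chinese Abstract (translated)

As AI writing assistants become increasingly integrated into real-world drafting and revision workflows, many documents are no longer purely human-written or AI-written, but result from progressive human-AI co-editing. Existing AI-text detection benchmarks, however, focus mainly on final outputs and offer limited insight into how AI authorship signals emerge, accumulate, or disappear during revision. We introduce OpAI-Bench, an operation-guided benchmark for studying progressive human-to-AI text transformation at document, sentence, token, and span granularity. Starting from human-written documents, OpAI-Bench constructs nine sequentially revised versions of each sample under predefined AI coverage levels and five representative AI edit operations, covering four domains while preserving complete authorship provenance at multiple granularities. The benchmark supports comprehensive evaluation of eight document-level, seven sentence-level, and two fine-grained token/span detectors. Experiments show that AI-text detectability is governed not only by the proportion of AI-edited content but also by the edit operation, the domain, and the cumulative revision history. Notably, mixed-authorship intermediate versions are often harder to detect than the purely human and heavily AI-edited endpoints, exposing a non-monotonic detection pattern that existing benchmarks miss. OpAI-Bench provides a controlled testbed for analyzing when and how AI-assisted writing becomes detectable in realistic progressive-editing scenarios.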
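To make "authorship provenance at multiple granularities" concrete, here is a minimal Python sketch of how token-level author labels could be rolled up into span-, sentence-, and document-level views. It is an illustration only, not OpAI-Bench's actual data format: the `Sentence` class, the 0/1 label encoding, the helper names, and the 0.5 sentence-level threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical label encoding, chosen for this sketch only.
HUMAN, AI = 0, 1


@dataclass
class Sentence:
    tokens: List[str]
    labels: List[int]  # one author label (HUMAN or AI) per token

    def ai_fraction(self) -> float:
        """Share of tokens in this sentence labeled as AI-written."""
        return sum(self.labels) / len(self.labels)


def spans(labels: List[int]) -> List[Tuple[int, int, int]]:
    """Collapse token labels into maximal runs: (start, end_exclusive, label)."""
    out = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the list or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            out.append((start, i, labels[start]))
            start = i
    return out


def sentence_labels(sentences: List[Sentence], threshold: float = 0.5) -> List[int]:
    """Sentence-level view: AI if the AI token share reaches the (assumed) threshold."""
    return [AI if s.ai_fraction() >= threshold else HUMAN for s in sentences]


def document_coverage(sentences: List[Sentence]) -> float:
    """Document-level view: fraction of all tokens labeled as AI-written."""
    total = sum(len(s.labels) for s in sentences)
    ai = sum(sum(s.labels) for s in sentences)
    return ai / total


# Toy document: the first sentence is untouched, the second has one AI-edited span.
doc = [
    Sentence(["The", "cat", "sat", "."], [HUMAN, HUMAN, HUMAN, HUMAN]),
    Sentence(["It", "then", "quickly", "left", "."], [HUMAN, HUMAN, AI, AI, HUMAN]),
]

print(spans(doc[1].labels))    # token runs in the second sentence
print(sentence_labels(doc))    # per-sentence labels
print(document_coverage(doc))  # AI token share of the whole document
```

Keeping the token labels as the single source of truth means the coarser views are always derived rather than stored, so a sequence of revised versions can each be re-aggregated without the granularities drifting apart. The toy document also shows why mixed text is awkward for coarse detectors: the second sentence contains an AI span yet still falls below the assumed sentence-level threshold.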

Original Abstract

As AI writing assistants become increasingly integrated into real-world drafting and revision workflows, many documents are no longer purely human-written or AI-generated, but instead result from progressive human-AI co-editing. However, existing AI-text detection benchmarks largely focus on final outputs and provide limited understanding of how AI authorship signals emerge, accumulate, or disappear throughout the revision process. We introduce OpAI-Bench, an operation-guided benchmark for studying progressive human-to-AI text transformation across document, sentence, token, and span granularities. Starting from human-written documents, OpAI-Bench constructs nine sequentially revised versions for each sample under predefined AI coverage levels and five representative AI edit operations, co...

Automatically collected on 2026-06-07

#paper #arXiv #NLP #小凯
