[Paper] CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection

Paper Overview

Research area: ML. Authors: Esma Aïmeur, Gilles Brassard, Dorsaf Sallami

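Chinese Abstract (English translation)

The spread of fake news across all sectors highlights key limitations of current detection systems, which often show narrow domain specificity and poor generalization. Existing cross-domain methods face two key challenges: (1) dependence on labelled data, which is often unavailable and resource-intensive to obtain; and (2) information loss caused by rigid domain categorization or by neglecting domain-specific features. To address these problems, we propose CoALFake, a new method for cross-domain fake news detection that combines human-large language model (LLM) co-annotation with domain-aware active learning (AL). Our method uses LLMs for scalable, low-cost annotation while retaining human oversight to ensure label reliability. By integrating domain embedding techniques, CoALFake dynamically captures domain-specific nuances and cross-domain patterns, enabling the training of a domain-agnostic model. In addition, a domain-aware sampling strategy optimizes sample acquisition by prioritizing diverse domain coverage. Experimental results across multiple datasets show that the proposed method consistently outperforms various baselines. Our results underline that human-LLM co-annotation is a highly cost-effective approach that delivers excellent performance. Evaluations on multiple datasets show that CoALFake consistently outperforms a range of existing baselines, even with minimal human oversight.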

Original Abstract

The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled data, which is frequently unavailable and resource-intensive to acquire, and (2) information loss caused by rigid domain categorization or neglect of domain-specific features. To address these issues, we propose CoALFake, a novel approach for cross-domain fake news detection that integrates Human-Large Language Model (LLM) co-annotation with domain-aware Active Learning (AL). Our method employs LLMs for scalable, low-cost annotation while maintaining human oversight to ensure label reliability. By integrating domain embedding techniques, CoALFake dynamically captures both domain-specific nuances and cross-domain patterns, enabling the training of a domain-agnostic model. Furthermore, a domain-aware sampling strategy optimizes sample acquisition by prioritizing diverse domain coverage. Experimental results across multiple datasets demonstrate that the proposed approach consistently outperforms various baselines. Our results emphasize that human-LLM co-annotation is a highly cost-effective approach that delivers excellent performance. Evaluations across several datasets show that CoALFake consistently outperforms a range of existing baselines, even with minimal human oversight.
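
The abstract names two mechanisms, domain-aware sampling and human-LLM co-annotation, without giving their details. The Python sketch below is only an illustration of how such a loop might be wired together, not the authors' algorithm. Everything in it is an assumption made for the example: the entropy-based uncertainty score, the round-robin pass over domains, the 0.8 confidence threshold, and the names `domain_aware_sample`, `co_annotate`, and `ask_human`.

```python
import math
from collections import defaultdict


def entropy(p_fake):
    """Binary entropy of a predicted fake-news probability (higher = less certain)."""
    if p_fake <= 0.0 or p_fake >= 1.0:
        return 0.0
    return -(p_fake * math.log2(p_fake) + (1 - p_fake) * math.log2(1 - p_fake))


def domain_aware_sample(pool, budget):
    """Pick up to `budget` items, cycling over domains so each one gets coverage.

    `pool` is a list of dicts with hypothetical keys "id", "domain", "p_fake".
    Within each domain, the most uncertain items are taken first.
    """
    by_domain = defaultdict(list)
    for item in pool:
        by_domain[item["domain"]].append(item)
    for items in by_domain.values():
        items.sort(key=lambda it: entropy(it["p_fake"]), reverse=True)

    selected = []
    domains = sorted(by_domain)  # fixed order keeps the result deterministic
    while len(selected) < budget and any(by_domain[d] for d in domains):
        for d in domains:
            if by_domain[d] and len(selected) < budget:
                selected.append(by_domain[d].pop(0))
    return selected


def co_annotate(item, llm_label, llm_confidence, ask_human, threshold=0.8):
    """Accept the LLM label when it is confident; otherwise defer to a human.

    `ask_human` is a hypothetical callback that returns a label for `item`.
    Returns a (label, source) pair.
    """
    if llm_confidence >= threshold:
        return llm_label, "llm"
    return ask_human(item), "human"
```

In this sketch, a pool dominated by one domain still yields a batch that touches every domain before any domain is sampled twice, and human effort is spent only on items where the LLM reports low confidence.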

--- *Automatically collected on 2026-04-07*

#paper #arXiv #AI #小凯 #auto-collected
