[Paper] CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection

Paper Overview

Field: ML
Authors: Esma Aimeur, Gilles Brassard, Dorsaf Sallami

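Summary

This paper proposes CoALFake, a new method for cross-domain fake news detection that integrates human-LLM co-annotation with domain-aware active learning. The method uses LLMs for scalable, low-cost annotation while retaining human oversight to ensure label reliability. By integrating domain embedding techniques, CoALFake dynamically captures domain-specific nuances and cross-domain patterns, enabling the training of a domain-agnostic model. In addition, a domain-aware sampling strategy optimizes sample acquisition by prioritizing diverse domain coverage. Experiments on multiple datasets show that the method consistently outperforms a range of baselines, even with minimal human oversight.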

Original Abstract

The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled data, which is frequently unavailable and resource-intensive to acquire, and (2) information loss caused by rigid domain categorization or neglect of domain-specific features. To address these issues, we propose CoALFake, a novel approach for cross-domain fake news detection that integrates Human-Large Language Model (LLM) co-annotation with domain-aware Active Learning (AL). Our method employs LLMs for scalable, low-cost annotation while maintaining human oversight to ensure label reliability. By integrating domain embedding techniques, CoALFake dynamically captures both domain-specific nuances and cross-domain patterns, enabling the training of a domain-agnostic model. Furthermore, a domain-aware sampling strategy optimizes sample acquisition by prioritizing diverse domain coverage. Experimental results across multiple datasets demonstrate that the proposed approach consistently outperforms various baselines. Our results emphasize that human-LLM co-annotation is a highly cost-effective approach that delivers excellent performance. Evaluations across several datasets show that CoALFake consistently outperforms a range of existing baselines, even with minimal human oversight.
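
The abstract names two mechanisms without giving their details: a sampling strategy that prioritizes diverse domain coverage, and LLM annotation with human oversight. The sketch below is one possible reading of those ideas, not the authors' algorithm. The names `Candidate`, `domain_aware_batch`, and `route_annotation`, the round-robin selection rule, the uncertainty score, and the 0.8 confidence threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    item_id: str
    domain: str         # hypothetical domain tag, e.g. "politics"
    uncertainty: float  # assumed model uncertainty in [0, 1]; higher = less sure


def domain_aware_batch(pool, batch_size):
    """Pick up to batch_size items, cycling over domains so none dominates.

    Within each domain the most uncertain items are taken first.
    """
    by_domain = defaultdict(list)
    for cand in pool:
        by_domain[cand.domain].append(cand)
    for items in by_domain.values():
        items.sort(key=lambda c: c.uncertainty, reverse=True)

    batch = []
    domains = sorted(by_domain)  # fixed order keeps the result deterministic
    rank = 0
    while len(batch) < batch_size:
        added = False
        for d in domains:
            if rank < len(by_domain[d]) and len(batch) < batch_size:
                batch.append(by_domain[d][rank])
                added = True
        if not added:  # every domain is exhausted
            break
        rank += 1
    return batch


def route_annotation(llm_label, llm_confidence, threshold=0.8):
    """Keep the LLM label when it is confident; otherwise defer to a human.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if llm_confidence >= threshold:
        return ("llm", llm_label)
    return ("human", None)
```

In this sketch a domain with few pool items still contributes to each batch before any larger domain contributes a second item, which is one simple way to favor coverage over raw uncertainty.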

#Paper #arXiv #AI #小凯 #AutoCollected
