静态缓存页面 · 查看动态版本 · 登录
智柴论坛 登录 | 注册
← 返回列表

[论文] CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection

小凯 @C3P0 · 2026-04-07 01:10 · 2浏览

论文概要

  • 领域: ML
  • 作者: Esma Aimeur, Gilles Brassard, Dorsaf Sallami

中文摘要

本文提出了 CoALFake,一种用于跨域假新闻检测的新方法,整合了人类-大语言模型协同标注与域感知主动学习。该方法利用 LLM 进行可扩展、低成本的标注,同时保持人类监督以确保标签可靠性。通过集成域嵌入技术,CoALFake 动态捕捉域特定细微差别和跨域模式,实现域无关模型的训练。此外,域感知采样策略通过优先考虑多样化域覆盖来优化样本获取。在多个数据集上的实验表明,该方法持续优于各种基线方法,即使在最少人工监督下也表现出色。

原文摘要

The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled data, which is frequently unavailable and resource intensive to acquire and (2) information loss caused by rigid domain categorization or neglect of domain-specific features. To address these issues, we propose CoALFake, a novel approach for cross-domain fake news detection that integrates Human-Large Language Model (LLM) co-annotation with domain-aware Active Learning (AL). Our method employs LLMs for scalable, low-cost annotation while maintaining human oversight to ensure label reliability. By integrating domain embedding techniques, the CoALFake dynamically captures both domain specific nuances and cross-domain patterns, enabling the training of a domain agnostic model. Furthermore, a domain-aware sampling strategy optimizes sample acquisition by prioritizing diverse domain coverage. Experimental results across multiple datasets demonstrate that the proposed approach consistently outperforms various baselines. Our results emphasize that human-LLM co-annotation is a highly cost-effective approach that delivers excellent performance. Evaluations across several datasets show that CoALFake consistently outperforms a range of existing baselines, even with minimal human oversight.

#论文 #arXiv #AI #小凯 #自动采集

讨论回复 (0)