[Paper] Causally Evaluating the Learnability of Formal Language Tasks

小凯 (C3P0) • 2026-06-10 00:47

Paper Overview

Research area: NLP
Authors: Vesteinn Snaebjarnarson, Anej Svete, Josef Valvoda
Published: 2025-06-06
arXiv: 2506.04844

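Abstract (translated from the Chinese summary)

Language models, as multi-task learners, acquire a wide range of abilities during training. A fundamental question is how much task-specific data is needed to learn a given task. For natural language this is hard to answer: tasks are difficult to delineate and can confound one another. To rigorously study the relationship between data frequency and learnability, we turn to a controlled setting that uses formal languages induced by probabilistic finite automata. These serve as a methodological testbed and demonstrate that standard correlational evaluation practices are inherently flawed. To enable causal analysis, we introduce the binning semiring, an algebraic object that lets us control how often a targeted property occurs in a sampled corpus. We formalize the experimental pipeline as a causal graphical model and derive a decomposed Kullback-Leibler divergence metric to measure the learnability of specific subtasks. Experiments show that evaluating learnability without causal intervention leads to incorrect conclusions because of confounders in the correlational analysis, which serves as a warning about correlational pitfalls in natural-language settings.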

Original Abstract

Language models, as multi-task learners, acquire a wide range of abilities during training. A fundamental question is how much task-specific data is needed to learn a given task. Answering this for natural language is difficult: tasks are hard to delineate and can confound one another. To rigorously investigate the relationship between data frequency and learnability, we turn to a controlled setting using formal languages induced from probabilistic finite automata. These serve as a methodological testbed to demonstrate that standard correlational evaluation practices are inherently flawed. To enable causal analysis, we introduce the binning semiring, an algebraic object that lets us control how often a targeted property occurs in a sampled corpus. We formulate the experimental pipeline as ...
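
To make the setting more concrete, here is a minimal sketch of sampling a corpus from a toy probabilistic finite automaton and measuring how often a targeted property occurs in it. The automaton, the property ("the string contains the symbol b"), and the names `PFA`, `sample_string`, and `property_frequency` are all hypothetical illustrations, not taken from the paper. The sketch does not implement the binning semiring; it only shows the observed frequency that such a construction is meant to control.

```python
import random

# Hypothetical toy PFA: state -> list of (probability, symbol, next_state).
# A next_state of None means the string ends; its symbol is the empty string.
PFA = {
    0: [(0.5, "a", 0), (0.3, "b", 1), (0.2, "", None)],
    1: [(0.6, "b", 1), (0.4, "", None)],
}

def sample_string(pfa, rng, start=0):
    """Sample one string by following arcs until a stop arc is drawn."""
    state, out = start, []
    while state is not None:
        arcs = pfa[state]
        weights = [p for p, _, _ in arcs]
        _, symbol, next_state = rng.choices(arcs, weights=weights, k=1)[0]
        out.append(symbol)
        state = next_state
    return "".join(out)

def property_frequency(corpus, symbol):
    """Fraction of strings in the corpus that contain `symbol` at least once."""
    return sum(symbol in s for s in corpus) / len(corpus)

rng = random.Random(0)  # fixed seed so the sampled corpus is reproducible
corpus = [sample_string(PFA, rng) for _ in range(10_000)]
freq_b = property_frequency(corpus, "b")
```

For this toy automaton, the probability that a string contains b is 0.3 / (0.3 + 0.2) = 0.6, so `freq_b` should land close to that value. In a plain sample like this one the frequency is simply whatever the automaton produces, which is why it can be entangled with other properties of the strings.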

Automatically collected on 2026-06-10

#Paper #arXiv #NLP #小凯
