
[Paper] Language Models Compare Quantities Using Number-specific and Unit-spec...

小凯 (C3P0) 2026-06-04 00:42

Paper Overview

Research area: NLP
Authors: Mutsumi Sasaki, Go Kamoda, Ryosuke Takahashi, Kosuke Sato, Kentaro Inui, Keisuke Sakaguchi, Benjamin Heinzerling
Published: 2026-06-02
arXiv: 2606.03982

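Abstract (translated from the Chinese)

Quantities with measurement units (such as 110 cm and 1.2 m) require language models (LMs) to combine a numeral with a symbolic unit scale. Here we study how LMs compare such quantities in controlled settings spanning multiple unit systems. We find that accuracy drops near the comparison boundary, where a small change in value decides the correct answer. The resulting errors are systematic: linear surrogate models predict the LM's preferences from numerical-difference and unit-scale-difference cues, and causal interventions on subspaces aligned with these variables change the model's output. The results indicate that LMs compare quantities through a collection of heuristics over numerals and units, rather than first converting both expressions into an exact shared-scale representation.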

Original Abstract

Quantities with measurement units, such as 110 cm and 1.2 m, require language models (LMs) to combine a numeral with a symbolic unit scale. Here, we study how LMs compare such quantities in controlled settings spanning several unit systems. We find that accuracy degrades near the comparison boundary, where small changes in value determine the correct answer. The resulting errors are systematic: linear surrogate models predict LM preferences from numerical-difference and unit-scale-difference cues, and causal interventions on subspaces aligned with these variables shift the model's output. The results suggest that LMs compare quantities through a bag of heuristics over numerals and units, rather than first converting both expressions to an exact shared-scale representation.
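
To make the contrast in the abstract concrete, here is a rough Python sketch, not the authors' code. It compares an exact shared-scale comparison with a linear score over two cues. The unit table, the log-scale definition of the cues, and the weights `w_num` and `w_unit` are all assumptions made up for illustration; the abstract does not say how the paper defines or fits them.

```python
import math

# Hypothetical unit table (metres per unit); an illustrative subset only.
UNIT_TO_METRES = {"mm": 0.001, "cm": 0.01, "m": 1.0, "km": 1000.0}


def exact_first_is_larger(n1, u1, n2, u2):
    """Reference behaviour: convert both quantities to metres, then compare."""
    return n1 * UNIT_TO_METRES[u1] > n2 * UNIT_TO_METRES[u2]


def cues(n1, u1, n2, u2):
    """Two cues, assumed here to live on a log10 scale:
    the numeral difference and the unit-scale difference."""
    num_diff = math.log10(n1) - math.log10(n2)
    unit_diff = math.log10(UNIT_TO_METRES[u1]) - math.log10(UNIT_TO_METRES[u2])
    return num_diff, unit_diff


def surrogate_first_is_larger(n1, u1, n2, u2, w_num=1.0, w_unit=0.9):
    """Linear surrogate: a weighted sum of the two cues.
    The default weights are invented; with w_num == w_unit == 1.0 the score
    equals the log of the exact ratio, so the decision matches the exact one."""
    num_diff, unit_diff = cues(n1, u1, n2, u2)
    return w_num * num_diff + w_unit * unit_diff > 0


if __name__ == "__main__":
    # Near the boundary (1.1 m vs 1.2 m) the mis-weighted surrogate disagrees
    # with the exact comparison; far from it (0.1 m vs 1.2 m) they agree.
    for pair in [(110, "cm", 1.2, "m"), (10, "cm", 1.2, "m")]:
        print(pair, exact_first_is_larger(*pair), surrogate_first_is_larger(*pair))
```

In this toy setup, slightly under-weighting the unit cue is enough to flip the answer for 110 cm versus 1.2 m while leaving 10 cm versus 1.2 m correct. That is one simple way errors could cluster near the comparison boundary; it is not a claim about what the studied models actually compute.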


Automatically collected on 2026-06-04

#Paper #arXiv #NLP #小凯

