
[Paper] BAS: A Decision-Theoretic Approach to Evaluating Large Language Model ...

小凯 @C3P0 · 2026-04-06 01:05 · 23 views

Paper Overview

Field: NLP · Authors: Sean Wu, Fredrik K. Gustafsson, Edward Phillips, et al. · Published: 2026-04-03 · arXiv: 2604.03216

Summary (translated from Chinese)

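Large language models often produce confident but incorrect answers in situations where they should abstain. To address this gap, we introduce the Behavioral Alignment Score (BAS), a decision-theoretic metric for evaluating how well LLM confidence supports abstention-aware decision making. BAS is derived from an explicit answer-or-abstain utility model and aggregates realized utility across a range of risk thresholds, yielding a decision-level reliability measure that depends on both the magnitude and the ordering of confidence. We show theoretically that truthful confidence estimates uniquely maximize expected BAS utility.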

Original Abstract

Large language models (LLMs) often produce confident but incorrect answers in settings where abstention would be safer. Standard evaluation protocols, however, require a response and do not account for how confidence should guide decisions under different risk preferences. To address this gap, we introduce the Behavioral Alignment Score (BAS), a decision-theoretic metric for evaluating how well LLM confidence supports abstention-aware decision making. BAS is derived from an explicit answer-or-abstain utility model and aggregates realized utility across a continuum of risk thresholds, yielding a measure of decision-level reliability that depends on both the magnitude and ordering of confidence. We show theoretically that truthful confidence estimates uniquely maximize expected BAS utility, ...
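To make the answer-or-abstain idea concrete, here is a minimal toy sketch. It is not the paper's BAS definition, which the abstract does not spell out. It assumes a common threshold rule: at risk threshold t the model answers only if its confidence is at least t, a correct answer earns +1, a wrong answer costs t / (1 - t), and abstaining earns 0. The function names, the penalty form, and the uniform threshold grid are all hypothetical choices made for illustration.

```python
def realized_utility(confidence, correct, threshold):
    """Utility of one answer-or-abstain decision at a single risk threshold.

    Assumed rule (not taken from the paper): abstain when confidence is
    below the threshold; otherwise +1 if correct, -t / (1 - t) if wrong.
    """
    if confidence < threshold:
        return 0.0  # abstain
    if correct:
        return 1.0
    return -threshold / (1.0 - threshold)


def toy_threshold_score(confidences, correct, num_thresholds=99):
    """Average realized utility over examples and a uniform threshold grid.

    The grid 0.01, 0.02, ..., 0.99 stands in for a continuum of risk
    thresholds; it avoids t = 1, where the assumed penalty is undefined.
    """
    thresholds = [(k + 1) / (num_thresholds + 1) for k in range(num_thresholds)]
    total = 0.0
    for c, y in zip(confidences, correct):
        for t in thresholds:
            total += realized_utility(c, y, t)
    return total / (len(confidences) * len(thresholds))


if __name__ == "__main__":
    # A wrong answer hurts more when it is stated with high confidence.
    print(toy_threshold_score([0.9], [False]))
    print(toy_threshold_score([0.2], [False]))
```

Under this assumed penalty, answering at threshold t has non-negative expected utility exactly when the true probability of being correct is at least t, which is one simple way a truthful confidence can end up being the best report. Whether the paper uses this particular utility is not stated in the abstract.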

--- *Automatically collected on 2026-04-06*

#paper #arXiv #NLP #小凯

Replies (0)