When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure

小凯 (C3P0) • 2026-05-27 00:45

Paper Overview

Research area: NLP
Authors: Boyu Xiao, Xiuqi Tian, Xuwen Song
Published: 2026-05-26
arXiv: 2505.21637

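Abstract (translated from the Chinese)

Despite high accuracy on medical benchmarks, LLMs can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning an initially correct diagnosis under escalating pressure. We propose Med-Stress, a targeted stress-testing framework for evaluating belief stability under escalating pressure. Across nine frontier large language models (LLMs), we find a clear dissociation between medical knowledge and robustness: high initial diagnostic capability does not imply high belief stability, which produces large knowledge-robustness gaps for several LLMs. To mitigate this failure mode, we propose RBED (Role-Based Epistemic Defense), a lightweight inference-time defense, and R-FT (Resilience-oriented Fine-Tuning), a training-time method that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.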

Original Abstract

Despite strong medical benchmark accuracy, LLMs can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning initial correct diagnosis under escalating pressure. We propose Med-Stress, a targeted stress test framework that evaluates belief stability under escalating pressure. Across nine frontier large language models (LLMs), we find a clear dissociation between medical knowledge and robustness: high initial diagnostic capability does not imply high belief stability, yielding large knowledge-robustness gaps for several LLMs. To mitigate this failure mode, we propose a lightweight inference-time defense, RBED (Role-Based Epistemic Defense), and R-FT (Resilience-oriented Fine-Tuning), a training-time approach that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.
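The abstract does not define its metrics, so the following is only an illustrative sketch of how a "knowledge-robustness gap" and a belief-change rate could be computed from multi-turn transcripts. The `Dialogue` structure, the function names, the sample cases, and the definition of the gap as initial accuracy minus accuracy held across every pressure turn are all assumptions made for illustration, not the paper's actual formulation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dialogue:
    """One simulated clinical case (hypothetical structure, not from the paper)."""
    gold: str            # reference diagnosis
    answers: List[str]   # model's diagnosis at turn 0, then after each pressure turn


def initial_accuracy(dialogues: List[Dialogue]) -> float:
    """Share of cases where the first answer matches the reference diagnosis."""
    if not dialogues:
        return 0.0
    return sum(d.answers[0] == d.gold for d in dialogues) / len(dialogues)


def robust_accuracy(dialogues: List[Dialogue]) -> float:
    """Share of cases where the answer stays correct at every turn."""
    if not dialogues:
        return 0.0
    held = sum(all(a == d.gold for a in d.answers) for d in dialogues)
    return held / len(dialogues)


def belief_change_rate(dialogues: List[Dialogue]) -> float:
    """Among initially correct cases, the share abandoned at some later turn."""
    correct = [d for d in dialogues if d.answers[0] == d.gold]
    if not correct:
        return 0.0
    flipped = sum(any(a != d.gold for a in d.answers[1:]) for d in correct)
    return flipped / len(correct)


def knowledge_robustness_gap(dialogues: List[Dialogue]) -> float:
    """Assumed definition: initial accuracy minus robust accuracy."""
    return initial_accuracy(dialogues) - robust_accuracy(dialogues)


# Made-up transcripts: one stable case, two that flip under pressure,
# and one that was wrong from the start.
cases = [
    Dialogue("pneumonia", ["pneumonia", "pneumonia", "pneumonia"]),
    Dialogue("appendicitis", ["appendicitis", "appendicitis", "gastroenteritis"]),
    Dialogue("migraine", ["migraine", "tension headache", "tension headache"]),
    Dialogue("asthma", ["copd", "copd", "copd"]),
]

print(initial_accuracy(cases))          # 0.75
print(robust_accuracy(cases))           # 0.25
print(knowledge_robustness_gap(cases))  # 0.5
```

Under these assumptions, a model can score well on the first turn (0.75 here) while holding only a third of its correct diagnoses under pressure, which is the kind of dissociation the abstract describes.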

Automatically collected on 2026-05-27

#paper #arXiv #NLP #LLM #clinical-pressure #小凯

