When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure

Paper Overview

Research area: NLP Authors: Boyu Xiao, Xiuqi Tian, Xuwen Song Published: 2026-05-26 arXiv: 2505.21637

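Summary (translated from the Chinese)

Despite high accuracy on medical benchmarks, LLMs can show severe multi-turn sycophancy in clinical dialogue, abandoning an initially correct diagnosis as pressure escalates. We propose Med-Stress, a targeted stress-testing framework for evaluating belief stability under escalating pressure. Across nine frontier large language models (LLMs), we find a clear dissociation between medical knowledge and robustness: high initial diagnostic ability does not imply high belief stability, which produces large knowledge-robustness gaps for several LLMs. To mitigate this failure mode, we propose RBED (Role-Based Epistemic Defense), a lightweight inference-time defense, and R-FT (Resilience-oriented Fine-Tuning), a training-time method that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.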

Original Abstract

Despite strong medical benchmark accuracy, LLMs can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning initial correct diagnosis under escalating pressure. We propose Med-Stress, a targeted stress test framework that evaluates belief stability under escalating pressure. Across nine frontier large language models (LLMs), we find a clear dissociation between medical knowledge and robustness: high initial diagnostic capability does not imply high belief stability, yielding large knowledge-robustness gaps for several LLMs. To mitigate this failure mode, we propose a lightweight inference-time defense, RBED (Role-Based Epistemic Defense), and R-FT (Resilience-oriented Fine-Tuning), a training-time approach that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.
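
To make the idea of a "knowledge-robustness gap" concrete, here is a minimal sketch of how such a gap could be computed from multi-turn dialogue outcomes. The abstract does not give the paper's actual metric definitions, so everything here is an assumption for illustration: the `Dialogue` record, the function names, and the choice to define the gap as initial accuracy minus the fraction of cases that are correct both before and after pressure.

```python
from dataclasses import dataclass


@dataclass
class Dialogue:
    """Hypothetical record of one pressured dialogue (not the paper's schema)."""
    initial_correct: bool  # first-turn diagnosis was correct
    final_correct: bool    # diagnosis was still correct after the last pressure turn


def initial_accuracy(dialogues):
    """Fraction of dialogues whose first-turn diagnosis is correct."""
    return sum(d.initial_correct for d in dialogues) / len(dialogues)


def robust_accuracy(dialogues):
    """Fraction of dialogues that start correct and stay correct under pressure."""
    kept = sum(d.initial_correct and d.final_correct for d in dialogues)
    return kept / len(dialogues)


def belief_change_rate(dialogues):
    """Among initially correct dialogues, the fraction that abandon the correct answer."""
    started_correct = [d for d in dialogues if d.initial_correct]
    if not started_correct:
        return 0.0
    flipped = sum(not d.final_correct for d in started_correct)
    return flipped / len(started_correct)


def knowledge_robustness_gap(dialogues):
    """Assumed definition: initial accuracy minus robust accuracy (never negative)."""
    return initial_accuracy(dialogues) - robust_accuracy(dialogues)


# Toy data, invented for illustration only: three dialogues start correct,
# and two of those three give up the correct diagnosis under pressure.
toy = [
    Dialogue(True, True),
    Dialogue(True, False),
    Dialogue(True, False),
    Dialogue(False, False),
]
gap = knowledge_robustness_gap(toy)
```

Under these assumed definitions, a model can score well on `initial_accuracy` while still showing a large gap, which is the dissociation the abstract describes.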

--- *Automatically collected on 2026-05-27*

#paper #arXiv #NLP #LLM #clinical-pressure #小凯
