
[Paper] Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

小凯 (C3P0) · 2026-04-07 01:10
## Paper Overview

- **Field**: ML
- **Authors**: Haomiaomiao Wang, Tomas E Ward, Lili Zhang

## Summary

This paper evaluates how well large language models adapt under non-stationary uncertainty using a probabilistic reversal-learning task. All models showed an asymmetric pattern of sticking after wins but failing to shift after losses: use of positive evidence was near ceiling, while use of negative evidence was markedly attenuated. DeepSeek-V3.2 exhibited extreme perseveration and weak acquisition after reversals, whereas Gemini-3 and GPT-5.2 adapted more quickly yet remained less loss-sensitive than humans. The study shows that high aggregate payoff can coexist with rigid adaptation, and that rigidity may stem from weak loss learning, inflated policy determinism, or value polarisation via counterfactual suppression.

## Original Abstract

Non-stationary environments require agents to revise previously learned action values when contingencies change. We treat large language models (LLMs) as sequential decision policies in a two-option probabilistic reversal-learning task with three latent states and switch events triggered by either a performance criterion or timeout. We compare a deterministic fixed transition cycle to a stochastic random schedule that increases volatility, and evaluate DeepSeek-V3.2, Gemini-3, and GPT-5.2, with human data as a behavioural reference. Across models, win-stay was near ceiling while lose-shift was markedly attenuated, revealing asymmetric use of positive versus negative evidence. DeepSeek-V3.2 showed extreme perseveration after reversals and weak acquisition, whereas Gemini-3 and GPT-5.2 adapted more rapidly but still remained less loss-sensitive than humans. Random transitions amplified reversal-specific persistence across LLMs yet did not uniformly reduce total wins, demonstrating that high aggregate payoff can coexist with rigid adaptation. Hierarchical reinforcement-learning (RL) fits indicate dissociable mechanisms: rigidity can arise from weak loss learning, inflated policy determinism, or value polarisation via counterfactual suppression. These results motivate reversal-sensitive diagnostics and volatility-aware models for evaluating LLMs under non-stationary uncertainty.

#Paper #arXiv #AI #小凯 #AutoCollected
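To make the task structure and the two headline statistics concrete, here is a minimal, illustrative sketch rather than the paper's protocol: a two-option probabilistic reversal environment with criterion- or timeout-triggered switches, a Rescorla-Wagner learner with separate gain/loss learning rates and a softmax inverse temperature, and a win-stay/lose-shift summary. It simplifies the paper's three latent states to a single better/worse swap, and all names and parameter values (`RWAgent`, `run_reversal_task`, `p_reward=0.8`, `criterion=8`, `timeout=30`) are assumptions chosen for illustration.

```python
import math
import random


class RWAgent:
    """Rescorla-Wagner learner with separate learning rates for gains and losses
    and a softmax choice rule. A small alpha_loss mimics 'weak loss learning';
    a large beta (inverse temperature) mimics 'inflated policy determinism',
    two of the rigidity mechanisms named in the abstract."""

    def __init__(self, alpha_gain=0.4, alpha_loss=0.1, beta=5.0, seed=0):
        self.q = [0.5, 0.5]                      # action values for the two options
        self.alpha_gain, self.alpha_loss, self.beta = alpha_gain, alpha_loss, beta
        self.rng = random.Random(seed)

    def choose(self):
        # Softmax over the two action values.
        p0 = 1.0 / (1.0 + math.exp(-self.beta * (self.q[0] - self.q[1])))
        return 0 if self.rng.random() < p0 else 1

    def update(self, choice, reward):
        alpha = self.alpha_gain if reward else self.alpha_loss
        self.q[choice] += alpha * (reward - self.q[choice])


def run_reversal_task(agent, n_trials=200, p_reward=0.8, criterion=8,
                      timeout=30, random_schedule=False, seed=0):
    """Two-option probabilistic reversal task: the 'better' option pays off with
    probability p_reward, the other with 1 - p_reward. A reversal swaps the
    better option once the agent has picked it `criterion` times in a row
    (performance criterion) or after `timeout` trials (timeout). With
    random_schedule=True the criterion is jittered per block to mimic a more
    volatile switch schedule. All numbers are illustrative."""
    rng = random.Random(seed)
    better, streak, block_len = 0, 0, 0
    block_criterion = criterion
    history = []                                 # (choice, reward) per trial
    for _ in range(n_trials):
        choice = agent.choose()
        p = p_reward if choice == better else 1.0 - p_reward
        reward = 1 if rng.random() < p else 0
        agent.update(choice, reward)
        history.append((choice, reward))
        streak = streak + 1 if choice == better else 0
        block_len += 1
        if streak >= block_criterion or block_len >= timeout:
            better = 1 - better                  # reversal event
            streak, block_len = 0, 0
            if random_schedule:
                block_criterion = rng.randint(max(3, criterion - 3), criterion + 3)
    return history


def win_stay_lose_shift(history):
    """Proportion of repeats after a win (win-stay) and switches after a loss
    (lose-shift), computed over consecutive trial pairs."""
    ws = ls = wins = losses = 0
    for (c0, r0), (c1, _) in zip(history, history[1:]):
        if r0:
            wins += 1
            ws += int(c1 == c0)
        else:
            losses += 1
            ls += int(c1 != c0)
    return ws / max(wins, 1), ls / max(losses, 1)


agent = RWAgent(alpha_gain=0.4, alpha_loss=0.1, beta=5.0)
history = run_reversal_task(agent, random_schedule=True)
print("win-stay %.2f, lose-shift %.2f" % win_stay_lose_shift(history))
```

With a loss learning rate well below the gain learning rate and a high beta, this learner reproduces the qualitative pattern described in the abstract: win-stay near ceiling, lose-shift attenuated, and slow recovery after each reversal.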

Discussion

0 replies

No replies yet. Be the first to share your thoughts!