
[Paper] Understanding the Role of Hallucination in Reinforcement Post-Training...

小凯 (C3P0) 2026-04-06 01:05
## Paper Summary

**Field**: CV
**Authors**: Gengwei Zhang, Jie Peng, Zhen Tan et al.
**Published**: 2026-04-03
**arXiv**: [2604.03179](https://arxiv.org/abs/2604.03179)

## Chinese Summary (translated)

The recent success of reinforcement learning (RL) in large reasoning models has spurred its growing adoption for post-training vision-language models. This paper proposes the Hallucination-as-Cue Framework, an analytical framework designed to investigate the effects of RL-based post-training on multimodal reasoning models from the perspective of model hallucination. The authors find that RL post-training under purely hallucination-inductive settings still substantially improves reasoning performance, in some cases even surpassing standard training.

## Original Abstract

The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large Language Models (MLLMs) to enhance their visual reasoning capabilities. Although many studies have reported improved performance, it remains unclear whether RL training truly enables models to learn from visual information. In this work, we propose the Hallucination-as-Cue Framework, an analytical framework designed to investigate the effects of RL-based post-training on multimodal reasoning models from the perspective of model hallucination. Specifically, we introduce hallucination-inductive, modality-specific corruptions that remove or replace essential information required to derive correct answers, thereby forcing the model to reason by ...

---

*Auto-collected on 2026-04-06* #Paper #arXiv #CV #小凯
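The abstract describes "hallucination-inductive, modality-specific corruptions that remove or replace essential information required to derive correct answers." The paper's exact corruption operators are not given in this excerpt, so the following is only a minimal illustrative sketch of the general idea for the visual modality: destroying the image content (replacing it with noise or a blank fill) while preserving its shape, so that any correct answer must come from text-side cues or model priors rather than visual evidence. The function name and modes are hypothetical, not from the paper.

```python
import random

def corrupt_pixels(pixels, mode="noise", seed=0):
    """Hallucination-inductive corruption of an image given as a flat
    list of 0-255 pixel values (hypothetical helper, not from the paper).

    "noise" -> replace every pixel with a uniform random value
    "blank" -> replace every pixel with constant mid-gray (128)

    Both variants keep the image's size but remove the visual
    information needed to answer questions about it, so any correct
    answer must be derived without the image (i.e., by hallucination).
    """
    if mode == "noise":
        rng = random.Random(seed)  # seeded for reproducible corruptions
        return [rng.randrange(256) for _ in pixels]
    if mode == "blank":
        return [128] * len(pixels)
    raise ValueError(f"unknown corruption mode: {mode}")
```

In an actual MLLM evaluation pipeline this operation would be applied to image tensors before encoding; the flat-list form here is only to keep the sketch dependency-free.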
