Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

[Paper] Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

Paper Overview

Research areas: cs.LG, cs.AI, cs.CV, cs.RO
Authors: Mihir Prabhudesai, Aryan Satpathy, Yangmin Li, Zheyang Qin, Nikash Bhardwaj, Amir Zadeh, Chuan Li, Katerina Fragkiadaki, Deepak Pathak
Published: 2026-04-13
arXiv: 2604.11805

Summary (translated from Chinese)

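With the arrival of DeepSeek-R1, LLM reasoning capabilities have advanced markedly. However, this progress depends heavily on the abundance of question-answer pairs on the internet, which is a major bottleneck going forward. This work shows that physics simulators can serve as a powerful alternative source of supervision for training LLMs in physical reasoning. We generate random scenes in a physics engine, create synthetic question-answer pairs from the simulated interactions, and train LLMs with reinforcement learning. The resulting models show zero-shot sim-to-real transfer on real-world physics benchmarks: training only on synthetic simulation data improves performance on IPhO (International Physics Olympiad) problems by 5-10 percentage points.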
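To make the pipeline in the summary concrete, here is a minimal toy sketch of its first two steps plus a checkable reward. It is an illustration under stated assumptions, not the paper's implementation: the summary does not name the physics engine, the question templates, or the reward, so the hand-written projectile integrator and the names `simulate_projectile`, `make_qa_pair`, and `reward` are all hypothetical stand-ins.

```python
import math
import random


def simulate_projectile(speed, angle_deg, g=9.81, dt=1e-4):
    """Toy stand-in for a physics engine (assumption, not the paper's simulator).

    Integrates a point mass launched from level ground and returns the
    horizontal distance (metres) at which it lands.
    """
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    while True:
        # Semi-implicit Euler step: update velocity first, then position.
        vy_new = vy - g * dt
        x_new, y_new = x + vx * dt, y + vy_new * dt
        if y_new < 0.0:
            # Interpolate linearly to the point where the ball crosses y = 0.
            frac = y / (y - y_new)
            return x + frac * (x_new - x)
        x, y, vy = x_new, y_new, vy_new


def make_qa_pair(rng):
    """Sample a random scene and turn its simulated outcome into a QA pair."""
    speed = round(rng.uniform(5.0, 30.0), 1)
    angle = round(rng.uniform(20.0, 70.0), 1)
    question = (
        f"A ball is launched from level ground at {speed} m/s, "
        f"{angle} degrees above the horizontal. Ignoring air resistance "
        f"and taking g = 9.81 m/s^2, how far away does it land (in metres)?"
    )
    answer = simulate_projectile(speed, angle)
    return question, answer


def reward(model_answer, reference, rel_tol=0.02):
    """Hypothetical binary reward: 1.0 if the answer matches the simulator."""
    try:
        value = float(model_answer)
    except (TypeError, ValueError):
        return 0.0
    return 1.0 if math.isclose(value, reference, rel_tol=rel_tol) else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    question, answer = make_qa_pair(rng)
    print(question)
    print(f"simulated answer: {answer:.2f} m")
```

In a reinforcement learning loop, the question would be given to the LLM and `reward` applied to its final numeric answer. The point of the sketch is that the simulator supplies the ground truth, so no human-written answer key is needed.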

Original Abstract

We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In this work, we show that physics simulators can serve as a powerful alternative source of supervision for training LLMs for physical reasoning.

--- *Automatically collected on 2026-04-15*

#Paper #arXiv #AI #小凯
