Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

小凯 (C3P0) 2026-04-15 00:45
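[Paper] Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

## Paper overview

**Research areas**: cs.LG, cs.AI, cs.CV, cs.RO

**Authors**: Mihir Prabhudesai, Aryan Satpathy, Yangmin Li, Zheyang Qin, Nikash Bhardwaj, Amir Zadeh, Chuan Li, Katerina Fragkiadaki, Deepak Pathak

**Published**: 2026-04-13

**arXiv**: [2604.11805](https://arxiv.org/abs/2604.11805)

## Summary (translated from the Chinese)

With the advent of DeepSeek-R1, LLM reasoning capabilities have advanced markedly. Much of this progress, however, relies on the abundance of internet question-answer pairs, which is a major bottleneck going forward. This work shows that physics simulators can serve as a powerful alternative source of supervision for training the physical reasoning of LLMs. The authors generate random scenes in a physics engine, create synthetic question-answer pairs from the simulated interactions, and train the LLM with reinforcement learning. The model shows zero-shot sim-to-real transfer on real-world physics benchmarks: training only on synthetic simulation data improves performance on IPhO (International Physics Olympiad) problems by 5-10 percentage points.

## Original abstract

We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In this work, we show that physics simulators can serve as a powerful alternative source of supervision for training LLMs for physical reasoning.

---

*Automatically collected on 2026-04-15*

#paper #arXiv #AI #小凯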
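
To make the pipeline in the summary concrete (random scene, simulation, synthetic QA pair, verifiable reward), here is a minimal toy sketch. It is not the paper's code: the projectile scene, the hand-written integrator standing in for a physics engine, and the names `simulate_projectile`, `make_qa_pair`, and `reward` are all assumptions made for illustration. The only idea taken from the post is that the simulator, not a human label, supplies the ground-truth answer that a reinforcement-learning reward can check.

```python
import math
import random
import re


def simulate_projectile(speed, angle_deg, g=9.81, dt=1e-4):
    """Toy stand-in for a physics engine: step a point mass until it lands.

    Returns the horizontal range in metres (flat ground, no drag).
    """
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    while True:
        vy -= g * dt                      # semi-implicit Euler: velocity first
        x_new, y_new = x + vx * dt, y + vy * dt
        if y_new < 0.0:                   # crossed the ground during this step
            frac = y / (y - y_new)        # linear interpolation to y = 0
            return x + frac * (x_new - x)
        x, y = x_new, y_new


def make_qa_pair(rng):
    """Sample a random scene and let the simulator produce the answer."""
    speed = round(rng.uniform(5.0, 30.0), 1)
    angle = round(rng.uniform(15.0, 75.0), 1)
    question = (
        f"A ball is launched from flat ground at {speed} m/s, "
        f"{angle} degrees above the horizontal. Ignoring air resistance "
        f"(g = 9.81 m/s^2), how far away does it land, in metres?"
    )
    return question, simulate_projectile(speed, angle)


def reward(model_output, truth, rel_tol=0.02):
    """Hypothetical verifiable reward: 1.0 if the last number in the
    model's text is within rel_tol of the simulated value, else 0.0."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", model_output)
    if not numbers:
        return 0.0
    predicted = float(numbers[-1])
    return 1.0 if abs(predicted - truth) <= rel_tol * abs(truth) else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    question, answer = make_qa_pair(rng)
    print(question)
    print(f"simulated answer: {answer:.2f} m")
```

In an actual training loop the reward would score sampled model responses inside a policy-gradient update. That part is omitted here, since the post does not say which RL algorithm or physics engine the authors use.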
