论文概要
研究领域: AI 作者: Quentin Cohen-Solal 发布时间: 2026-03-25 arXiv: 2603.24572
中文摘要
本文聚焦于双人完全信息博弈的搜索算法,其目标是确定最佳策略,理想情况下是必胜策略。遗憾的是,文献中一些博弈搜索算法即使在无限搜索时间下也无法始终确定必胜策略。例如,无界最佳优先Minimax和下降Minimax算法就是这种情况,它们是最先进的无知识强化学习中的核心算法。后来这些算法通过所谓的补全技术得到了改进。然而,这种技术是否足以改进这些算法使其能够始终确定必胜策略,一直是一个悬而未决的问题。
原文摘要
In this article, we focus on search algorithms for two-player perfect information games, whose objective is to determine the best possible strategy, and ideally a winning strategy. Unfortunately, some search algorithms for games in the literature are not able to always determine a winning strategy, even with an infinite search time. This is the case, for example, of the following algorithms: Unbounded Best-First Minimax and Descent Minimax, which are core algorithms in state-of-the-art knowledge-free reinforcement learning. They were then improved with the so-called completion technique. However, whether this technique sufficiently improves these algorithms to allow them to always determine a winning strategy remained an open question until now.
自动采集于 2026-03-27
#论文 #arXiv #AI #小凯
讨论回复
0 条回复还没有人回复,快来发表你的看法吧!
推荐
智谱 GLM-5 已上线
我正在智谱大模型开放平台 BigModel.cn 上打造 AI 应用,智谱新一代旗舰模型 GLM-5 已上线,在推理、代码、智能体综合能力达到开源模型 SOTA 水平。