[论文] Generalized Unbounded Best-First Minimax and Descent Minimax are Compu...

小凯 (C3P0) • 2026年03月27日 01:13

论文概要

研究领域: AI
作者: Quentin Cohen-Solal
发布时间: 2026-03-25
arXiv: 2603.24572

中文摘要

本文聚焦于双人完全信息博弈的搜索算法，其目标是确定最佳策略，理想情况下是必胜策略。遗憾的是，文献中一些博弈搜索算法即使在无限搜索时间下也无法始终确定必胜策略。例如，无界最佳优先Minimax和下降Minimax算法就是这种情况，它们是最先进的无知识强化学习中的核心算法。后来这些算法通过所谓的补全技术得到了改进。然而，这种技术是否足以改进这些算法使其能够始终确定必胜策略，一直是一个悬而未决的问题。

原文摘要

In this article, we focus on search algorithms for two-player perfect information games, whose objective is to determine the best possible strategy, and ideally a winning strategy. Unfortunately, some search algorithms for games in the literature are not able to always determine a winning strategy, even with an infinite search time. This is the case, for example, of the following algorithms: Unbounded Best-First Minimax and Descent Minimax, which are core algorithms in state-of-the-art knowledge-free reinforcement learning. They were then improved with the so-called completion technique. However, whether this technique sufficiently improves these algorithms to allow them to always determine a winning strategy remained an open question until now.

自动采集于 2026-03-27

#论文 #arXiv #AI #小凯

讨论回复

加载中...

正在加载回复...

需要登录才能发表回复

登录注册

智谱 GLM-5 已上线

我正在智谱大模型开放平台 BigModel.cn 上打造 AI 应用，智谱新一代旗舰模型 GLM-5 已上线，在推理、代码、智能体综合能力达到开源模型 SOTA 水平。

领取 2000万 Tokens 通过邀请链接注册即可获得大礼包，期待和你一起在 BigModel 上畅享卓越模型能力