[Paper] VGB for Masked Diffusion Model: Efficient Test-time Scaling for R...
Paper Overview
Research area: ML | Authors: Kijung Jeon, Thuy-Duong Vuong, Molei Tao | Published: 2026-06-26 | arXiv: 2606.28301
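Chinese Abstract (translated)
Inference-time scaling is a promising paradigm for improving generative models, especially when outputs must satisfy structural constraints or optimize downstream rewards. We consider masked diffusion models (MDMs) and introduce MDM-VGB, a discrete diffusion sampler that augments unmasking generation with theoretically principled, reward-guided remasking. Inspired by the recent success of the classical Jerrum-Sinclair backtracking Markov chain in reward-tilted generation, MDM-VGB extends the backtracking random walk from a fixed prefix tree to a masked-state graph, allowing tokens to be unmasked and remasked at arbitrary positions. The resulting sampler favors unmasking and remasking moves that lead to higher-value partial configurations, enabling both effective high-reward generation and efficient repair of low-reward samples. We prove that MDM-VGB is robust to process-verifier noise and achieves quadratic complexity, whereas popular test-time heuristics such as best-of-N can incur exponential complexity due to error accumulation. Our theoretical findings are corroborated by strong empirical performance, particularly on Sudoku and QM9.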
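To make the idea of a backtracking walk on a masked-state graph more concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the value function `toy_value`, the neighbor rule, the value-proportional transition weights, and the stopping rule (stop at the first fully unmasked state) are all assumptions chosen for illustration. It only shows the general shape of a walk that can both unmask and remask tokens at arbitrary positions while preferring higher-value partial configurations.

```python
import random

MASK = None  # marker for a masked position (illustrative choice)


def toy_value(state):
    """Hypothetical value of a partial configuration.

    Returns 1.0 if no two adjacent unmasked tokens are equal, else 0.05.
    This stands in for a learned value/verifier; it is not from the paper.
    """
    for a, b in zip(state, state[1:]):
        if a is not MASK and b is not MASK and a == b:
            return 0.05
    return 1.0


def neighbors(state, vocab):
    """States reachable by unmasking one masked position (with any token)
    or remasking one unmasked position."""
    out = []
    for i, tok in enumerate(state):
        if tok is MASK:
            for v in vocab:
                out.append(state[:i] + (v,) + state[i + 1:])
        else:
            out.append(state[:i] + (MASK,) + state[i + 1:])
    return out


def backtracking_walk(length, vocab, value_fn, max_steps=10_000, seed=0):
    """Random walk over masked states; each move is sampled with probability
    proportional to the value of the destination state (an assumed rule).

    Returns (state, moves_taken). The walk stops at the first fully
    unmasked state, or after max_steps moves.
    """
    rng = random.Random(seed)
    state = (MASK,) * length
    for step in range(max_steps):
        if MASK not in state:
            return state, step
        cands = neighbors(state, vocab)
        weights = [value_fn(s) for s in cands]
        state = rng.choices(cands, weights=weights, k=1)[0]
    return state, max_steps


if __name__ == "__main__":
    final, moves = backtracking_walk(length=4, vocab=(0, 1), value_fn=toy_value)
    print("final state:", final, "moves:", moves, "value:", toy_value(final))
```

Because remasking moves are always available, a walk that has committed to a low-value partial configuration can undo individual tokens instead of restarting from scratch, which is the intuition the abstract contrasts with best-of-N.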
Original Abstract
Inference-time scaling is a promising paradigm to improve generative models, especially when outputs must satisfy structural constraints or optimize downstream rewards. We consider Masked Diffusion Model (MDM) and introduce MDM-VGB, a discrete diffusion sampler that augments unmasking generation with theoretically principled reward-guided remasking. Inspired by the recent success of the classical Jerrum-Sinclair backtracking Markov chain in reward-tilted generation, MDM-VGB extends the backtracking random walk from a fixed prefix tree to a masked-state graph, allowing tokens to be unmasked and remasked at arbitrary positions. The resulting sampler favors unmasking and remasking moves that lead to higher-value partial configurations, enabling both effective high-reward generation and effici...
--- *Automatically collected on 2026-06-30*
#Paper #arXiv #ML #小凯