[Paper] VGB for Masked Diffusion Model: Efficient Test-time Scaling for R...

Paper Overview

Field: ML | Authors: Kijung Jeon, Thuy-Duong Vuong, Molei Tao | Published: 2026-06-26 | arXiv: 2606.28301

Abstract (translated from the Chinese)

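Inference-time scaling is a promising paradigm for improving generative models, especially when outputs must satisfy structural constraints or optimize downstream rewards. We consider masked diffusion models (MDMs) and introduce MDM-VGB, a discrete diffusion sampler that augments unmasking generation with theoretically principled, reward-guided remasking. Inspired by the recent success of the classical Jerrum-Sinclair backtracking Markov chain in reward-tilted generation, MDM-VGB extends the backtracking random walk from a fixed prefix tree to a masked-state graph, so that tokens can be unmasked and remasked at arbitrary positions. The resulting sampler favors unmasking and remasking moves that lead to higher-value partial configurations, which enables both effective high-reward generation and efficient repair of low-reward samples. We prove that MDM-VGB is robust to process-verifier noise and achieves quadratic complexity, whereas popular test-time heuristics such as best-of-N can incur exponential complexity because of error accumulation. Our theoretical findings are corroborated by strong empirical performance, particularly on Sudoku and QM9.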

Original Abstract

Inference-time scaling is a promising paradigm to improve generative models, especially when outputs must satisfy structural constraints or optimize downstream rewards. We consider Masked Diffusion Model (MDM) and introduce MDM-VGB, a discrete diffusion sampler that augments unmasking generation with theoretically principled reward-guided remasking. Inspired by the recent success of the classical Jerrum-Sinclair backtracking Markov chain in reward-tilted generation, MDM-VGB extends the backtracking random walk from a fixed prefix tree to a masked-state graph, allowing tokens to be unmasked and remasked at arbitrary positions. The resulting sampler favors unmasking and remasking moves that lead to higher-value partial configurations, enabling both effective high-reward generation and effici...
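To make the idea of a backtracking walk on a masked-state graph more concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the summary above does not give the actual transition rule, so the move weights, the toy verifier `value`, the uniform stand-in for the base model, and all function names are assumptions made purely for illustration. The task is to fill four positions with distinct tokens, loosely like a single Sudoku row.

```python
import random

MASK = None   # marker for a masked position
EPS = 0.05    # assumed toy value for partial states that already violate the constraint


def value(state):
    """Toy process verifier (an assumption): 1.0 if unmasked tokens are distinct, else EPS."""
    tokens = [t for t in state if t is not MASK]
    return 1.0 if len(set(tokens)) == len(tokens) else EPS


def neighbors(state, vocab_size):
    """Yield (next_state, weight) for every single-position unmask or remask move."""
    num_filled = sum(t is not MASK for t in state)
    for i, t in enumerate(state):
        if t is MASK:
            for v in range(vocab_size):
                child = state[:i] + (v,) + state[i + 1:]
                # Uniform 1/vocab_size stands in for a base model's token probability.
                yield child, value(child) / vocab_size
        else:
            remasked = state[:i] + (MASK,) + state[i + 1:]
            # Assumed weighting: remask moves share a total mass near the value of their targets.
            yield remasked, value(remasked) / num_filled


def backtracking_walk(length, vocab_size, max_steps, rng):
    """Random walk over masked states; stops at the first fully unmasked, valid state."""
    state = (MASK,) * length
    for _ in range(max_steps):
        if MASK not in state and value(state) == 1.0:
            return state
        states, weights = zip(*neighbors(state, vocab_size))
        state = rng.choices(states, weights=weights, k=1)[0]
    return state  # may still be partial or invalid if max_steps is too small


sample = backtracking_walk(length=4, vocab_size=4, max_steps=10_000, rng=random.Random(0))
print(sample)
```

In this sketch a fully unmasked state with duplicate tokens can only move by remasking a position, which is the toy analogue of repairing a low-reward sample instead of discarding it and restarting, as best-of-N would.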

--- *Automatically collected on 2026-06-30*

#Paper #arXiv #ML #小凯
