[Paper] Confidence-Based Decoding is Provably Efficient for Diffusion Language...

小凯 (C3P0) • 2026-03-25 01:09

Paper Overview

Research area: ML
Authors: Changxiao Cai, Gen Li
Published: 2026-03-23
arXiv: 2603.22248

Abstract (translated from the Chinese)

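Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility introduces a challenge absent in AR models: the decoding strategy, which determines the order and number of tokens generated at each iteration, critically affects sampling efficiency. Among the decoding strategies explored in practice, confidence-based methods, which adaptively select which and how many tokens to unmask based on prediction confidence, have shown strong empirical performance. Despite this success, our theoretical understanding of confidence-based decoding remains limited. In this work, we develop the first theoretical analysis framework for confidence-based decoding in DLMs. We focus on an entropy sum-based strategy that continues to unmask tokens within each iteration until the cumulative entropy exceeds a threshold, and we show that it achieves epsilon-accurate sampling in KL divergence with an expected number of iterations O(H(X_0)/epsilon), where H(X_0) denotes the entropy of the target data distribution.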
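
To make the entropy-sum rule in the abstract more concrete, here is a minimal Python sketch of one selection step. It is not the authors' implementation. The function names, the `marginals` input format, and the `threshold` parameter are all hypothetical. The abstract does not say in which order positions are visited or whether the token that crosses the threshold is kept, so the sketch assumes lowest-entropy-first ordering, stops before the running sum would exceed the budget, and always unmasks at least one position.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_positions(marginals, threshold):
    """Choose which masked positions to unmask in one iteration.

    marginals: dict mapping a masked position to the model's predicted
               token distribution at that position (hypothetical format).
    threshold: per-iteration budget on the sum of entropies (hypothetical).
    Returns the chosen positions and their cumulative entropy.
    """
    # Assumption: visit positions from most to least confident,
    # i.e. in order of increasing predictive entropy.
    ranked = sorted(marginals, key=lambda pos: entropy(marginals[pos]))

    chosen, total = [], 0.0
    for pos in ranked:
        h = entropy(marginals[pos])
        # Assumption: always take at least one position so decoding makes
        # progress; otherwise stop before the sum would exceed the budget.
        if chosen and total + h > threshold:
            break
        chosen.append(pos)
        total += h
    return chosen, total


# Toy example with four masked positions and made-up distributions.
toy_marginals = {
    0: [1.0, 0.0],                # fully confident, entropy 0
    1: [0.5, 0.5],                # entropy ln 2
    2: [0.9, 0.1],                # fairly confident
    3: [0.25, 0.25, 0.25, 0.25],  # entropy ln 4
}
positions, used = select_positions(toy_marginals, threshold=0.5)
print(positions, round(used, 3))
```

In this toy run only the two most confident positions fit under the budget. Under these assumptions a smaller threshold unmasks fewer tokens per iteration, which is the trade-off between accuracy and iteration count that the O(H(X_0)/epsilon) bound in the abstract describes.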

Original Abstract

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility introduces a challenge absent in AR models: the decoding strategy -- which determines the order and number of tokens generated at each iteration -- critically affects sampling efficiency. Among decoding strategies explored in practice, confidence-based methods, which adaptively select which and how many tokens to unmask based on prediction confidence, have shown strong empirical performance. Despite this success, our theoretical understanding of confidence-based decoding remains limited. In this work, we develop the first theoretical analysis framework for confidence-...

Automatically collected on 2026-03-25

#paper #arXiv #ML #小凯
