Paper Overview
Research area: NLP Authors: Keya Hu, Linlu Qiu, Yiyang Lu Published: 2025-05-09 arXiv: 2505.07246
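Abstract (translated from Chinese)
Diffusion and flow-based models have become the de facto standard for generating continuous data such as images and videos. Their success has spurred growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) operate primarily on discrete tokens. This paper shows that continuous DLMs can be made effective with minimal adaptation to the discrete domain. The authors propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF stays predominantly in the continuous embedding space and maps to discrete tokens only at the final time step, using a shared-weight network. This formulation makes it straightforward to transfer established techniques from image-domain diffusion models, such as classifier-free guidance (CFG). Experiments show that ELF substantially outperforms leading discrete and continuous DLMs, achieving better generation quality with fewer sampling steps. These results suggest that ELF offers a promising path toward effective continuous DLMs.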
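To make the sampling idea concrete, here is a minimal toy sketch, not the authors' implementation. It integrates a velocity field in embedding space with Euler steps, mixes conditional and unconditional velocities in the usual CFG form, and discretizes only after the last step. Everything named here is an assumption for illustration: the embedding table `E`, the closed-form `toy_velocity` (a stand-in for a trained network), the mean-embedding "unconditional" target, and the nearest-embedding decoder (a stand-in for the paper's shared-weight network).

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 16, 8
# Hypothetical embedding table; a real model would learn this.
E = rng.normal(size=(VOCAB, DIM))


def toy_velocity(x, t, target):
    """Velocity that carries x to `target` at t = 1 along a straight path.

    Stand-in for a trained Flow Matching network (assumption).
    """
    return (target - x) / (1.0 - t)


def sample(cond_tokens, steps=8, guidance=1.0):
    """Euler-integrate from noise (t = 0) toward t = 1, then decode to tokens."""
    cond_tokens = np.asarray(cond_tokens)
    x = rng.normal(size=(len(cond_tokens), DIM))  # Gaussian noise in embedding space
    cond_target = E[cond_tokens]
    # Toy "unconditional" target: the mean embedding (assumption).
    uncond_target = np.broadcast_to(E.mean(axis=0), x.shape)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt  # t stays below 1, so 1 - t is never zero
        v_cond = toy_velocity(x, t, cond_target)
        v_uncond = toy_velocity(x, t, uncond_target)
        # Classifier-free guidance: extrapolate from unconditional toward conditional.
        v = v_uncond + guidance * (v_cond - v_uncond)
        x = x + dt * v
    # Discretize only at the end: pick the nearest embedding for each position.
    dist = ((x[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)


tokens = sample([3, 1, 4], steps=8, guidance=1.0)
```

With `guidance=1.0` the mixture reduces to the conditional velocity, so this toy setup recovers the conditioning tokens. Larger values push the trajectory further from the unconditional target, which is the knob CFG exposes in image-domain samplers.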
Original Abstract
Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulati...
--- *Automatically collected on 2026-05-13*
#Paper #arXiv #NLP #小凯