Paper Overview
Research area: LLM
Authors: Gabriele Cesa, Thomas Hehn, Aleix Torres-Camps, et al.
Published: 2026-05-28
arXiv: 2605.27570
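Abstract (translated from Chinese)
Parallel LLM test-time scaling techniques (e.g., best-of-N) require generating N>1 sequences conditioned on the same input prompt. These methods improve accuracy through batched generation, but traditionally each sequence is generated independently and cannot reuse the intermediate generations, computations, or observations of the other sequences. This paper proposes LaneRoPE, which lets the N>1 sequences coordinate and collaborate at generation time. LaneRoPE rests on two core ideas: (a) an inter-sequence attention mask that makes the sampling of the sequences depend on one another; and (b) a RoPE extension that injects positional information capturing the relative positions of tokens both within and across sequences. Evaluation on mathematical reasoning tasks shows that LaneRoPE induces collaboration among sequences and yields additional accuracy gains under a limited generation-length budget. Importantly, LaneRoPE requires only minimal changes to the underlying LLM architecture and adds negligible inference overhead, so it can be integrated quickly into existing LLM inference pipelines.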
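Idea (a) can be made concrete with a small sketch. The abstract does not specify the mask used in the paper, so everything below is an assumption chosen for illustration: the shared prompt is followed by tokens from each sequence ("lane") interleaved step by step, a token may attend to the prompt, to its own lane causally, and to other lanes only at strictly earlier steps (tokens sampled in the same step are not yet available to each other). The names `build_tokens`, `can_attend`, and `build_mask` are hypothetical.

```python
from typing import List, Optional, Tuple

# A token is (lane, position). Prompt tokens use lane=None.
# This layout is an assumption for illustration, not the paper's definition.
Token = Tuple[Optional[int], int]


def build_tokens(prompt_len: int, num_lanes: int, steps: int) -> List[Token]:
    """Flatten a shared prompt followed by step-interleaved lane tokens."""
    tokens: List[Token] = [(None, p) for p in range(prompt_len)]
    for t in range(steps):
        for lane in range(num_lanes):
            # All lanes share the same position index at a given step.
            tokens.append((lane, prompt_len + t))
    return tokens


def can_attend(query: Token, key: Token) -> bool:
    """Return True if `query` is allowed to attend to `key`."""
    q_lane, q_pos = query
    k_lane, k_pos = key
    if k_lane is None:
        # Prompt keys are visible causally to every token.
        return k_pos <= q_pos
    if q_lane is None:
        # Prompt tokens never see generated tokens.
        return False
    if q_lane == k_lane:
        # Ordinary causal attention inside one lane.
        return k_pos <= q_pos
    # Across lanes: only strictly earlier steps are visible (assumed rule).
    return k_pos < q_pos


def build_mask(tokens: List[Token]) -> List[List[bool]]:
    """Boolean mask where mask[i][j] means token i may attend to token j."""
    return [[can_attend(q, k) for k in tokens] for q in tokens]


if __name__ == "__main__":
    toks = build_tokens(prompt_len=2, num_lanes=2, steps=2)
    for row in build_mask(toks):
        print("".join("1" if allowed else "0" for allowed in row))
```

With independent sampling, the cross-lane branch would always return `False`; relaxing it is what makes each lane's next token depend on what the other lanes have already produced. Idea (b) is not sketched here because the abstract does not describe how the lane dimension enters the rotary embedding.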
Original Abstract
Parallel LLM test-time scaling techniques (e.g., best-of-N) require drawing N>1 sequences conditioned on the same input prompt. These methods boost accuracy while exploiting the computational efficiency of batching N generations. However, each sequence in the batch is traditionally generated independently and hence does not reuse intermediate generations, computations, or observations from other sequences. In this paper, we propose LaneRoPE to enable coordination and collaboration among N>1 sequences at generation time. LaneRoPE involves two key ideas: (a) an inter-sequence attention mask to make sampling of sequences dependent on one another; and (b) a RoPE extension that injects positional information that captures relative positions between tokens, both within and outside a particular sequence. We evaluate our approach on mathematical reasoning tasks and find promising results: LaneRoPE...
Automatically collected on 2026-05-29
#paper #arXiv #LLM #小凯