Paper overview
Research area: AI
Authors: Arthur Renard, Franck Gabriel, Valentin Hartmann, et al.
Published: 2026-05-28
arXiv: 2605.27701
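Summary
This paper presents Frost Training, a method for improving Monte Carlo-based policy optimization, aimed at a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The core idea is to exploit the gradient of the reward function in embedding space. This signal is already used in the GCG jailbreaking attack; the paper shows for the first time that it can also be used to boost model training. The method is validated using GRPO training on maximum-likelihood infilling. Frost Training improves the model's ability to generate high-scoring outputs, reaches higher maximum scores in the best-of-k setting, and does so faster.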
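To make the "gradient of the reward in embedding space" idea concrete, here is a minimal NumPy sketch of the GCG-style first-order scoring of token swaps. It is not the paper's implementation: the judge is a hypothetical toy model (mean-pooled embeddings fed to a linear softmax classifier, with the reward being the log-probability of a target class), and the names `reward`, `reward_grad`, `swap_scores`, `E`, and `W` are all assumptions made for illustration.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                      # shift for numerical stability
    return z - np.log(np.exp(z).sum())

def reward(X, W, target):
    """Toy judge: log-probability of `target` given mean-pooled embeddings X of shape (T, d)."""
    return log_softmax(W @ X.mean(axis=0))[target]

def reward_grad(X, W, target):
    """Analytic gradient of `reward` with respect to every row of X, shape (T, d)."""
    logits = W @ X.mean(axis=0)
    probs = np.exp(log_softmax(logits))
    onehot = np.zeros_like(probs)
    onehot[target] = 1.0
    g_pooled = W.T @ (onehot - probs)    # gradient w.r.t. the pooled vector, shape (d,)
    return np.tile(g_pooled / X.shape[0], (X.shape[0], 1))

def swap_scores(tokens, E, W, target):
    """First-order estimate of the reward change for every (position, token) swap."""
    X = E[tokens]
    G = reward_grad(X, W, target)
    # Entry [t, v] equals (E[v] - X[t]) . G[t]
    return G @ E.T - np.sum(G * X, axis=1, keepdims=True)

# Hypothetical toy setup: random embedding table and judge weights.
rng = np.random.default_rng(0)
V, d, C, T = 20, 8, 5, 6                 # vocab size, embedding dim, classes, sequence length
E = rng.normal(size=(V, d))
W = rng.normal(size=(C, d))
tokens = rng.integers(0, V, size=T)
target = 2

scores = swap_scores(tokens, E, W, target)
pos, tok = np.unravel_index(np.argmax(scores), scores.shape)
new_tokens = tokens.copy()
new_tokens[pos] = tok

predicted_gain = scores[pos, tok]
actual_gain = reward(E[new_tokens], W, target) - reward(E[tokens], W, target)
print(f"swap position {pos} -> token {tok}: predicted {predicted_gain:.4f}, actual {actual_gain:.4f}")
```

The first-order score is only an estimate. For this particular toy judge the reward is concave in the embeddings, so the estimate can overstate the true gain but never understate it; GCG handles this kind of mismatch by re-evaluating a shortlist of candidate swaps with the true objective. How Frost Training feeds such a signal into policy optimization is not specified in the abstract.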
Original abstract
We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The key idea is to exploit the gradient of the reward function in embedding space. This signal is used in the Greedy Coordinate Gradient (GCG) jailbreaking technique; we demonstrate for the first time that it can also be used to boost model training. We validate our method using GRPO training for maximum-likelihood infilling. Frost Training improves the model's ability to generate high-scoring outputs, reaching higher maximum scores in a best-of-k setting, and does so at an increased speed.
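The abstract refers to GRPO training and to evaluation in a best-of-k setting. As a rough illustration of those two standard notions (not of Frost Training itself), the sketch below computes group-relative advantages as commonly defined for GRPO, normalizing each sampled output's reward by the mean and standard deviation of its group, and the best-of-k score as the maximum reward among the first k samples. The reward values are hypothetical, and the use of the population standard deviation and the `eps` constant are assumptions; implementations differ on these details.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: reward minus group mean, divided by group std."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)     # population std; an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]

def best_of_k(rewards, k):
    """Best-of-k score: the highest reward among the first k sampled outputs."""
    return max(rewards[:k])

# Hypothetical judge scores (e.g. log-likelihoods) for four sampled outputs.
rewards = [-3.2, -1.5, -2.7, -0.9]

advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward {r:+.2f} -> advantage {a:+.3f}")

for k in (1, 2, 4):
    print(f"best-of-{k}: {best_of_k(rewards, k)}")
```

Outputs that score above the group mean receive a positive advantage and are reinforced; the best-of-k score can only stay the same or increase as k grows.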
Automatically collected on 2026-05-29
#paper #arXiv #AI #小凯