<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Verbalized Sampling: How to Prompt LLM Introspection and Unlock Diversity</title>
<link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet">
<link href="https://fonts.googleapis.com/css2?family=Noto+Sans+SC:wght@400;500;700;900&display=swap" rel="stylesheet">
<style>
:root {
--primary: #3a1c71;
--secondary: #d76d77;
--tertiary: #ffaf7b;
--accent: #4b0082;
--light: #f8f9fa;
--dark: #212529;
--gradient: linear-gradient(135deg, var(--primary) 0%, var(--secondary) 50%, var(--tertiary) 100%);
}
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Noto Sans SC', sans-serif;
background-color: #f0f2f5;
color: var(--dark);
line-height: 1.6;
}
.poster-container {
width: 720px;
min-height: 960px;
margin: 0 auto;
background: linear-gradient(145deg, #f5f7fa 0%, #e4e8f0 100%);
padding: 40px;
position: relative;
overflow: hidden;
}
.bg-pattern {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
background-image:
radial-gradient(circle at 10% 20%, rgba(58, 28, 113, 0.05) 0%, transparent 20%),
radial-gradient(circle at 90% 30%, rgba(215, 109, 119, 0.07) 0%, transparent 25%),
radial-gradient(circle at 50% 70%, rgba(255, 175, 123, 0.06) 0%, transparent 30%),
linear-gradient(45deg, rgba(75, 0, 130, 0.03) 0%, transparent 70%);
z-index: 0;
}
.bg-grid {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
background-size: 20px 20px;
background-image:
linear-gradient(to right, rgba(58, 28, 113, 0.03) 1px, transparent 1px),
linear-gradient(to bottom, rgba(58, 28, 113, 0.03) 1px, transparent 1px);
z-index: 0;
}
.content {
position: relative;
z-index: 1;
}
.header {
text-align: center;
margin-bottom: 30px;
}
.title {
font-size: 48px;
font-weight: 900;
color: var(--primary);
margin-bottom: 10px;
line-height: 1.2;
background: var(--gradient);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
text-shadow: 2px 2px 4px rgba(0,0,0,0.1);
}
.subtitle {
font-size: 22px;
color: var(--secondary);
font-weight: 500;
}
.section {
background: rgba(255, 255, 255, 0.85);
border-radius: 16px;
padding: 25px;
margin-bottom: 25px;
box-shadow: 0 8px 20px rgba(0,0,0,0.08);
backdrop-filter: blur(10px);
border: 1px solid rgba(255,255,255,0.2);
}
.section-title {
font-size: 28px;
font-weight: 700;
color: var(--primary);
margin-bottom: 15px;
display: flex;
align-items: center;
}
.section-title .material-icons {
margin-right: 10px;
color: var(--secondary);
}
.highlight {
background: linear-gradient(transparent 60%, rgba(215, 109, 119, 0.3) 40%);
padding: 0 3px;
font-weight: 500;
}
.card {
background: rgba(255, 255, 255, 0.9);
border-radius: 12px;
padding: 20px;
margin: 15px 0;
border-left: 4px solid var(--secondary);
}
.vs-mechanism {
display: flex;
justify-content: space-between;
margin: 20px 0;
}
.step {
flex: 1;
background: white;
border-radius: 12px;
padding: 15px;
margin: 0 5px;
box-shadow: 0 4px 10px rgba(0,0,0,0.05);
position: relative;
}
.step-number {
position: absolute;
top: -10px;
left: -10px;
width: 30px;
height: 30px;
background: var(--secondary);
color: white;
border-radius: 50%;
display: flex;
align-items: center;
justify-content: center;
font-weight: bold;
}
.step-title {
font-weight: 700;
color: var(--primary);
margin-bottom: 10px;
font-size: 18px;
}
.code-block {
background: #f5f5f5;
border-radius: 8px;
padding: 15px;
margin: 15px 0;
font-family: monospace;
font-size: 14px;
overflow-x: auto;
border-left: 3px solid var(--accent);
}
.results {
display: flex;
justify-content: space-around;
margin: 20px 0;
}
.result-item {
text-align: center;
flex: 1;
}
.result-number {
font-size: 36px;
font-weight: 900;
color: var(--secondary);
margin-bottom: 5px;
}
.result-label {
font-size: 16px;
color: var(--dark);
}
.comparison {
display: flex;
margin: 20px 0;
}
.comparison-item {
flex: 1;
padding: 15px;
background: white;
border-radius: 12px;
margin: 0 5px;
}
.comparison-title {
font-weight: 700;
color: var(--primary);
margin-bottom: 10px;
text-align: center;
}
.conclusion {
background: var(--gradient);
color: white;
border-radius: 16px;
padding: 25px;
margin-top: 30px;
}
.conclusion-title {
font-size: 24px;
font-weight: 700;
margin-bottom: 15px;
}
.hand-drawn {
position: relative;
}
.hand-drawn::after {
content: '';
position: absolute;
bottom: -5px;
left: 0;
width: 100%;
height: 3px;
background: var(--tertiary);
border-radius: 50%;
transform: rotate(-1deg);
}
</style>
</head>
<body>
<div class="poster-container">
<div class="bg-pattern"></div>
<div class="bg-grid"></div>
<div class="content">
<header class="header">
<h1 class="title">Verbalized Sampling: How to Prompt LLM Introspection and Unlock Diversity</h1>
<p class="subtitle">VERBALIZED SAMPLING: HOW TO MITIGATE MODE COLLAPSE AND UNLOCK LLM DIVERSITY</p>
</header>
<section class="section">
<h2 class="section-title">
<i class="material-icons">warning</i>
Background: The Mode Collapse Problem
</h2>
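<p>Over the past two years, nearly every aligned large language model, from GPT-4 to Claude to DeepSeek, has shown similar symptoms: <span class="highlight">responses grow more alike, tone grows more uniform, and creativity thins out</span>. However large the model and however careful the training, outputs seem to be pushed toward an "average answer".</p>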
<div class="card">
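<p>The research finds that this is not algorithmic degradation but a systematic narrowing that is common in post-training: the more a model is aligned for safety, the more homogeneous its output becomes. The root cause of this mode collapse is <span class="highlight">typicality bias</span> in preference data: annotators tend to choose answers that read as more familiar and natural, rather than judging on factuality or logical correctness alone.</p>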
</div>
</section>
<section class="section">
<h2 class="section-title">
<i class="material-icons">psychology</i>
Verbalized Sampling: Prompting the Model to Introspect
</h2>
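<p>Verbalized Sampling (VS) is a training-free prompting strategy that mitigates mode collapse and improves diversity by asking the model to state its output distribution. The core idea is to have the model verbalize its internal probability distribution in words, rather than drawing samples directly from hidden logits.</p>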
<h3 class="section-title" style="font-size: 22px;">
<i class="material-icons">settings</i>
How It Works
</h3>
<div class="vs-mechanism">
<div class="step">
<div class="step-number">1</div>
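<div class="step-title">State the Distribution Explicitly</div>
<p>A simple prompt asks the model to generate N candidate responses and to attach an explicit probability to each.</p>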
</div>
<div class="step">
<div class="step-number">2</div>
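<div class="step-title">Verbalized Calibration</div>
<p>While generating, the model performs a kind of "verbalized calibration": it has to judge both "which answers are possible" and "how confident am I in each of them".</p>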
</div>
<div class="step">
<div class="step-number">3</div>
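<div class="step-title">Sample from the Self-Reported Distribution</div>
<p>These verbalized probabilities are the model's own estimates; they are then normalized into a set of usable sampling weights.</p>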
</div>
</div>
<div class="code-block">
Generate 5 possible responses, and give the probability you would assign to each.
</div>
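<p>As a minimal sketch of step 3, the snippet below normalizes a set of verbalized probabilities into sampling weights and draws one response. The candidate texts and numbers are made-up placeholders, and <code>normalize</code> and <code>sample_response</code> are hypothetical helper names used only for illustration; they are not part of any VS library.</p>

```python
import random

def normalize(probs):
    """Rescale self-reported probabilities so they sum to 1."""
    total = sum(probs)
    if total == 0:
        raise ValueError("probabilities must not all be zero")
    return [p / total for p in probs]

def sample_response(candidates, seed=None):
    """Draw one candidate text, weighted by its normalized probability."""
    weights = normalize([c["probability"] for c in candidates])
    rng = random.Random(seed)
    texts = [c["text"] for c in candidates]
    return rng.choices(texts, weights=weights, k=1)[0]

# Placeholder candidates; a real run would parse these from the model's reply.
candidates = [
    {"text": "response A", "probability": 0.40},
    {"text": "response B", "probability": 0.30},
    {"text": "response C", "probability": 0.10},
]
weights = normalize([c["probability"] for c in candidates])
picked = sample_response(candidates, seed=0)
```

<p>Normalizing matters because the probabilities a model writes out need not sum to 1, as in the placeholder values here.</p>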
<h3 class="section-title" style="font-size: 22px;">
<i class="material-icons">lightbulb</i>
How VS Prompts Model Introspection
</h3>
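<p>Conventional sampling draws at random from the model's internal logit distribution. The higher the temperature T, the flatter the distribution and the greater the diversity; the lower T is, the more concentrated the output. Yet temperature tuning is only mathematical noise control. It does not change the model's "way of thinking": the model still cannot recognize where it is uncertain.</p>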
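<p>The effect of temperature on the distribution can be seen in a small softmax sketch. The logits are toy values rather than output from a real model, and <code>softmax_with_temperature</code> is a name chosen here for illustration.</p>

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if not temperature > 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # toy values, not taken from a real model
sharp = softmax_with_temperature(logits, temperature=0.5)  # low T: concentrated
flat = softmax_with_temperature(logits, temperature=2.0)   # high T: flatter
```

<p>The top token receives more probability mass in <code>sharp</code> than in <code>flat</code>, but in both cases the ranking of the tokens is unchanged: temperature reshapes the distribution without adding new candidates.</p>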
<div class="card">
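<p>The key to VS is having the model express this distribution in words, which changes how the model "thinks" about the task. The research reports that these verbalized probabilities correlate strongly with the model's internal confidence: when the model rates itself 70% sure, its actual accuracy tends to be close to 0.7.</p>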
</div>
</section>
<section class="section">
<h2 class="section-title">
<i class="material-icons">analytics</i>
Experimental Results
</h2>
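<p>In systematic evaluation, VS markedly increased output diversity on creative writing tasks, raised human evaluation scores, and recovered most of the pre-alignment diversity. None of these gains required any additional training.</p>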
<div class="results">
<div class="result-item">
<div class="result-number">1.6-2.1×</div>
<div class="result-label">Diversity gain</div>
</div>
<div class="result-item">
<div class="result-number">25.7%</div>
<div class="result-label">Gain in human evaluation scores</div>
</div>
<div class="result-item">
<div class="result-number">66.8%</div>
<div class="result-label">Pre-alignment diversity recovered</div>
</div>
</div>
</section>
<section class="section">
<h2 class="section-title">
<i class="material-icons">code</i>
Practical Use
</h2>
<h3 class="section-title" style="font-size: 22px;">
<i class="material-icons">chat</i>
Basic Usage
</h3>
<div class="code-block">
Generate 5 responses to the user query, each within a separate &lt;response&gt; tag. Each &lt;response&gt; must include a &lt;text&gt; and a numeric &lt;probability&gt;. Please sample at random from the tails of the distribution, such that the probability of each response is less than 0.10.
</div>
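<p>The prompt asks for probabilities below 0.10, but a model may not always comply. A minimal sketch of a client-side check follows; <code>keep_tail</code> is a hypothetical helper name, and the candidates are made-up placeholders standing in for responses already parsed from the model's reply.</p>

```python
def keep_tail(candidates, tau=0.10):
    """Keep only candidates whose verbalized probability is below tau."""
    return [c for c in candidates if tau > c["probability"]]

# Placeholder candidates, assumed to be parsed from the model's reply.
candidates = [
    {"text": "common answer", "probability": 0.45},
    {"text": "unusual answer", "probability": 0.08},
    {"text": "rare answer", "probability": 0.03},
]
tail = keep_tail(candidates, tau=0.10)
```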
<h3 class="section-title" style="font-size: 22px;">
<i class="material-icons">integration_instructions</i>
Code Example
</h3>
<div class="code-block">
from verbalized_sampling import verbalize<br><br>
# Generate distribution of responses<br>
dist = verbalize("Tell me a joke", k=5, tau=0.10, temperature=0.9)<br><br>
# Sample from the distribution<br>
joke = dist.sample(seed=42)<br>
print(joke.text)
</div>
</section>
<div class="conclusion">
<h2 class="conclusion-title">
<i class="material-icons">insights</i>
Conclusion
</h2>
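<p>Verbalized Sampling offers a pragmatic engineering fix. It is a reminder that improving what a model can do does not always require a larger network or costlier training; it can also come from asking smarter questions. With explicit verbalization, the model can strike a new balance between factual correctness and expressive diversity: it stays reliable while showing a wider range of thought.</p>
<p style="margin-top: 15px;">VS not only restores diversity but also improves the consistency of generation confidence. It invites us to rethink what an LLM's output actually represents: not just a single optimized answer, but also the model's expression of its own uncertainty.</p>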
</div>
</div>
</div>
</body>
</html>