<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<style>
/* CALM Poster Namespace */
.calm-poster-container {
width: 760px;
min-height: 1200px;
background-color: #ffffff;
color: #333;
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, "Noto Sans", sans-serif;
line-height: 1.5;
overflow: hidden; /* Hide scrollbars */
box-sizing: border-box;
margin: 0 auto;
border: 1px solid #ddd;
}
.calm-poster-container * {
box-sizing: border-box;
}
/* Header Section */
.calm-header {
background: linear-gradient(135deg, #003366 0%, #0056b3 100%);
color: white;
padding: 40px 30px;
text-align: center;
}
.calm-title {
font-size: 42px;
font-weight: 800;
margin: 0 0 10px 0;
letter-spacing: 1px;
}
.calm-subtitle {
font-size: 20px;
font-weight: 300;
opacity: 0.9;
margin-bottom: 20px;
}
.calm-meta {
font-size: 14px;
background: rgba(255,255,255,0.1);
display: inline-block;
padding: 5px 15px;
border-radius: 20px;
}
/* Content Layout */
.calm-content {
padding: 30px;
display: grid;
grid-template-columns: 1fr;
gap: 30px;
}
/* Section Styling */
.calm-section {
background: #f8fbff;
border-left: 5px solid #0056b3;
padding: 20px;
border-radius: 4px;
box-shadow: 0 2px 5px rgba(0,0,0,0.05);
}
.calm-section-title {
font-size: 22px;
color: #003366;
margin-top: 0;
margin-bottom: 15px;
border-bottom: 1px solid #e0e0e0;
padding-bottom: 10px;
font-weight: 700;
display: flex;
align-items: center;
}
.calm-section-title::before {
content: '';
display: inline-block;
width: 10px;
height: 10px;
background: #0056b3;
margin-right: 10px;
border-radius: 50%;
}
.calm-text {
font-size: 15px;
text-align: justify;
margin-bottom: 15px;
}
/* Special Layouts */
.calm-grid-2 {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 20px;
}
/* Diagram Placeholders (CSS Shapes) */
.calm-diagram-container {
display: flex;
justify-content: center;
align-items: center;
margin: 20px 0;
background: #fff;
border: 1px dashed #ccc;
padding: 15px;
border-radius: 5px;
}
.calm-flow-block {
background: #e6f0ff;
border: 2px solid #0056b3;
color: #003366;
padding: 8px 12px;
text-align: center;
font-weight: bold;
font-size: 12px;
border-radius: 4px;
margin: 0 5px;
position: relative;
}
.calm-arrow {
font-size: 20px;
color: #666;
font-weight: bold;
}
/* Code/Markdown Block */
.calm-code-block {
background: #282c34;
color: #abb2bf;
padding: 15px;
border-radius: 5px;
font-family: 'Consolas', 'Monaco', 'Courier New', monospace;
font-size: 13px;
overflow-x: auto;
margin: 15px 0;
border-left: 4px solid #61afef;
}
.calm-code-comment { color: #5c6370; font-style: italic; }
.calm-code-keyword { color: #c678dd; }
.calm-code-func { color: #61afef; }
.calm-code-string { color: #98c379; }
/* Highlight Box */
.calm-highlight {
background-color: #e3f2fd;
border: 1px solid #90caf9;
padding: 10px;
border-radius: 4px;
margin: 10px 0;
}
.calm-highlight strong {
color: #1565c0;
}
/* Stats Section */
.calm-stats {
display: flex;
justify-content: space-around;
margin-top: 20px;
}
.calm-stat-item {
text-align: center;
}
.calm-stat-number {
font-size: 36px;
font-weight: 900;
color: #d32f2f; /* Red for negative reduction */
display: block;
}
.calm-stat-label {
font-size: 14px;
color: #555;
text-transform: uppercase;
}
/* Footer */
.calm-footer {
background: #f1f1f1;
padding: 20px 30px;
font-size: 12px;
color: #666;
text-align: center;
border-top: 1px solid #ddd;
margin-top: 30px;
}
/* Inline Math Style */
.calm-math {
font-family: "Times New Roman", Times, serif;
font-style: italic;
background: #f0f0f0;
padding: 0 4px;
border-radius: 3px;
}
</style>
</head>
<body>
<div class="calm-poster-container">
<!-- Header -->
<header class="calm-header">
<h1 class="calm-title">CALM: Continuous Autoregressive Language Models</h1>
<div class="calm-subtitle">Breaking the LLM efficiency bottleneck: a paradigm shift from discrete tokens to continuous vectors</div>
<div class="calm-meta">Paper: Continuous Autoregressive Language Models | WeChat AI &amp; Tsinghua University</div>
</header>
<div class="calm-content">
<!-- Section 1: The Bottleneck -->
<div class="calm-section">
<h2 class="calm-section-title">The core bottleneck: low semantic bandwidth</h2>
<p class="calm-text">
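Conventional large language models (LLMs) are limited by generating one token at a time. Model sizes have grown to the trillion-parameter scale, yet the basic unit of prediction, the discrete token, carries very little information (only 15-18 bits).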
</p>
<div class="calm-highlight">
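<strong>The problem:</strong> Raising the information per token means enlarging the vocabulary, and every extra bit doubles the vocabulary size, so the softmax cost grows exponentially. The result is a mismatch between a very capable model and the simple, low-yield task it performs at each step.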
</div>
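<p class="calm-text">
The 15-18 bit figure follows from the fact that one token drawn from a vocabulary of size V carries at most log<sub>2</sub> V bits. The sketch below works through that arithmetic. The function names and the two vocabulary sizes (32K and 256K entries) are illustrative assumptions, not values taken from the paper's code.
</p>
<pre class="calm-code-block">
```python
import math


def bits_per_token(vocab_size: int) -> float:
    """Upper bound on the information carried by one token."""
    return math.log2(vocab_size)


def vocab_size_for_bits(bits: int) -> int:
    """Vocabulary size needed for a single token to carry `bits` bits."""
    return 2 ** bits


# Assumed vocabulary sizes: 32K and 256K entries.
for vocab_size in (32_768, 262_144):
    print(vocab_size, bits_per_token(vocab_size))

# Packing four 15-bit tokens into one discrete symbol would need 2**60 entries,
# which is why a larger vocabulary is not a workable route to higher bandwidth.
print(vocab_size_for_bits(4 * 15))
```
</pre>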
<p class="calm-text">
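<strong>The idea:</strong> CALM introduces a new scaling axis, <span style="color:#0056b3; font-weight:bold;">semantic bandwidth</span>. Instead of predicting the next token, the model predicts a continuous vector that condenses the information of several tokens.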
</p>
</div>
<!-- Section 2: Architecture Overview -->
<div class="calm-section">
<h2 class="calm-section-title">Architecture: next-vector prediction</h2>
<p class="calm-text">
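CALM uses a high-fidelity autoencoder to compress each chunk of K tokens into one continuous vector z, models the sequence of vectors autoregressively, and decodes the predicted vectors back into text. This cuts the number of autoregressive generation steps by a factor of K.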
</p>
<div class="calm-diagram-container">
<div class="calm-flow-block">Tokens x<sub>1:K</sub></div>
<span class="calm-arrow">→</span>
<div class="calm-flow-block" style="background:#fff3cd; border-color:#ffc107;">Encoder</div>
<span class="calm-arrow">→</span>
<div class="calm-flow-block" style="background:#d1ecf1; border-color:#17a2b8; width: 80px;">Vector z</div>
<span class="calm-arrow">→</span>
<div class="calm-flow-block" style="background:#d4edda; border-color:#28a745;">Transformer</div>
<span class="calm-arrow">→</span>
<div class="calm-flow-block">Next Vector z'</div>
</div>
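<p class="calm-text" style="font-size:13px;">
To make the factor-of-K reduction concrete, the sketch below groups a token sequence into chunks of K and counts the resulting steps. <code>chunk_tokens</code> is a hypothetical helper for illustration only; it drops an incomplete final chunk, whereas a real pipeline would more likely pad it.
</p>
<pre class="calm-code-block">
```python
def chunk_tokens(token_ids, k):
    """Split a token sequence into consecutive groups of k tokens.

    An incomplete final group is dropped in this sketch.
    """
    usable = len(token_ids) - len(token_ids) % k
    return [token_ids[i:i + k] for i in range(0, usable, k)]


tokens = list(range(12))          # stand-in for 12 token ids
chunks = chunk_tokens(tokens, k=4)

# Each chunk would be encoded into one vector z, so the autoregressive
# model takes len(chunks) steps instead of len(tokens).
print(chunks)
print(len(tokens), "token steps become", len(chunks), "vector steps")
```
</pre>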
<div class="calm-grid-2">
<div>
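<h4 style="margin:0 0 10px 0; color:#0056b3;">1. Autoencoder</h4>
<p class="calm-text" style="font-size:13px;">Maps between tokens and vectors in both directions. Accurate reconstruction (over 99.9%) is not enough; the mapping must also be <strong>robust</strong>, so that a small perturbation of the vector does not turn the reconstruction into unrelated text.</p>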
</div>
<div>
<h4 style="margin:0 0 10px 0; color:#0056b3;">2. Generative model</h4>
<p class="calm-text" style="font-size:13px;">Predicts in the continuous vector space. Without a finite vocabulary there is nothing to apply a softmax over, so the model must be trained with a <strong>likelihood-free</strong> method.</p>
</div>
</div>
</div>
<!-- Section 3: Robust Autoencoder -->
<div class="calm-section">
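<h2 class="calm-section-title">Building a robust vector space</h2>
<p class="calm-text">A plain autoencoder is too brittle. CALM uses a variational autoencoder (VAE) together with several regularization techniques to smooth the latent manifold:</p>
<ul class="calm-text">
<li><strong>Variational regularization:</strong> the encoder outputs a Gaussian distribution, and a KL-divergence loss keeps the latent space smooth.</li>
<li><strong>KL clipping:</strong> a floor on the KL loss prevents posterior collapse, so that every dimension encodes useful information.</li>
<li><strong>Dropout:</strong> random dropout on the input tokens and on the latent vector forces the model to learn a redundant representation and improves its tolerance to noise.</li>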
</ul>
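<p class="calm-text">
One way to read the KL clipping item is as a per-dimension floor on the KL term of a diagonal-Gaussian VAE. The sketch below assumes a standard-normal prior and applies the floor to each dimension before summing. The function name and the floor value of 0.5 are assumptions made for illustration; the paper's exact formulation may differ.
</p>
<pre class="calm-code-block">
```python
import math


def clipped_kl(mu, log_var, floor=0.5):
    """Sum over dimensions of max(KL_i, floor) for a diagonal Gaussian.

    KL_i is KL(N(mu_i, sigma_i^2) || N(0, 1)). A dimension whose KL is
    already below the floor contributes a constant, so the optimizer gains
    nothing by pushing it further toward the prior.
    """
    total = 0.0
    for m, lv in zip(mu, log_var):
        kl_dim = 0.5 * (m * m + math.exp(lv) - 1.0 - lv)
        total += max(kl_dim, floor)
    return total


# Dimension 0 matches the prior (KL = 0, raised to the floor of 0.5);
# dimension 1 has KL = 2.0 and is left unchanged.
print(clipped_kl([0.0, 2.0], [0.0, 0.0]))  # 2.5
```
</pre>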
</div>
<!-- Section 4: Likelihood-Free Framework -->
<div class="calm-section">
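<h2 class="calm-section-title">Likelihood-free modeling and evaluation toolkit</h2>
<p class="calm-text">In the continuous domain the model only produces samples, and the probability density of its output cannot be evaluated. CALM develops a new set of tools for this setting:</p>
<h4 style="color:#0056b3;">1. Energy Score: training objective</h4>
<p class="calm-text">Replaces cross-entropy and judges the predicted distribution from distances between samples. It has two competing terms: diversity and fidelity.</p>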
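<div class="calm-code-block" style="white-space: pre;"><span class="calm-code-comment"># Energy Score sketch (Python with NumPy); a higher score is better</span>
<span class="calm-code-keyword">import</span> numpy <span class="calm-code-keyword">as</span> np

<span class="calm-code-keyword">def</span> <span class="calm-code-func">energy_score</span>(samples, ground_truth):
    <span class="calm-code-comment"># samples: (N, d) array of vectors drawn from the model, N at least 2</span>
    <span class="calm-code-comment"># ground_truth: (d,) array, the true target vector</span>
    n = len(samples)
    pairwise = np.linalg.norm(samples[:, None] - samples[None, :], axis=-1)
    diversity = pairwise.sum() / (n * (n - 1))  <span class="calm-code-comment"># mean distance between distinct samples; rewards spread</span>
    fidelity = np.linalg.norm(samples - ground_truth, axis=-1).mean()  <span class="calm-code-comment"># mean distance to the target; smaller is better</span>
    <span class="calm-code-keyword">return</span> diversity - 2 * fidelity</div>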
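<h4 style="color:#0056b3;">2. BrierLM: evaluation metric</h4>
<p class="calm-text">Built on the classic Brier score and estimated without bias from sample collisions, BrierLM replaces perplexity and allows a fair comparison of generation quality.</p>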
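<p class="calm-text">
The collision idea can be sketched on a toy discrete sampler. Assuming the Brier score is written as 2P(y) minus the sum of P(x) squared (higher is better), two independent samples give an unbiased estimate from equality checks alone. The toy probabilities and the names <code>brier_estimate</code> and <code>sample</code> are hypothetical, and the sketch does not include the n-gram averaging a full BrierLM metric would need.
</p>
<pre class="calm-code-block">
```python
import random


def brier_estimate(sample, target):
    """One unbiased draw of 2*P(target) - sum_x P(x)**2 from two samples.

    `sample` is any zero-argument callable returning one outcome; no
    probabilities are ever read from the model.
    """
    x1, x2 = sample(), sample()
    return (x1 == target) + (x2 == target) - (x1 == x2)


# Toy black-box sampler over three outcomes (hypothetical probabilities).
probs = {"a": 0.6, "b": 0.3, "c": 0.1}
rng = random.Random(0)


def sample():
    return rng.choices(list(probs), weights=list(probs.values()))[0]


n = 200_000
estimate = sum(brier_estimate(sample, "a") for _ in range(n)) / n
exact = 2 * probs["a"] - sum(p * p for p in probs.values())
print("estimate:", estimate, "exact:", exact)
```
</pre>
<p class="calm-text">
The two indicator terms estimate P(y) twice, and the collision term estimates the sum of squared probabilities, so the average converges to the exact value as the number of draws grows.
</p>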
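<h4 style="color:#0056b3;">3. Likelihood-free temperature sampling</h4>
<p class="calm-text">A rejection-sampling algorithm reproduces the effect of changing the temperature using only a black-box sampler, which makes generation controllable.</p>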
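<p class="calm-text">
As a minimal sketch of the rejection idea, consider only the special case T = 1/n for a positive integer n: draw n samples and accept when they all agree, which happens with probability P(x) to the power n for each outcome x. The toy sampler and the names below are assumptions for illustration; other temperatures are outside the scope of this sketch.
</p>
<pre class="calm-code-block">
```python
import random
from collections import Counter


def sample_with_temperature(sample, n, max_tries=100_000):
    """Draw from a distribution proportional to P(x)**n, i.e. T = 1/n.

    Only calls the black-box `sample`; accepts a batch of n draws when
    every draw in the batch is identical.
    """
    for _ in range(max_tries):
        draws = [sample() for _ in range(n)]
        if all(d == draws[0] for d in draws):
            return draws[0]
    raise RuntimeError("no accepted sample; acceptance rate is too low")


# Toy black-box sampler (hypothetical probabilities).
probs = {"a": 0.6, "b": 0.3, "c": 0.1}
rng = random.Random(0)


def sample():
    return rng.choices(list(probs), weights=list(probs.values()))[0]


trials = 20_000
counts = Counter(sample_with_temperature(sample, n=2) for _ in range(trials))
freq_a = counts["a"] / trials
target_a = 0.6 ** 2 / sum(p ** 2 for p in probs.values())
print("observed:", freq_a, "target:", target_a)
```
</pre>
<p class="calm-text">
Lowering the temperature this way sharpens the distribution toward the most likely outcome, at the price of more rejected batches as n grows.
</p>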
</div>
<!-- Section 5: Efficiency Breakthrough -->
<div class="calm-section" style="background: #e8f5e9; border-left-color: #2e7d32;">
<h2 class="calm-section-title" style="color:#1b5e20;">Efficiency: substantially less compute</h2>
<p class="calm-text">Experiments show that CALM matches or exceeds the performance of a standard Transformer while using far less compute.</p>
<div class="calm-stats">
<div class="calm-stat-item">
<span class="calm-stat-number">-44%</span>
<span class="calm-stat-label">Training FLOPs</span>
</div>
<div class="calm-stat-item">
<span class="calm-stat-number">-34%</span>
<span class="calm-stat-label">Inference FLOPs</span>
</div>
</div>
<p class="calm-text" style="margin-top:15px; font-size:13px; text-align:center;">
*Comparison at equal or better performance (Transformer-S vs CALM-L)
</p>
</div>
<!-- Section 6: Design Philosophy -->
<div class="calm-section">
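<h2 class="calm-section-title">Design philosophy and outlook</h2>
<p class="calm-text">
CALM's results support semantic bandwidth as a viable new scaling axis for LLMs. It is more than an engineering optimization; it is a change of paradigm:
</p>
<ol class="calm-text">
<li><strong>Semantic-bandwidth scaling law:</strong> models can become more efficient not only by adding parameters but also by increasing K, the number of tokens packed into each vector.</li>
<li><strong>The future is continuous:</strong> a continuous representation can carry more information than a discrete ID, which makes it a key path toward highly efficient models.</li>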
</ol>
</div>
</div>
<!-- Footer -->
<footer class="calm-footer">
<p>Based on the paper "Continuous Autoregressive Language Models" by Shao et al. (2025).</p>
<p>Generated for educational purposes. Source: arXiv:2510.27688</p>
</footer>
</div>
</body>
</html>