RWKV-7 "Goose": RWKV Model Performance Summary as of Early 2026

✨步子哥 (steper) 2026-02-12 14:06
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>RWKV Model Performance Summary</title> <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;600;700&family=JetBrains+Mono:wght@400;700&display=swap" rel="stylesheet"> <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet"> <style> :root { --bg-color: #0f1115; --card-bg: #1a1d24; --primary: #00e5ff; /* Cyan */ --secondary: #ff9100; /* Orange */ --text-main: #ffffff; --text-muted: #b0b3b8; --accent-green: #00e676; --border-radius: 12px; } * { box-sizing: border-box; margin: 0; padding: 0; } body { background-color: var(--bg-color); font-family: 'Inter', sans-serif; color: var(--text-main); display: flex; justify-content: center; align-items: center; min-height: 100vh; } .poster-container { width: 720px; min-height: 1280px; background: linear-gradient(135deg, #0f1115 0%, #1a1f2e 100%); padding: 40px; display: flex; flex-direction: column; gap: 24px; position: relative; overflow: hidden; } /* Background Decoration */ .bg-decoration { position: absolute; top: -100px; right: -100px; width: 400px; height: 400px; background: radial-gradient(circle, rgba(0, 229, 255, 0.1) 0%, rgba(0, 0, 0, 0) 70%); border-radius: 50%; z-index: 0; } .bg-decoration-2 { position: absolute; bottom: 0; left: 0; width: 100%; height: 300px; background: linear-gradient(to top, rgba(255, 145, 0, 0.05), transparent); z-index: 0; } header { z-index: 1; border-bottom: 2px solid rgba(255, 255, 255, 0.1); padding-bottom: 20px; } h1 { font-family: 'JetBrains Mono', monospace; font-size: 48px; font-weight: 700; color: var(--primary); line-height: 1.1; margin-bottom: 8px; text-shadow: 0 0 10px rgba(0, 229, 255, 0.3); } .subtitle { font-size: 18px; color: var(--text-muted); display: flex; align-items: center; gap: 8px; } .tag { background: rgba(0, 229, 255, 0.15); color: var(--primary); padding: 4px 8px; border-radius: 4px; font-size: 
12px; font-weight: 600; font-family: 'JetBrains Mono', monospace; } .card { background: var(--card-bg); border-radius: var(--border-radius); padding: 20px; border: 1px solid rgba(255, 255, 255, 0.05); box-shadow: 0 4px 20px rgba(0, 0, 0, 0.2); z-index: 1; position: relative; } .section-title { font-size: 20px; font-weight: 700; margin-bottom: 16px; display: flex; align-items: center; gap: 8px; color: var(--text-main); } .section-title i { color: var(--secondary); } /* Core Advantages */ .advantages-grid { display: grid; grid-template-columns: repeat(2, 1fr); gap: 12px; } .adv-item { background: rgba(255, 255, 255, 0.03); padding: 12px; border-radius: 8px; border-left: 3px solid var(--primary); } .adv-item h3 { font-size: 14px; color: var(--primary); margin-bottom: 4px; font-weight: 600; } .adv-item p { font-size: 12px; color: var(--text-muted); line-height: 1.4; } /* Benchmark Table */ .table-container { width: 100%; overflow: hidden; } table { width: 100%; border-collapse: collapse; font-size: 12px; } th { text-align: left; color: var(--text-muted); padding: 8px 6px; border-bottom: 1px solid rgba(255, 255, 255, 0.1); font-weight: 600; white-space: nowrap; } td { padding: 8px 6px; border-bottom: 1px solid rgba(255, 255, 255, 0.05); font-family: 'JetBrains Mono', monospace; color: var(--text-main); white-space: nowrap; } tr:last-child td { border-bottom: none; } .highlight-text { color: var(--accent-green); font-weight: bold; } .model-name { color: var(--secondary); font-weight: 700; } /* Inference Stats */ .stats-grid { display: grid; grid-template-columns: repeat(2, 1fr); gap: 16px; } .stat-box { background: rgba(0, 0, 0, 0.2); padding: 16px; border-radius: 8px; text-align: center; } .stat-value { font-size: 28px; font-weight: 700; color: var(--primary); font-family: 'JetBrains Mono', monospace; margin-bottom: 4px; } .stat-label { font-size: 12px; color: var(--text-muted); text-transform: uppercase; letter-spacing: 1px; } .stat-sub { font-size: 11px; color: 
var(--text-muted); margin-top: 4px; } .comparison-badge { display: inline-block; background: rgba(255, 145, 0, 0.15); color: var(--secondary); padding: 2px 6px; border-radius: 4px; font-size: 10px; margin-top: 4px; } /* Usage Section */ .tips-list { display: flex; flex-direction: column; gap: 12px; } .tip-item { display: flex; align-items: flex-start; gap: 10px; font-size: 13px; color: var(--text-muted); line-height: 1.5; } .tip-item i { color: var(--primary); font-size: 16px; margin-top: 2px; } .tip-strong { color: var(--text-main); font-weight: 600; } footer { margin-top: auto; text-align: center; padding-top: 16px; border-top: 1px solid rgba(255, 255, 255, 0.05); z-index: 1; } .links-grid { display: flex; justify-content: center; gap: 24px; flex-wrap: wrap; } .link-item { font-size: 12px; color: var(--text-muted); display: flex; align-items: center; gap: 4px; } .link-item i { font-size: 14px; color: var(--secondary); } </style> </head> <body> <div class="poster-container"> <div class="bg-decoration"></div> <div class="bg-decoration-2"></div> <header> <div class="subtitle"> <span class="tag">RWKV-7 "Goose"</span> <span>As of early 2026</span> </div> <h1>RWKV Model Performance Summary</h1> <div style="margin-top: 8px; font-size: 14px; color: rgba(255,255,255,0.7);"> Pure RNN architecture · No attention mechanism · Linear-time inference </div> </header> <!-- Core Advantages --> <div class="card"> <div class="section-title"> <i class="material-icons">bolt</i> Core Advantages </div> <div class="advantages-grid"> <div class="adv-item"> <h3>Linear-Time Inference</h3> <p>Per-token cost is constant, so total inference time grows linearly with sequence length; no quadratic bottleneck.</p> </div> <div class="adv-item"> <h3>Constant Memory</h3> <p>No KV cache: the fixed-size state keeps VRAM use low, and memory imposes no hard limit on context length.</p> </div> <div class="adv-item"> <h3>Parallelizable Training</h3> <p>Trains as efficiently as a Transformer, without the serial bottleneck of classic RNNs.</p> </div> <div class="adv-item"> <h3>Highly Efficient</h3> <p>Runs in real time on phones and integrated GPUs, with a large power-consumption advantage.</p> </div> </div> </div> <!-- Benchmark --> <div class="card"> <div class="section-title"> <i class="material-icons">leaderboard</i> RWKV-7 Benchmark Results </div> <div class="table-container"> <table> <thead> <tr> <th>Model Size</th> <th>MMLU</th> 
<th>GSM8K</th> <th>MATH</th> <th>IFEval</th> <th style="color: var(--accent-green)">Uncheatable</th> </tr> </thead> <tbody> <tr> <td class="model-name">13.3B (G0b)</td> <td>76.5%</td> <td>92.3%</td> <td>76.8%</td> <td>68.9%</td> <td class="highlight-text">6.843 (Best)</td> </tr> <tr> <td class="model-name">7.2B (G0a3)</td> <td>65.1%</td> <td>83.9%</td> <td>67.8%</td> <td>58.0%</td> <td>7.222</td> </tr> <tr> <td class="model-name">2.9B (G1a4)</td> <td>61.3%</td> <td>77.3%</td> <td>48.2%</td> <td>51.0%</td> <td>7.486</td> </tr> <tr> <td class="model-name">1.5B (G1b)</td> <td>50.5%</td> <td>58.5%</td> <td>29.8%</td> <td>42.1%</td> <td>7.969</td> </tr> </tbody> </table> </div> <div style="margin-top: 10px; font-size: 11px; color: var(--text-muted);"> * Uncheatable Eval: lower is better; 13.3B outperforms Qwen3-14B. </div> </div> <!-- Inference Performance --> <div class="card"> <div class="section-title"> <i class="material-icons">speed</i> Measured Inference Performance (RWKV-7 2.9B) </div> <div class="stats-grid"> <div class="stat-box"> <div class="stat-value">115<span style="font-size:14px"> t/s</span></div> <div class="stat-label">RTX 4090</div> <div class="stat-sub">nf4 quantization · 2.4 GB VRAM</div> </div> <div class="stat-box"> <div class="stat-value">86<span style="font-size:14px"> t/s</span></div> <div class="stat-label">RTX 4060 Laptop</div> <div class="stat-sub">nf4 quantization · 2.4 GB VRAM</div> </div> <div class="stat-box"> <div class="stat-value">30<span style="font-size:14px">+ t/s</span></div> <div class="stat-label">Phone · S8 Gen 3</div> <div class="stat-sub">A16W4 quantization · usable at the edge</div> </div> <div class="stat-box"> <div class="stat-value">6.5<span style="font-size:14px"> t/s</span></div> <div class="stat-label">RK3588 NPU</div> <div class="stat-sub">W8A8 · embedded devices</div> </div> </div> <div style="margin-top: 12px; text-align: center;"> <span class="comparison-badge">vs. Transformer: 3-10x faster · only 1/3 of the VRAM</span> </div> </div> <!-- Usage Tips --> <div class="card" style="flex: 1;"> <div class="section-title"> <i 
class="material-icons">lightbulb</i> Usage Tips and Resources </div> <div class="tips-list"> <div class="tip-item"> <i class="material-icons">stars</i> <div> <span class="tip-strong">For maximum performance:</span> Choose RWKV-7 13.3B / 7.2B, which approach or exceed mainstream Transformers. </div> </div> <div class="tip-item"> <i class="material-icons">smartphone</i> <div> <span class="tip-strong">Phone/laptop deployment:</span> The 2.9B G1 series (GGUF) is recommended; it runs smoothly on ordinary hardware. </div> </div> <div class="tip-item"> <i class="material-icons">public</i> <div> <span class="tip-strong">Multilingual tasks:</span> Prefer the World series, which reaches SOTA on multilingual benchmarks. </div> </div> <div class="tip-item"> <i class="material-icons">code</i> <div> <span class="tip-strong">Recommended backends:</span> <span style="color:var(--primary)">web-rwkv</span> (fastest), <span style="color:var(--primary)">llama.cpp</span> (general-purpose). </div> </div> </div> </div> <footer> <div class="links-grid"> <div class="link-item"><i class="material-icons">language</i> www.rwkv.com</div> <div class="link-item"><i class="material-icons">menu_book</i> wiki.rwkv.com</div> <div class="link-item"><i class="material-icons">cloud_download</i> Hugging Face: BlinkDL/rwkv7-g1</div> </div> </footer> </div> </body> </html>
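
The poster's point about having no KV cache can be made concrete with some back-of-the-envelope arithmetic. The sketch below compares how a Transformer-style KV cache grows with sequence length against a fixed-size recurrent state. The layer count, head count, and head size are hypothetical values chosen for illustration, and the state formula is a simplification (one square matrix per head per layer), not an exact accounting of RWKV-7's state.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes held by a Transformer KV cache after seq_len tokens."""
    # Two tensors (keys and values) per layer, each seq_len x n_kv_heads x head_dim.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def recurrent_state_bytes(n_layers, n_heads, head_dim, bytes_per_elem=2):
    """Bytes held by a fixed-size recurrent state (simplified assumption)."""
    # One head_dim x head_dim matrix per head per layer; there is no seq_len term.
    return n_layers * n_heads * head_dim * head_dim * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(n_layers=32, head_dim=64)
    state = recurrent_state_bytes(n_heads=40, **shape)
    for seq_len in (1_024, 8_192, 65_536):
        cache = kv_cache_bytes(seq_len, n_kv_heads=40, **shape)
        print(f"{seq_len:>6} tokens: KV cache {cache / 2**20:8.1f} MiB, "
              f"recurrent state {state / 2**20:6.1f} MiB")
```

The KV cache term scales with `seq_len`, while the recurrent state does not, which is the sense in which memory stays constant as the context grows.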

Discussion

1 reply
✨步子哥 (steper) #1
02-12 14:27
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Web-RWKV - Pure WebGPU Inference Engine</title> <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;600;700&family=JetBrains+Mono:wght@400;600;700&display=swap" rel="stylesheet"> <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet"> <style> :root { --bg-color: #1e1e1e; /* VS Code Dark+ */ --card-bg: #252526; --code-bg: #1e1e1e; --primary: #569cd6; /* Blue keyword */ --secondary: #ce9178; /* Orange string */ --accent: #4ec9b0; /* Cyan type */ --text-main: #d4d4d4; --text-muted: #858585; --success: #6a9955; /* Green comment */ --warning: #dcdcaa; /* Yellow function */ --purple: #c586c0; /* Purple control */ --border-radius: 8px; } * { box-sizing: border-box; margin: 0; padding: 0; } body { background-color: #121212; font-family: 'Inter', sans-serif; color: var(--text-main); display: flex; justify-content: center; align-items: center; min-height: 100vh; } .poster-container { width: 720px; min-height: 1280px; background: var(--bg-color); padding: 32px; display: flex; flex-direction: column; gap: 24px; position: relative; overflow: hidden; border: 1px solid #333; } /* Editor Header */ .editor-header { display: flex; align-items: center; gap: 12px; padding-bottom: 16px; border-bottom: 1px solid #333; } .window-controls { display: flex; gap: 8px; } .window-btn { width: 12px; height: 12px; border-radius: 50%; background: #333; } .window-btn.close { background: #f5475b; } .window-btn.min { background: #fbd84b; } .window-btn.max { background: #37d283; } .file-path { font-family: 'JetBrains Mono', monospace; font-size: 13px; color: var(--text-muted); } header { border-bottom: 1px solid #333; padding-bottom: 16px; } h1 { font-family: 'JetBrains Mono', monospace; font-size: 42px; font-weight: 700; color: var(--warning); /* Function color */ margin-bottom: 12px; } .subtitle { font-size: 16px; color: 
var(--accent); /* Type color */ margin-bottom: 8px; } .badges { display: flex; gap: 8px; flex-wrap: wrap; } .badge { background: rgba(86, 156, 214, 0.15); color: var(--primary); padding: 4px 8px; border-radius: 4px; font-size: 12px; font-family: 'JetBrains Mono', monospace; border: 1px solid rgba(86, 156, 214, 0.3); } /* Code Block Style */ .code-block { background: var(--code-bg); border: 1px solid #333; border-radius: var(--border-radius); padding: 16px; font-family: 'JetBrains Mono', monospace; font-size: 13px; position: relative; } .code-title { position: absolute; top: 0; right: 0; background: #333; padding: 2px 8px; border-radius: 0 0 0 4px; font-size: 10px; color: var(--text-muted); } .section-title { font-size: 18px; font-weight: 600; color: var(--text-main); margin-bottom: 12px; display: flex; align-items: center; gap: 8px; font-family: 'JetBrains Mono', monospace; } .section-title::before { content: '//'; color: var(--success); margin-right: 4px; } .feature-list { display: grid; grid-template-columns: repeat(2, 1fr); gap: 10px; } .feature-item { display: flex; align-items: flex-start; gap: 8px; font-size: 13px; line-height: 1.4; } .feature-item i { color: var(--accent); font-size: 16px; margin-top: 2px; } .keyword { color: var(--purple); } .function { color: var(--warning); } .string { color: var(--secondary); } .type { color: var(--accent); } .comment { color: var(--success); font-style: italic; } .operator { color: var(--text-main); } .provides-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 12px; } .provide-card { background: rgba(255, 255, 255, 0.02); padding: 12px; border-radius: 6px; border-left: 3px solid var(--success); } .not-provide-card { background: rgba(255, 255, 255, 0.02); padding: 12px; border-radius: 6px; border-left: 3px solid #f5475b; } .card-header { font-weight: 600; font-size: 14px; margin-bottom: 6px; color: var(--text-main); } .card-content { font-size: 12px; color: var(--text-muted); } .example-command { padding: 8px 
12px; background: rgba(255, 255, 255, 0.02); border-radius: 4px; margin-bottom: 6px; border-left: 2px solid var(--primary); font-family: 'JetBrains Mono', monospace; font-size: 12px; color: var(--text-main); } .hook-diagram { display: flex; gap: 8px; align-items: center; margin-top: 10px; font-size: 12px; } .hook-box { background: rgba(86, 156, 214, 0.1); border: 1px dashed var(--primary); padding: 6px; border-radius: 4px; text-align: center; font-size: 11px; color: var(--primary); } footer { margin-top: auto; padding-top: 16px; border-top: 1px solid #333; display: flex; justify-content: space-between; align-items: center; font-size: 12px; color: var(--text-muted); } .links { display: flex; gap: 16px; } .link { color: var(--accent); text-decoration: none; display: flex; align-items: center; gap: 4px; } </style> </head> <body> <div class="poster-container"> <!-- Editor Header Decoration --> <div class="editor-header"> <div class="window-controls"> <div class="window-btn close"></div> <div class="window-btn min"></div> <div class="window-btn max"></div> </div> <div class="file-path">~/projects/web-rwkv/README.md</div> </div> <header> <h1>Web-RWKV</h1> <div class="subtitle">Inference engine for RWKV implemented in pure WebGPU</div> <div class="badges"> <span class="badge">v0.10</span> <span class="badge">Rust</span> <span class="badge">WebGPU</span> <span class="badge">WASM</span> <span class="badge">Cross-Platform</span> </div> </header> <!-- Core Features --> <div class="code-block"> <div class="section-title">Core Features</div> <div class="feature-list"> <div class="feature-item"><i class="material-icons">check_circle</i>No CUDA/Python dependencies</div> <div class="feature-item"><i class="material-icons">check_circle</i>Support Nvidia/AMD/Intel GPUs</div> <div class="feature-item"><i class="material-icons">check_circle</i>Vulkan/Dx12/OpenGL backends</div> <div class="feature-item"><i class="material-icons">check_circle</i>WASM support (Browser ready)</div> <div 
class="feature-item"><i class="material-icons">check_circle</i>Batched inference</div> <div class="feature-item"><i class="material-icons">check_circle</i>Int8 and Float4 quantization</div> <div class="feature-item"><i class="material-icons">check_circle</i>Support RWKV V4 through V7</div> <div class="feature-item"><i class="material-icons">check_circle</i>LoRA merging & Model serialization</div> </div> </div> <!-- Scope --> <div class="code-block"> <div class="section-title">Functional Scope</div> <div class="provides-grid"> <div class="provide-card"> <div class="card-header">✅ Provides</div> <div class="card-content"> • Tokenizer<br> • Model Loading<br> • State Creation & Updating<br> • GPU-accelerated `run` & `softmax`<br> • Model Quantization </div> </div> <div class="not-provide-card"> <div class="card-header">❌ Does Not Provide</div> <div class="card-content"> • OpenAI-compatible API<br> • Built-in Samplers<br> • State Caching System<br> • Python Bindings </div> </div> </div> </div> <!-- Usage Examples --> <div class="code-block"> <div class="section-title">Usage Examples</div> <div class="example-command"> <span class="comment"># Performance Test (500 tokens)</span><br> cargo run --release --example gen </div> <div class="example-command"> <span class="comment"># Chat Demo</span><br> cargo run --release --example chat -- --model /path/to/model.st </div> <div class="example-command"> <span class="comment"># Quantization Example (First 32 layers)</span><br> cargo run --release --example chat -- --quant 32 </div> </div> <!-- Advanced Features --> <div class="code-block"> <div class="section-title">Advanced Features</div> <div style="margin-bottom: 10px;"> <span class="keyword">let</span> <span class="function">runtime</span> = TokioRuntime::new(bundle).await; <span class="comment">// Async runtime</span> </div> <div style="margin-bottom: 10px; font-size: 12px; color: var(--text-muted);"> The asynchronous runtime API allows CPU and GPU to work in parallel, 
maximizing utilization. </div> <div class="hook-diagram"> <div class="hook-box">Input Tokens</div> <span class="operator">→</span> <div class="hook-box" style="border-style: solid; border-color: var(--warning); color: var(--warning);">Hook Point</div> <span class="operator">→</span> <div class="hook-box">Output Logits</div> </div> <div style="margin-top: 8px; font-size: 12px;"> <span class="type">Hooks</span>: Inject tensor ops into inference process for dynamic LoRA, control net, etc. </div> </div> <!-- Convert & Troubleshoot --> <div class="code-block"> <div class="section-title">Model Conversion</div> <div class="example-command"> python assets/scripts/convert_safetensors.py <span class="operator">--input</span> model.pth <span class="operator">--output</span> model.st </div> </div> <footer> <div>© 2024 Web-RWKV Project</div> <div class="links"> <span class="link"><i class="material-icons" style="font-size:14px">code</i> GitHub</span> <span class="link"><i class="material-icons" style="font-size:14px">description</i> docs.rs/web-rwkv</span> </div> </footer> </div> </body> </html>
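
Because the poster lists samplers under "Does Not Provide", a caller has to pick the next token from the probabilities returned by the engine's `softmax` step. The following is a minimal sketch of nucleus (top-p) sampling over a plain list of probabilities. It is written in Python for brevity and is not part of the web-rwkv API; the function name `sample_top_p` and its parameters are hypothetical.

```python
import random


def sample_top_p(probs, top_p=0.9, rng=random):
    """Nucleus sampling: draw a token id from the smallest set of tokens
    whose cumulative probability reaches top_p."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token ids by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for token_id in ranked:
        nucleus.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break
    # Draw from the nucleus in proportion to the original probabilities.
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Toy distribution over a four-token vocabulary.
    probs = [0.05, 0.7, 0.2, 0.05]
    print(sample_top_p(probs, top_p=0.85))
```

With the toy distribution above and `top_p=0.85`, only token ids 1 and 2 can be returned, since together they already cover the requested probability mass. Passing a seeded `random.Random` instance as `rng` makes the draw reproducible.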