ELLA Efficient Lifelong Learning for LLMs

✨步子哥 (steper) January 9, 2026, 11:42
<!DOCTYPE html> <html lang="zh-CN"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>ELLA Framework Poster</title> <style> @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;600;800&family=JetBrains+Mono:wght@400;700&display=swap'); * { box-sizing: border-box; margin: 0; padding: 0; } body { width: 720px; height: 960px; background: #050511; font-family: 'Inter', sans-serif; color: #e2e8f0; overflow: hidden; display: flex; flex-direction: column; position: relative; } /* Background Effects */ .bg-gradient { position: absolute; top: 0; left: 0; width: 100%; height: 100%; background: radial-gradient(circle at 10% 20%, rgba(76, 29, 149, 0.25) 0%, transparent 40%), radial-gradient(circle at 90% 80%, rgba(6, 182, 212, 0.15) 0%, transparent 40%); z-index: 0; } .grid-lines { position: absolute; top: 0; left: 0; width: 100%; height: 100%; background-image: linear-gradient(rgba(255, 255, 255, 0.03) 1px, transparent 1px), linear-gradient(90deg, rgba(255, 255, 255, 0.03) 1px, transparent 1px); background-size: 40px 40px; z-index: 1; } /* Content Container */ .container { position: relative; z-index: 10; padding: 40px; height: 100%; display: flex; flex-direction: column; gap: 24px; } /* Header */ header { display: flex; justify-content: space-between; align-items: flex-start; border-bottom: 1px solid rgba(255, 255, 255, 0.1); padding-bottom: 20px; } .title-block h1 { font-size: 64px; font-weight: 800; line-height: 1; background: linear-gradient(135deg, #fff 0%, #94a3b8 100%); -webkit-background-clip: text; -webkit-text-fill-color: transparent; letter-spacing: -2px; margin-bottom: 8px; } .title-block p { font-size: 18px; color: #64748b; font-weight: 400; text-transform: uppercase; letter-spacing: 2px; } .badge { background: rgba(6, 182, 212, 0.1); border: 1px solid rgba(6, 182, 212, 0.3); color: #22d3ee; padding: 6px 12px; border-radius: 20px; font-size: 12px; 
font-weight: 600; text-transform: uppercase; } /* Main Grid */ .grid-layout { display: grid; grid-template-columns: 1fr 1fr; grid-template-rows: auto auto 1fr; gap: 20px; flex-grow: 1; } .full-width { grid-column: 1 / -1; } /* Cards */ .card { background: rgba(255, 255, 255, 0.03); border: 1px solid rgba(255, 255, 255, 0.08); border-radius: 16px; padding: 20px; backdrop-filter: blur(10px); display: flex; flex-direction: column; gap: 12px; transition: transform 0.3s ease; } .card-title { font-size: 16px; font-weight: 600; color: #a5b4fc; display: flex; align-items: center; gap: 8px; } .card-title::before { content: ''; display: block; width: 4px; height: 16px; background: #6366f1; border-radius: 2px; } .card-content { font-size: 14px; line-height: 1.6; color: #cbd5e1; } /* Highlighted Text */ .highlight { color: #fff; font-weight: 600; } /* Math Section */ .math-box { background: #0f172a; border-radius: 8px; padding: 15px; font-family: 'JetBrains Mono', monospace; color: #4ade80; font-size: 13px; text-align: center; border: 1px dashed #334155; margin-top: 5px; } /* Visual Representation of Subspaces */ .subspace-visual { display: flex; justify-content: center; align-items: center; gap: 10px; margin-top: 10px; } .circle { border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 10px; font-weight: bold; backdrop-filter: blur(5px); border: 1px solid rgba(255,255,255,0.2); } .circle-past { width: 60px; height: 60px; background: rgba(99, 102, 241, 0.2); border-color: #6366f1; z-index: 1; } .circle-new { width: 50px; height: 50px; background: rgba(6, 182, 212, 0.2); border-color: #22d3ee; margin-left: -15px; z-index: 2; } .visual-label { text-align: center; font-size: 11px; color: #64748b; margin-top: 5px; } /* Results Section */ .stats-container { display: flex; justify-content: space-around; align-items: center; margin-top: 10px; } .stat-item { text-align: center; } .stat-value { font-size: 36px; font-weight: 800; background: 
linear-gradient(to bottom, #fff, #94a3b8); -webkit-background-clip: text; -webkit-text-fill-color: transparent; } .stat-desc { font-size: 12px; color: #64748b; text-transform: uppercase; margin-top: 4px; } .stat-plus { font-size: 24px; color: #22d3ee; vertical-align: super; } /* Footer */ .footer { margin-top: auto; border-top: 1px solid rgba(255, 255, 255, 0.1); padding-top: 15px; display: flex; justify-content: space-between; align-items: flex-end; font-size: 10px; color: #475569; } .source-link { color: #64748b; text-decoration: none; } .disclaimer { max-width: 70%; } /* Key Features List */ .features { list-style: none; display: grid; grid-template-columns: 1fr 1fr; gap: 10px; } .features li { display: flex; align-items: center; gap: 8px; font-size: 12px; color: #cbd5e1; } .features li::before { content: '✓'; color: #22d3ee; font-weight: bold; } </style> </head> <body> <div class="bg-gradient"></div> <div class="grid-lines"></div> <div class="container"> <header> <div class="title-block"> <h1>ELLA</h1> <p>Efficient Lifelong Learning for LLMs</p> </div> <div class="badge">SOTA 2026</div> </header> <div class="grid-layout"> <!-- Intro / Problem --> <div class="card full-width"> <div class="card-title">Core Problem: Catastrophic Forgetting</div> <div class="card-content"> When large language models (LLMs) learn new tasks sequentially, they tend to overwrite previously learned knowledge. Conventional <span class="highlight">orthogonal update</span> methods impose overly strict constraints, which limits positive knowledge transfer. <br><br> <strong>ELLA's breakthrough:</strong> a <span class="highlight" style="color:#22d3ee">selective subspace de-correlation</span> strategy. A lightweight regularizer penalizes alignment with the directions critical to past tasks, while general-purpose low-energy subspaces are reused. </div> </div> <!-- Innovation --> <div class="card" style="grid-row: span 2;"> <div class="card-title">How It Works</div> <div class="card-content"> <p>ELLA limits interference with an anisotropic shrinkage operator, balancing stability and plasticity.</p> <div class="math-box"> L<sub>ELLA</sub> = || ΔW<sub>t</sub> ⊙ W<sub>past</sub> ||²<sub>F</sub> </div> <div style="font-size: 11px; text-align: center; color: #64748b; margin-top: 5px;"> The regularizer penalizes alignment of new updates with high-energy directions from past tasks </div> <div style="margin-top: 20px;"> <div class="subspace-visual"> <div 
class="circle circle-past">W<sub>past</sub></div> <div class="circle circle-new">ΔW<sub>t</sub></div> </div> <div class="visual-label">Selectively penalizing overlapping subspaces</div> </div> </div> </div> <!-- Key Features --> <div class="card"> <div class="card-title">Framework Advantages</div> <ul class="features"> <li>No stored past data</li> <li>No parameter expansion</li> <li>35× smaller memory footprint</li> <li>Stronger zero-shot generalization</li> <li>Minimal compute overhead</li> <li>Works with T5 / LLaMA</li> </ul> </div> <!-- Results --> <div class="card full-width" style="background: linear-gradient(90deg, rgba(99, 102, 241, 0.1) 0%, rgba(6, 182, 212, 0.05) 100%);"> <div class="card-title">Experimental Results</div> <div class="card-content"> Achieves state-of-the-art (SOTA) performance on multiple benchmarks, markedly improving how well the model handles both new and old tasks. <div class="stats-container"> <div class="stat-item"> <div class="stat-value">9.6<span class="stat-plus">%</span></div> <div class="stat-desc">Accuracy gain</div> </div> <div class="stat-item"> <div class="stat-value">35<span class="stat-plus">×</span></div> <div class="stat-desc">Memory reduction</div> </div> </div> </div> </div> </div> <div class="footer"> <div class="disclaimer"> Disclaimer: video generated automatically by NotebookLM. Source: ELLA: Efficient Lifelong Learning for Adapters in Large Language Models (arXiv:2601.02232) </div> <div class="source-link">arxiv.org/pdf/2601.02232</div> </div> </div> </body> </html>
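
The poster's regularization term, L_ELLA = || ΔW_t ⊙ W_past ||²_F, can be illustrated with a small sketch. This is only a reading of the formula as printed on the poster, not the paper's implementation: it assumes ⊙ is the element-wise product and that W_past is already available as a matrix with the same shape as ΔW_t. The function name `ella_penalty` and the 2×2 values are hypothetical, chosen purely for illustration.

```python
def ella_penalty(delta_w, w_past):
    """Squared Frobenius norm of the element-wise product delta_w ⊙ w_past.

    Both arguments are matrices given as lists of rows. How w_past is built
    from earlier tasks is not stated on the poster; here it is simply given.
    """
    same_shape = len(delta_w) == len(w_past) and all(
        len(row_d) == len(row_p) for row_d, row_p in zip(delta_w, w_past)
    )
    if not same_shape:
        raise ValueError("delta_w and w_past must have the same shape")
    return sum(
        (d * p) ** 2
        for row_d, row_p in zip(delta_w, w_past)
        for d, p in zip(row_d, row_p)
    )


# Hypothetical past-task matrix: a large entry (high energy) and a small one.
w_past = [[2.0, 0.0],
          [0.0, 0.5]]

# An update that overlaps the high-energy entry is penalized heavily ...
overlapping = [[1.0, 0.0],
               [0.0, 0.0]]
# ... while the same-sized update on the low-energy entry costs far less.
low_energy = [[0.0, 0.0],
              [0.0, 1.0]]

print(ella_penalty(overlapping, w_past))  # 4.0
print(ella_penalty(low_energy, w_past))   # 0.25
```

Under these assumptions, the penalty grows with the magnitude of the past entries that the new update touches, which matches the poster's description of penalizing alignment with high-energy directions while leaving low-energy ones largely free for reuse.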
