Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

步子哥 (steper) • December 11, 2025, 08:10
                        <!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models</title>
    <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;700&family=Roboto+Slab:wght@400;500;700&display=swap" rel="stylesheet">
    <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet">
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }
        
        body {
            font-family: 'Roboto', sans-serif;
            background-color: #f8f9fa;
            color: #1a237e;
            line-height: 1.6;
        }
        
        .poster-container {
            width: 920px;
            min-height: 960px;
            margin: 0 auto;
            background: linear-gradient(135deg, #e8eaf6 0%, #c5cae9 100%);
            padding: 40px;
            position: relative;
            overflow: hidden;
        }
        
        .poster-container::before {
            content: "";
            position: absolute;
            top: -150px;
            right: -150px;
            width: 400px;
            height: 400px;
            border-radius: 50%;
            background: linear-gradient(45deg, rgba(63, 81, 181, 0.1), rgba(103, 58, 183, 0.1));
            z-index: 0;
        }
        
        .poster-container::after {
            content: "";
            position: absolute;
            bottom: -100px;
            left: -100px;
            width: 300px;
            height: 300px;
            border-radius: 50%;
            background: linear-gradient(45deg, rgba(63, 81, 181, 0.1), rgba(103, 58, 183, 0.1));
            z-index: 0;
        }
        
        .header {
            text-align: center;
            margin-bottom: 30px;
            position: relative;
            z-index: 1;
        }
        
        .title {
            font-family: 'Roboto Slab', serif;
            font-size: 42px;
            font-weight: 700;
            color: #303f9f;
            margin-bottom: 10px;
            line-height: 1.2;
        }
        
        .subtitle {
            font-size: 22px;
            font-weight: 500;
            color: #5c6bc0;
            margin-bottom: 20px;
        }
        
        .section {
            background-color: rgba(255, 255, 255, 0.85);
            border-radius: 12px;
            padding: 20px;
            margin-bottom: 25px;
            box-shadow: 0 4px 12px rgba(0, 0, 0, 0.08);
            position: relative;
            z-index: 1;
        }
        
        .section-title {
            font-family: 'Roboto Slab', serif;
            font-size: 28px;
            font-weight: 700;
            color: #3949ab;
            margin-bottom: 15px;
            display: flex;
            align-items: center;
        }
        
        .section-title .material-icons {
            margin-right: 10px;
            color: #5c6bc0;
        }
        
        .section-content {
            font-size: 18px;
        }
        
        .highlight {
            background: linear-gradient(transparent 40%, rgba(124, 77, 255, 0.2) 40%, rgba(124, 77, 255, 0.2) 85%, transparent 85%);
            padding: 0 2px;
        }
        
        .problem-box {
            background-color: rgba(239, 83, 80, 0.1);
            border-left: 4px solid #ef5350;
            padding: 15px;
            margin: 15px 0;
            border-radius: 0 8px 8px 0;
        }
        
        .solution-box {
            background-color: rgba(76, 175, 80, 0.1);
            border-left: 4px solid #4caf50;
            padding: 15px;
            margin: 15px 0;
            border-radius: 0 8px 8px 0;
        }
        
        .architecture {
            display: flex;
            justify-content: space-between;
            margin: 20px 0;
        }
        
        .component {
            flex: 1;
            background-color: rgba(63, 81, 181, 0.08);
            border-radius: 8px;
            padding: 15px;
            margin: 0 5px;
            text-align: center;
            box-shadow: 0 2px 6px rgba(0, 0, 0, 0.05);
        }
        
        .component-title {
            font-weight: 700;
            color: #3949ab;
            margin-bottom: 10px;
            font-size: 20px;
        }
        
        .component-desc {
            font-size: 16px;
        }
        
        .arrow {
            display: flex;
            align-items: center;
            justify-content: center;
            color: #5c6bc0;
        }
        
        .results-container {
            display: flex;
            justify-content: space-between;
            margin: 20px 0;
        }
        
        .result-box {
            flex: 1;
            background-color: rgba(63, 81, 181, 0.08);
            border-radius: 8px;
            padding: 15px;
            margin: 0 5px;
            text-align: center;
        }
        
        .result-number {
            font-size: 36px;
            font-weight: 700;
            color: #3949ab;
        }
        
        .result-label {
            font-size: 16px;
        }
        
        .efficiency-table {
            width: 100%;
            border-collapse: collapse;
            margin: 15px 0;
        }
        
        .efficiency-table th, .efficiency-table td {
            padding: 10px;
            text-align: left;
            border-bottom: 1px solid #e0e0e0;
        }
        
        .efficiency-table th {
            background-color: rgba(63, 81, 181, 0.1);
            color: #3949ab;
        }
        
        .context-image {
            width: 100%;
            border-radius: 8px;
            margin: 15px 0;
            box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
        }
        
        .footer {
            text-align: center;
            margin-top: 30px;
            font-size: 14px;
            color: #5c6bc0;
            position: relative;
            z-index: 1;
        }
        
        .bullet-list {
            list-style-type: none;
            padding-left: 0;
        }
        
        .bullet-list li {
            position: relative;
            padding-left: 25px;
            margin-bottom: 8px;
        }
        
        .bullet-list li::before {
            content: "•";
            position: absolute;
            left: 0;
            color: #5c6bc0;
            font-weight: bold;
        }
    </style>
</head>
<body>
    <div class="poster-container">
        <div class="header">
            <h1 class="title">Agentic Context Engineering</h1>
            <h2 class="subtitle">Evolving Contexts for Self-Improving Language Models</h2>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">info</i>
                Introduction
            </h3>
            <div class="section-content">
                <p>Large Language Model applications increasingly rely on <span class="highlight">context adaptation</span> rather than weight updates. Current approaches suffer from two critical limitations:</p>
                
                <div class="problem-box">
                    <p><strong>Brevity bias:</strong> Over-prioritizing concise summaries at the expense of detailed domain insights</p>
                    <p><strong>Context collapse:</strong> Iterative rewriting erodes details over time, leading to performance drops</p>
                </div>
                
                <p>ACE treats contexts as <span class="highlight">evolving playbooks</span> that accumulate, refine, and organize strategies through a modular process.</p>
            </div>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">architecture</i>
                Three-Role Architecture
            </h3>
            <div class="section-content">
                <div class="architecture">
                    <div class="component">
                        <div class="component-title">Generator</div>
                        <div class="component-desc">Produces reasoning trajectories for new queries, surfacing effective strategies and pitfalls</div>
                    </div>
                    
                    <div class="arrow">
                        <i class="material-icons">arrow_forward</i>
                    </div>
                    
                    <div class="component">
                        <div class="component-title">Reflector</div>
                        <div class="component-desc">Critiques generated traces, distilling insights from successes and errors</div>
                    </div>
                    
                    <div class="arrow">
                        <i class="material-icons">arrow_forward</i>
                    </div>
                    
                    <div class="component">
                        <div class="component-title">Curator</div>
                        <div class="component-desc">Synthesizes insights into structured "delta entries" and integrates them into existing context</div>
                    </div>
                </div>
            </div>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">lightbulb</i>
                Key Innovations
            </h3>
            <div class="section-content">
                <div class="solution-box">
                    <h4 style="color: #3949ab; margin-bottom: 10px;">Incremental Delta Updates</h4>
                    <ul class="bullet-list">
                        <li>Contexts represented as structured, itemized "bullets" with metadata and content</li>
                        <li>Small, localized edits preserve prior knowledge while accumulating new insights</li>
                        <li>Non-LLM logic for deterministic merging, de-duplication, and pruning</li>
                    </ul>
                </div>
                
                <div class="solution-box">
                    <h4 style="color: #3949ab; margin-bottom: 10px;">Grow-and-Refine Mechanism</h4>
                    <ul class="bullet-list">
                        <li>Balances context expansion with periodic refinement</li>
                        <li>Maintains relevance and prevents unbounded growth</li>
                        <li>Enables efficient, parallel merging crucial for scalability</li>
                    </ul>
                </div>
            </div>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">trending_up</i>
                Performance Results
            </h3>
            <div class="section-content">
                <p>ACE consistently outperforms strong baselines across agent and domain-specific benchmarks:</p>
                
                <div class="results-container">
                    <div class="result-box">
                        <div class="result-number">+10.6%</div>
                        <div class="result-label">Agent Tasks (AppWorld)</div>
                    </div>
                    
                    <div class="result-box">
                        <div class="result-number">+8.6%</div>
                        <div class="result-label">Financial Analysis (FiNER + XBRL)</div>
                    </div>
                </div>
                
                <p>Matches the top-ranked production-level agent on the AppWorld leaderboard (overall average) while using a smaller open-source model.</p>
                
                <img src="https://sfile.chatglm.cn/moeSlide/image/2d/2dc07c66.jpg" alt="Context-Quality Curve" class="context-image">
            </div>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">speed</i>
                Efficiency Gains
            </h3>
            <div class="section-content">
                <p>ACE lowers adaptation latency and cost relative to existing context-adaptation methods:</p>
                
                <table class="efficiency-table">
                    <tr>
                        <th>Metric</th>
                        <th>Offline vs GEPA</th>
                        <th>Online vs Dynamic Cheatsheet</th>
                    </tr>
                    <tr>
                        <td>Latency Reduction</td>
                        <td>82.3%</td>
                        <td>91.5%</td>
                    </tr>
                    <tr>
                        <td>Rollout Count (Offline) / Token Cost (Online) Reduction</td>
                        <td>75.1%</td>
                        <td>83.6%</td>
                    </tr>
                </table>
                
                <p>Adapts effectively <span class="highlight">without labeled supervision</span> by leveraging natural execution feedback.</p>
            </div>
        </div>
        
        <div class="section">
            <h3 class="section-title">
                <i class="material-icons">insights</i>
                Implications
            </h3>
            <div class="section-content">
                <ul class="bullet-list">
                    <li>Enables scalable, efficient, and self-improving LLM systems</li>
                    <li>Keeps learned knowledge in interpretable contexts, with lower overhead than fine-tuning</li>
                    <li>Offers a flexible approach for online and continuous learning</li>
                    <li>Particularly valuable for specialized domains and long-context applications</li>
                </ul>
            </div>
        </div>
        
        <div class="footer">
            <p>arXiv:2510.04618 | Code available at github.com/ace-agent/ace</p>
        </div>
    </div>
</body>
</html>                    
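
The "Incremental Delta Updates" and "Grow-and-Refine" items in the poster can be illustrated with a small Python sketch. It is not the authors' implementation: the `Bullet` fields, the `merge_delta` and `refine` function names, the `max_bullets` cap, and the helpful-minus-harmful pruning score are all assumptions made for illustration, and normalized-text matching stands in for whatever similarity check a real de-duplication step would use. The point it shows is that merging is plain deterministic logic with no LLM call, and that existing bullets are never rewritten wholesale.

```python
from dataclasses import dataclass


@dataclass
class Bullet:
    """One itemized playbook entry: metadata (id, counters) plus content."""
    bullet_id: str
    content: str
    helpful: int = 0   # hypothetical counter: times the bullet was marked helpful
    harmful: int = 0   # hypothetical counter: times the bullet was marked harmful


def _norm(text: str) -> str:
    # Assumption: case- and whitespace-insensitive equality stands in for
    # a real similarity check when detecting duplicate bullets.
    return " ".join(text.lower().split())


def merge_delta(playbook: list[Bullet], delta: list[Bullet]) -> list[Bullet]:
    """Apply delta entries as localized edits; untouched bullets stay as they are."""
    merged = [Bullet(b.bullet_id, b.content, b.helpful, b.harmful) for b in playbook]
    by_id = {b.bullet_id: b for b in merged}
    by_text = {_norm(b.content): b for b in merged}
    for entry in delta:
        target = by_id.get(entry.bullet_id) or by_text.get(_norm(entry.content))
        if target is None:
            # New insight: append it rather than rewriting the whole context.
            new = Bullet(entry.bullet_id, entry.content, entry.helpful, entry.harmful)
            merged.append(new)
            by_id[new.bullet_id] = new
            by_text[_norm(new.content)] = new
        else:
            # Known bullet or duplicate text: only update its counters.
            target.helpful += entry.helpful
            target.harmful += entry.harmful
    return merged


def refine(playbook: list[Bullet], max_bullets: int) -> list[Bullet]:
    """Prune only when the playbook has grown past max_bullets."""
    if len(playbook) <= max_bullets:
        return playbook
    # Keep the highest-scoring bullets, preserving their original order.
    ranked = sorted(playbook, key=lambda b: b.helpful - b.harmful, reverse=True)
    keep = {b.bullet_id for b in ranked[:max_bullets]}
    return [b for b in playbook if b.bullet_id in keep]


playbook = [Bullet("b1", "Check API pagination before looping", helpful=2)]
delta = [
    Bullet("b2", "check api pagination  before looping", helpful=1),  # duplicate text
    Bullet("b3", "Validate dates as YYYY-MM-DD", helpful=1),
]
playbook = refine(merge_delta(playbook, delta), max_bullets=10)
```

Because each delta touches only the bullets it names or duplicates, independent deltas could in principle be merged in any order, which is one way to read the poster's note about parallel merging.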