
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

✨步子哥 (steper) · December 11, 2025, 08:10

Introduction

Large language model (LLM) applications increasingly rely on context adaptation rather than weight updates. Current approaches suffer from two critical limitations:

Brevity bias: Over-prioritizing concise summaries at the expense of detailed domain insights

Context collapse: Iterative rewriting erodes details over time, leading to performance drops

ACE treats contexts as evolving playbooks that accumulate, refine, and organize strategies through a modular process.

Three-Role Architecture

Generator → Reflector → Curator

  • Generator: Produces reasoning trajectories for new queries, surfacing effective strategies and pitfalls
  • Reflector: Critiques the generated traces, distilling insights from successes and errors
  • Curator: Synthesizes those insights into structured "delta entries" and integrates them into the existing context
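
To make the pipeline concrete, here is a minimal sketch of one adaptation step, assuming a generic `llm(prompt) -> str` callable for model calls and execution feedback obtained by running the trajectory in the target environment. The prompts, function names, and the plain string merge are illustrative placeholders, not the paper's implementation.

```python
# Minimal sketch of one ACE adaptation step (illustrative).
from typing import Callable

LLM = Callable[[str], str]  # any prompt -> completion function


def generate(llm: LLM, query: str, playbook: str) -> str:
    """Generator: produce a reasoning trajectory conditioned on the playbook."""
    return llm(f"Playbook:\n{playbook}\n\nTask: {query}\nSolve step by step.")


def reflect(llm: LLM, query: str, trajectory: str, feedback: str) -> str:
    """Reflector: distill insights from the trajectory and its execution feedback."""
    return llm(
        f"Task: {query}\nTrajectory:\n{trajectory}\nExecution feedback: {feedback}\n"
        "List what worked, what failed, and reusable strategies."
    )


def curate(llm: LLM, insights: str, playbook: str) -> str:
    """Curator: turn insights into candidate delta bullets, one per line."""
    return llm(
        f"Existing playbook:\n{playbook}\n\nNew insights:\n{insights}\n"
        "Propose concise new bullets not already covered, one per line."
    )


def ace_step(llm: LLM, query: str, feedback: str, playbook: str) -> str:
    """Run one Generator -> Reflector -> Curator pass and merge the deltas."""
    trajectory = generate(llm, query, playbook)
    insights = reflect(llm, query, trajectory, feedback)
    deltas = curate(llm, insights, playbook)
    # Placeholder merge; the real merge is deterministic, non-LLM logic
    # (see the delta-update sketch under Key Innovations below).
    return playbook + "\n" + deltas if deltas.strip() else playbook
```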

Key Innovations

Incremental Delta Updates

  • Contexts represented as structured, itemized "bullets" with metadata and content
  • Small, localized edits preserve prior knowledge while accumulating new insights
  • Non-LLM logic for deterministic merging, de-duplication, and pruning
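
A minimal sketch of how such a bullet store and deterministic merge might look, assuming Python 3.9+. The `Bullet` fields (including the helpful/harmful counters) and the exact-match de-duplication are simplified assumptions, not the paper's exact schema.

```python
# Sketch of an itemized playbook with deterministic, non-LLM delta merging.
from dataclasses import dataclass, field


@dataclass
class Bullet:
    """One itemized context entry with lightweight usage metadata."""
    bullet_id: int
    content: str
    helpful: int = 0  # times this bullet was marked useful
    harmful: int = 0  # times it was marked misleading


@dataclass
class Playbook:
    bullets: list[Bullet] = field(default_factory=list)
    next_id: int = 0

    def merge_delta(self, new_contents: list[str]) -> None:
        """Append new bullets, skipping exact duplicates (no LLM involved)."""
        existing = {b.content.strip().lower() for b in self.bullets}
        for text in new_contents:
            key = text.strip().lower()
            if key and key not in existing:
                self.bullets.append(Bullet(self.next_id, text.strip()))
                self.next_id += 1
                existing.add(key)

    def render(self) -> str:
        """Serialize the playbook into the prompt context."""
        return "\n".join(f"[{b.bullet_id}] {b.content}" for b in self.bullets)
```

With this store, the Curator's line-per-bullet output can be merged via `playbook.merge_delta(deltas.splitlines())`, replacing the naive string concatenation in the earlier sketch.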

Grow-and-Refine Mechanism

  • Balances context expansion with periodic refinement
  • Maintains relevance and prevents unbounded growth
  • Enables efficient, parallel merging crucial for scalability
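
A sketch of one possible refinement pass, reusing the `Playbook` and `Bullet` types from the sketch above; the pruning rule and the size cap are illustrative assumptions about how relevance could be maintained, not the paper's exact procedure.

```python
def refine(playbook: Playbook, max_bullets: int = 200) -> None:
    """Periodically prune the playbook so growth stays bounded."""
    # Drop bullets that have proven more misleading than useful.
    playbook.bullets = [b for b in playbook.bullets if b.helpful >= b.harmful]
    # If the playbook is still too large, keep the most frequently helpful bullets.
    if len(playbook.bullets) > max_bullets:
        playbook.bullets.sort(key=lambda b: b.helpful, reverse=True)
        del playbook.bullets[max_bullets:]
```

Because each delta touches only individual bullets, deltas produced by parallel rollouts can be merged independently and then de-duplicated in a single refinement pass.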

Performance Results

ACE consistently outperforms strong baselines across agent and domain-specific benchmarks:

  • +10.6% on agent tasks (AppWorld)
  • +8.6% on financial analysis (FiNER + XBRL)

ACE matches the top-ranked production-level agent on the AppWorld leaderboard while using a smaller open-source model.

[Figure: Context-Quality Curve]

Efficiency Gains

ACE achieves significant efficiency improvements compared to existing methods:

Metric                          Offline (vs GEPA)   Online (vs Dynamic Cheatsheet)
Latency reduction               82.3%               91.5%
Rollout/token cost reduction    75.1%               83.6%

ACE adapts effectively without labeled supervision by leveraging natural execution feedback.

Implications

  • Enables scalable, efficient, and self-improving LLM systems
  • Provides interpretable contexts at lower overhead than fine-tuning
  • Offers a flexible approach for online and continuous learning
  • Particularly valuable for specialized domains and long-context applications
