Paper Overview
Research area: ML
Authors: Weixian Xu, Shilong Liu, Mengdi Wang
Published: 2026-06-09
arXiv: 2606.11182
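Summary
EEVEE is the first multi-dataset test-time prompt learning framework for LLM agents, supporting test-time prompt learning under real-world task streams. It introduces a router that partitions inputs into task clusters and assigns each cluster to a suitable prompt configuration; router and prompts are optimized jointly through a router-prompt co-evolution strategy. Experiments show that, compared with Qwen3-4B-Instruct and DeepSeek-V3.2, the average multi-benchmark score improves by 10.38 and 24.32 points, respectively, and that EEVEE outperforms the SOTA methods GEPA and ACE by 37.2% and 48.2%.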
Original Abstract
In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt learning under real-world task streams. Existing methods are largely designed for single-dataset settings, while real-world applications require models to handle heterogeneous input streams drawn from multiple datasets, domains, and task distributions, limiting their practical applicability. To mitigate cross-dataset interference, EEVEE introduces a router that partitions incoming inputs into task clusters and assigns them to suitable prompt configurations. This design is optimized via a router-prompt co-evolution strategy, which employs interleaved router and prompt learning phases to address their mutual dependency. Experiments across multiple datasets demonstrat...
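The interleaved router and prompt phases described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the bag-of-words `Router`, the `score` stand-in for running an agent, the candidate prompts, and the `co_evolve` loop are all hypothetical names and simplifications chosen only to show how a router and per-cluster prompts might be updated in alternation.

```python
from collections import Counter

def tokens(text):
    return text.lower().split()

def score(prompt, text):
    # Hypothetical stand-in for "run the agent with this prompt and grade it":
    # 1.0 if the task tag before ':' appears as a word in the prompt, else 0.0.
    tag = text.split(":")[0].strip().lower()
    return 1.0 if tag in tokens(prompt) else 0.0

class Router:
    """Toy router: one bag of words per cluster, route by word overlap."""
    def __init__(self, n_clusters):
        self.bags = [Counter() for _ in range(n_clusters)]

    def route(self, text):
        words = tokens(text)
        overlaps = [sum(bag[w] for w in words) for bag in self.bags]
        # Ties resolve to the lowest cluster index.
        return max(range(len(self.bags)), key=lambda k: overlaps[k])

    def fit(self, texts, assignments):
        self.bags = [Counter() for _ in self.bags]
        for text, k in zip(texts, assignments):
            self.bags[k].update(tokens(text))

def co_evolve(texts, candidates, n_clusters, rounds):
    # Assumption: at least n_clusters candidate prompts are available.
    router = Router(n_clusters)
    prompts = list(candidates[:n_clusters])
    for _ in range(rounds):
        # Router phase (prompts fixed): send each input to the cluster
        # whose current prompt scores best on it, then refit the router.
        targets = [max(range(n_clusters), key=lambda k: score(prompts[k], t))
                   for t in texts]
        router.fit(texts, targets)
        # Prompt phase (router fixed): per cluster, keep the candidate
        # prompt with the highest total score on the inputs routed there.
        assignments = [router.route(t) for t in texts]
        for k in range(n_clusters):
            members = [t for t, a in zip(texts, assignments) if a == k]
            if members:
                prompts[k] = max(candidates,
                                 key=lambda p: sum(score(p, t) for t in members))
    return router, prompts

TEXTS = ["math: add 2 and 3", "code: reverse a list",
         "math: factor 12", "code: parse a date"]
CANDIDATES = ["You are a helpful assistant.",
              "You are a careful math solver.",
              "You are a code assistant."]

router, prompts = co_evolve(TEXTS, CANDIDATES, n_clusters=2, rounds=2)
print(prompts[router.route("code: sort a list")])
```

In this toy setting each phase depends on the other's latest output, which is the mutual dependency the abstract mentions; a real system would replace `score` with actual agent rollouts and `Router` with a learned model.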
Automatically collected on 2026-06-11
#Paper #arXiv #ML #小凯