## Paper Overview
**Research area**: ML
**Authors**: Rajinder Sandhu, Di Mu, Cheng Chang
**Published**: 2025-04-28
**arXiv**: [2504.19766](https://arxiv.org/abs/2504.19766)
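## Translated Abstract
Dense vector retrieval is the practical backbone of Retrieval-Augmented Generation (RAG), but similarity search can be limited by insufficient precision. Conversely, utility-based approaches that use LLM re-ranking usually achieve better performance, but their computational cost is prohibitive. We propose Utility-Aligned Embeddings (UAE), a framework that aims to merge these advantages into a practical, high-performance retrieval method. We model retrieval as a distribution matching problem and train a bi-encoder, through a Utility-Modulated InfoNCE objective, to imitate a utility distribution derived from perplexity reduction. This approach injects graded utility signals directly into the embedding space, with no LLM inference required at test time. On the QASPER benchmark, UAE improves over BGE-Base by 30.59% in Recall@1, 30.16% in MAP, and 17.3% in Token F1. UAE is more than 180 times faster than efficient LLM re-ranking methods while maintaining competitive performance.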
## Original Abstract
Dense vector retrieval is the practical backbone of Retrieval-Augmented Generation (RAG), but similarity search can suffer from precision limitations. Conversely, utility-based approaches leveraging LLM re-ranking often achieve superior performance but are computationally prohibitive. We propose Utility-Aligned Embeddings (UAE), a framework designed to merge these advantages into a practical, high-performance retrieval method. We formulate retrieval as a distribution matching problem, training a bi-encoder to imitate a utility distribution derived from perplexity reduction using a Utility-Modulated InfoNCE objective. This approach injects graded utility signals directly into the embedding space without requiring test-time LLM inference. On the QASPER benchmark, UAE improves retrieval Recal...
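
To make the distribution-matching idea concrete, here is a minimal NumPy sketch of how a graded utility signal could shape an InfoNCE-style loss. The abstract does not give the exact form of the Utility-Modulated InfoNCE objective, so everything below is an assumption for illustration: the function names (`utility_from_perplexity`, `distribution_matching_loss`), the use of log-perplexity differences as utility, both temperatures, and the cross-entropy between two softmax distributions are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a 1-D array of scores."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


def utility_from_perplexity(ppl_without, ppl_with):
    """Assumed utility: reduction in log-perplexity of the answer when a
    passage is added to the context (positive means the passage helps)."""
    return np.log(ppl_without) - np.log(np.asarray(ppl_with, dtype=float))


def distribution_matching_loss(query_vec, passage_vecs, utilities,
                               tau_model=0.05, tau_utility=1.0):
    """Cross-entropy between a utility-derived target distribution and the
    bi-encoder's similarity distribution over the same candidate passages.

    Assumption: with a one-hot target this reduces to standard InfoNCE;
    a soft target is one way to inject graded utility.
    """
    q = np.asarray(query_vec, dtype=float)
    p = np.asarray(passage_vecs, dtype=float)
    q = q / np.linalg.norm(q)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    sims = p @ q  # cosine similarity per passage
    log_model = log_softmax(sims, tau_model)
    target = np.exp(log_softmax(utilities, tau_utility))
    return float(-(target * log_model).sum())


# Toy example: three passages, ordered from most to least similar to the query.
query = np.array([1.0, 0.0])
passages = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])

# Hypothetical perplexities of the answer with each passage in context.
utilities = utility_from_perplexity(10.0, [5.0, 10.0, 20.0])

loss_aligned = distribution_matching_loss(query, passages, utilities)
loss_reversed = distribution_matching_loss(query, passages, utilities[::-1])
print(loss_aligned < loss_reversed)
```

In this toy setup the loss is lower when the similarity ranking agrees with the utility ranking than when the utilities are reversed, which is the behavior a training signal of this kind is meant to reward. At test time only the embeddings are needed, so no LLM call appears in the retrieval path.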
---
*Automatically collected on 2026-04-28*
#Paper #arXiv #ML #小凯