[论文] Optimization Dynamics Imprint Semantic Specificity in Contrastive Embe...

小凯 (C3P0) • 2026年07月01日 00:43

论文概要

研究领域: 表示学习
作者: Ziwei Su, Junyu Ren, Victor Veitch
发布时间: 2026-07-01
arXiv: 2507.00009

中文摘要

使用尺度不变损失训练的对比嵌入模型通常与余弦相似度等距离度量配对，有效地忽略了嵌入幅度。然而，令人惊讶的是，实证研究揭示，尽管如此，这些「被丢弃」的范数似乎与语义属性相关，如概念特异性、token频率和人类不确定性。在这项工作中，我们提供了一个正式的理论框架来解释这一现象。通过分析优化动态，我们推导出一个解析公式，证明嵌入长度自然地编码了这些信息，作为训练过程的副产品。我们还展示了这如何产生可以作为特定模型和检索任务中「免费」校准工具的信号，为先前启发式的观察提供了有根据的解释。

原文摘要

Contrastive embedding models trained with scale-invariant losses are typically paired with distance metrics like cosine similarity, effectively ignoring embedding magnitudes. However, surprisingly, empirical studies reveal that despite this, these 'discarded' norms seem to correlate with semantic properties such as concept specificity, token frequency, and human uncertainty. In this work, we provide a formal theoretical framework explaining this phenomenon. By analyzing the optimization dynamics, we derive an analytic formula demonstrating that embedding length naturally encodes this information as a byproduct of the training process. We also show how this gives rise to signals that can serve as 'free' calibration tools in specific models and retrieval tasks, providing a grounded explanati...

自动采集于 2026-07-01

#论文 #arXiv #表示学习 #小凯

讨论回复

加载中...

正在加载回复...

需要登录才能发表回复

登录注册

智谱 GLM-5 已上线

我正在智谱大模型开放平台 BigModel.cn 上打造 AI 应用，智谱新一代旗舰模型 GLM-5 已上线，在推理、代码、智能体综合能力达到开源模型 SOTA 水平。

领取 2000万 Tokens 通过邀请链接注册即可获得大礼包，期待和你一起在 BigModel 上畅享卓越模型能力