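Optimal Deterministic Multicalibration and Omniprediction
Paper Overview
Field: ML Authors: Georgy Noarov, Aaron Roth Published: 2025-06-23 arXiv: 2506.18496
Abstract (translated from the Chinese summary)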
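A model is multicalibrated on a collection of group weights G if it is calibrated (i.e. unbiased even conditional on its prediction) not just overall, but also after reweighting contexts by each g in G. This property is useful for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal sample complexity for epsilon-multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and arises implicitly in several prior works. This paper resolves the open problem by giving a minimax-optimal multicalibration algorithm that outputs a deterministic predictor. The algorithm is then generalized to yield optimal deterministic predictors satisfying outcome indistinguishability (OI) with respect to finite or finitely coverable collections of tests. As an application, this also gives deterministic omnipredictors and panpredictors with optimal sample complexity, resolving open problems posed by [OKK25] and [BHHLZ25].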
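To make the definition concrete, here is a minimal sketch of how one might measure the empirical multicalibration error of a fixed deterministic predictor on a sample. It is not the paper's algorithm, and the aggregation used (maximum over groups of the summed absolute bias per prediction level) is only one common convention, assumed here for illustration; the paper's exact error measure may differ. The function name `multicalibration_error` and the toy data are hypothetical.

```python
from collections import defaultdict


def multicalibration_error(contexts, preds, labels, group_weights):
    """Assumed error notion: max over g of sum over levels v of
    |(1/n) * sum_i g(x_i) * 1[p_i == v] * (y_i - v)|."""
    n = len(labels)
    worst = 0.0
    for g in group_weights:
        # Accumulate the g-weighted bias separately for each predicted value.
        bias_by_level = defaultdict(float)
        for x, p, y in zip(contexts, preds, labels):
            bias_by_level[p] += g(x) * (y - p)
        err = sum(abs(b) for b in bias_by_level.values()) / n
        worst = max(worst, err)
    return worst


# Toy data: the context is a single binary feature, labels are binary outcomes.
contexts = [0, 0, 1, 1]
labels = [0, 1, 1, 1]

# Group weights: everyone, the x == 0 group, and the x == 1 group.
groups = [
    lambda x: 1.0,
    lambda x: 1.0 if x == 0 else 0.0,
    lambda x: 1.0 if x == 1 else 0.0,
]

# A constant predictor is calibrated overall but biased within each group.
err_constant = multicalibration_error(contexts, [0.75] * 4, labels, groups)

# Predicting each group's own mean removes the per-group bias.
err_groupwise = multicalibration_error(contexts, [0.5, 0.5, 1.0, 1.0], labels, groups)

print(err_constant)   # 0.125
print(err_groupwise)  # 0.0
```

The constant predictor 0.75 matches the overall label mean, so it looks calibrated when only the "everyone" weight is checked, yet it over-predicts on the first group and under-predicts on the second. That gap is exactly what the reweighting by each g in G is meant to expose.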
Original Abstract
A model is multicalibrated on a collection of group weights G if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each g in G. It is a useful property for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal sample complexity rate for multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and implicitly in several prior works. We resolve this open problem by giving a minimax-optimal multicalibration algorithm that outputs a...
--- *Automatically collected on 2026-06-23*
#paper #arXiv #ML #小凯