## Paper Overview
**Research area**: ML
**Authors**: Sijie Li, Shanda Li, Haowei Lin
**Published**: 2025-04-28
**arXiv**: [2504.19774](https://arxiv.org/abs/2504.19774)
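## Summary (translated from Chinese)
Scaling laws are used to plan multi-million-dollar model training runs, but fitting these laws can itself be very expensive. In modern large-scale workflows, assembling a sufficiently informative set of pilot experiments has become a major budget-allocation problem rather than a simple preprocessing step. The paper recasts scaling-law fitting as a budget-aware sequential experimental design problem: given a set of candidate experiments with differing costs, choose which ones to run so as to maximize inference accuracy in the high-cost target region. The authors propose an uncertainty-based sequential budget-allocation method that concentrates resources on the experiments most valuable for inference in the target region. Across benchmarks of varied scaling-law tasks, the method consistently outperforms classical design-based baselines and typically matches the performance of fitting on the full experiment set using only about 10% of the total training budget.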
## Original Abstract
Scaling laws are used to plan multi-million-dollar training runs, but fitting those laws can itself cost millions. In modern large-scale workflows, assembling a sufficiently informative set of pilot experiments is already a major budget-allocation problem rather than a routine preprocessing step. We formulate scaling-law fitting as budget-aware sequential experimental design: given a finite pool of runnable experiments with heterogeneous costs, choose which runs to execute so as to maximize extrapolation accuracy in a high-cost target region. We then propose an uncertainty-aware method for sequentially allocating experimental budget toward the runs most useful for target-region extrapolation. Across a diverse benchmark of scaling-law tasks, our method consistently outperforms classical des...
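The budget-aware selection problem described in the abstract can be illustrated with a small sketch. This is not the authors' method, whose details are not given here. It assumes a log-log power law `log(loss) = a + b * log(n)` with Gaussian noise, so the uncertainty at the target scale depends only on which runs are chosen, and it greedily picks the affordable run with the largest reduction in target-point variance per unit cost. The names `features`, `target_variance`, `greedy_select`, and `prior_precision`, along with the example sizes, costs, and budget, are all hypothetical.

```python
import numpy as np

def features(n):
    """Design row for the assumed log-log power law: log(loss) = a + b * log(n)."""
    return np.array([1.0, np.log(n)])

def target_variance(precision, x_target):
    """Posterior predictive variance at the target, in units of the noise variance."""
    return float(x_target @ np.linalg.solve(precision, x_target))

def greedy_select(sizes, costs, target_size, budget, prior_precision=1e-3):
    """Greedily pick runs by target-variance reduction per unit cost (illustrative only)."""
    x_target = features(target_size)
    precision = prior_precision * np.eye(2)  # weak Gaussian prior on (a, b)
    remaining = set(range(len(sizes)))
    chosen, spent = [], 0.0
    while True:
        current = target_variance(precision, x_target)
        best, best_gain = None, 0.0
        for i in sorted(remaining):
            if spent + costs[i] > budget:
                continue  # this run no longer fits in the budget
            x = features(sizes[i])
            updated = target_variance(precision + np.outer(x, x), x_target)
            gain = (current - updated) / costs[i]
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            return chosen, spent  # nothing affordable still helps
        x = features(sizes[best])
        precision += np.outer(x, x)
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)

# Hypothetical pool: model sizes with cost proportional to size.
sizes = np.array([1e6, 3e6, 1e7, 3e7, 1e8, 3e8])
costs = sizes / 1e6
chosen, spent = greedy_select(sizes, costs, target_size=1e10, budget=50.0)
print("chosen run indices:", chosen, "budget spent:", spent)
```

Because the assumed model is linear with Gaussian noise, the selection never looks at observed losses. A genuinely sequential method would refit after each run and update its uncertainty estimate from the new observation before choosing the next one.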
---
*Automatically collected on 2026-04-28*
#Paper #arXiv #ML #小凯