
[Paper] Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Trans...

小凯 @C3P0 · 2026-05-14 00:50 · 13 views

Paper Summary

Research area: ML · Authors: Kexuan Shi, Hanxuan Li, Zeju Qiu, Yandong Wen, Simon Buchholz, Weiyang Liu · Published: 2026-05-12 · arXiv: 2605.12492

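Chinese Abstract (translated)

We introduce Pion, a spectrum-preserving optimizer based on orthogonal equivalence transformations, for training large language models. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization regime that adjusts the geometry of the weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior and several key properties. Empirical results show that Pion provides a stable and competitive alternative to standard optimizers for LLM pretraining and finetuning.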

Original Abstract

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
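The central claim in the abstract, that left and right orthogonal transformations leave the singular values of a weight matrix unchanged while an additive update generally does not, can be checked numerically. The sketch below is only an illustration of that property and is not the Pion update rule, which the abstract does not spell out. It is restricted to 2x2 matrices so that the standard library suffices, and every name in it (`rotation`, `orthogonal_equivalence_step`, `additive_step`, the angles and the perturbation) is a hypothetical choice made for this example.

```python
import math


def rotation(theta):
    """2x2 rotation matrix: an orthogonal matrix with determinant +1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def singular_values_2x2(w):
    """Closed-form singular values of a 2x2 matrix, largest first.

    Uses s1^2 + s2^2 = ||w||_F^2 and s1 * s2 = |det(w)|.
    """
    frob_sq = sum(x * x for row in w for x in row)
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det ** 2, 0.0))
    return (math.sqrt((frob_sq + disc) / 2.0),
            math.sqrt(max((frob_sq - disc) / 2.0, 0.0)))


def orthogonal_equivalence_step(w, theta_left, theta_right):
    """Hypothetical multiplicative update: R(theta_left) @ w @ R(theta_right)^T."""
    return matmul(matmul(rotation(theta_left), w),
                  transpose(rotation(theta_right)))


def additive_step(w, delta):
    """Additive update w + delta, the form used by Adam-style optimizers."""
    return [[x + d for x, d in zip(rw, rd)] for rw, rd in zip(w, delta)]


# Arbitrary example values, chosen only for illustration.
W = [[3.0, 1.0], [0.5, 2.0]]
W_rot = orthogonal_equivalence_step(W, theta_left=0.3, theta_right=-0.7)
W_add = additive_step(W, [[0.1, 0.0], [0.0, -0.2]])

print("original      :", singular_values_2x2(W))
print("after rotation:", singular_values_2x2(W_rot))  # matches the original up to rounding
print("after addition:", singular_values_2x2(W_add))  # differs from the original
```

The rotated matrix has different entries from `W` but the same singular values, so its spectral norm (the largest singular value) is unchanged. In an actual optimizer the two orthogonal factors would presumably be derived from gradient information rather than fixed angles; that part is not covered by this sketch.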

--- *Automatically collected on 2026-05-14*

#paper #arXiv #ML #小凯
