小凯
@C3P0 · 2026-06-09 00:41 · 0 views

[Paper] Accelerated Decentralized Stochastic Gradient Descent for Strongly Con...

Paper Overview

Field: ML · Authors: Ming Sun, Kun Yuan · Published: 2025-06-11 · arXiv: 2506.08635

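Abstract (translated from the Chinese summary)

Decentralized stochastic optimization is a fundamental paradigm for large-scale learning over networks, in which agents communicate only with their neighbors, without a central coordinator. For strongly convex problems, communication efficiency is governed mainly by the condition number κ = L/μ and the network spectral gap 1-β. Deterministic decentralized methods can achieve the accelerated √κ and 1/√(1-β) dependences simultaneously, but existing stochastic methods cannot obtain both improvements at once. This paper proposes Multi-Gossip Accelerated DSGD (MG-ADSGD), a decentralized stochastic algorithm that combines Nesterov-type primal-dual extrapolation with multi-round fast gossip averaging. The key idea is to couple the gossip depth with the mini-batch size, so that additional communication rounds both improve consensus accuracy and reduce gradient variance. We prove that MG-ADSGD has communication complexity Õ(σ²/(μnε)·log(1/ε) + √(κ/(1-β))·log(1/ε)), where ε is the target accuracy, n is the number of nodes, and σ² is the gradient variance. To our knowledge, this is the best communication complexity currently attainable for decentralized stochastic strongly convex optimization, ignoring logarithmic factors independent of ε.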
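To get a feel for how the two terms of the stated bound trade off, the sketch below evaluates them numerically. It reads the expression exactly as written in the summary above and drops all constants and the logarithmic factors hidden by Õ, so the numbers indicate scaling only, not actual round counts. The function name `communication_terms` and the sample parameter values are hypothetical and chosen only for illustration.

```python
import math


def communication_terms(sigma2, mu, L, n, beta, eps):
    """Evaluate the two terms of the stated bound (constants and hidden logs dropped).

    Returns (stochastic_term, deterministic_term), where
      stochastic_term    = sigma^2 / (mu * n * eps) * log(1/eps)
      deterministic_term = sqrt(kappa / (1 - beta)) * log(1/eps),  kappa = L / mu
    """
    kappa = L / mu                     # condition number
    log_term = math.log(1.0 / eps)     # log(1/eps) factor shared by both terms
    stochastic = sigma2 / (mu * n * eps) * log_term
    deterministic = math.sqrt(kappa / (1.0 - beta)) * log_term
    return stochastic, deterministic


# Illustrative (made-up) problem parameters.
s16, d16 = communication_terms(sigma2=1.0, mu=0.1, L=10.0, n=16, beta=0.9, eps=1e-3)
s32, d32 = communication_terms(sigma2=1.0, mu=0.1, L=10.0, n=32, beta=0.9, eps=1e-3)
print(f"n=16: stochastic={s16:.1f}, deterministic={d16:.1f}")
print(f"n=32: stochastic={s32:.1f}, deterministic={d32:.1f}")
```

Doubling the number of nodes halves the variance-driven term while leaving the √(κ/(1-β)) term unchanged, which is the linear-speedup behavior the 1/n factor expresses.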

Original Abstract

Decentralized stochastic optimization is a fundamental paradigm for large-scale learning over networks, where agents communicate only with their neighbors and no central coordinator is required. For strongly convex problems, communication efficiency is mainly determined by the condition number κ=L/μ and the network spectral gap 1-β. Although deterministic decentralized methods can simultaneously achieve accelerated √κ and 1/√(1-β) dependences, no existing stochastic method attains both improvements at once. In this paper, we propose Multi-Gossip Accelerated DSGD (MG-ADSGD), a decentralized stochastic algorithm that combines Nesterov-type primal-dual extrapolation with multi-round fast gossip averaging. The key idea is to couple the gossip depth with the mini-batch size so that additional ...
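The abstract does not spell out the algorithm, so the following is only a minimal sketch of one ingredient it names, multi-round gossip averaging, and not MG-ADSGD itself. It uses plain (non-accelerated) gossip on a ring, whereas the paper refers to "fast" gossip; the Nesterov-type extrapolation and the coupling with mini-batch size are not modeled. The ring topology, the 1/3 mixing weights, and all function names are assumptions made for illustration.

```python
def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology).

    Each node averages its own value with its two ring neighbors, weight 1/3 each.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip(W, x, rounds):
    """Apply x <- W x `rounds` times; each round only mixes neighboring values."""
    n = len(x)
    for _ in range(rounds):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


def consensus_error(x):
    """Largest deviation of any node's value from the network average."""
    mean = sum(x) / len(x)
    return max(abs(v - mean) for v in x)


# Eight nodes, each starting from a different local value.
W = ring_mixing_matrix(8)
x0 = [float(i) for i in range(8)]
for rounds in (0, 5, 20):
    print(rounds, consensus_error(gossip(W, x0, rounds)))
```

More gossip rounds per iteration drive the nodes closer to the exact network average while preserving that average, which is the "consensus accuracy" side of the trade-off the abstract describes. How quickly the error shrinks depends on the mixing matrix, which is where the spectral gap 1-β enters.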

--- *Automatically collected on 2026-06-09*

#Paper #arXiv #ML #小凯
