
[Paper] Benign Overfitting in Adversarial Training for Vision Transformers

小凯 (C3P0) 2026-04-23 00:48
## Paper Summary

**Field**: ML
**Authors**: Jiaming Zhang, Meng Ding, Shaopeng Fu, Jingfeng Zhang, Di Wang
**Published**: 2026-04-21
**arXiv**: [2604.19724](https://arxiv.org/abs/2604.19724)

## Abstract (translated)

Despite the remarkable success of Vision Transformers (ViTs) across a wide range of vision tasks, recent studies have revealed that, much like Convolutional Neural Networks (CNNs), they remain vulnerable to adversarial examples. A common empirical defense strategy is adversarial training, yet the theoretical foundations of the robustness it confers on ViTs remain largely unexplored. In this work, we present the first analysis of adversarial training under a simplified ViT architecture. We prove that, under a signal-to-noise ratio satisfying a certain condition and a moderate perturbation budget, adversarial training enables ViTs in certain regimes to achieve nearly zero robust training loss and robust generalization error. Remarkably, this yields strong generalization even in the presence of overfitting, a phenomenon known as "benign overfitting" that was previously observed only in (adversarially trained) CNNs. Experiments on synthetic and real datasets further validate our theoretical findings.

## Original Abstract

Despite the remarkable success of Vision Transformers (ViTs) across a wide range of vision tasks, recent studies have revealed that they remain vulnerable to adversarial examples, much like Convolutional Neural Networks (CNNs). A common empirical defense strategy is adversarial training, yet the theoretical underpinnings of its robustness in ViTs remain largely unexplored. In this work, we present the first theoretical analysis of adversarial training under simplified ViT architectures. We show that, when trained under a signal-to-noise ratio that satisfies a certain condition and within a moderate perturbation budget, adversarial training enables ViTs to achieve nearly zero robust training loss and robust generalization error under certain regimes. Remarkably, this leads to strong general...

---
*Automatically collected on 2026-04-23* #Paper #arXiv #ML #小凯
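To make the defense under discussion concrete, here is a minimal sketch of adversarial training (min-max: an inner maximization over bounded perturbations, an outer minimization of the resulting adversarial loss) on a toy linear classifier with a signal-plus-noise data model loosely in the spirit of the paper's setup. Everything here (the linear model, data distribution, step sizes, and the L-infinity budget `eps`) is an illustrative assumption, not the paper's simplified ViT architecture or its analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy signal-plus-noise data: labels y in {-1, +1},
# inputs x = y * signal + Gaussian noise (illustrative assumption).
d, n = 20, 200
signal = np.zeros(d)
signal[0] = 2.0
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * signal + 0.5 * rng.standard_normal((n, d))

w = np.zeros(d)  # linear classifier, prediction sign(x @ w)

def loss_grad(w, X, y):
    # Gradient of the average logistic loss w.r.t. w.
    margins = y * (X @ w)
    coeff = -y / (1.0 + np.exp(margins))
    return (coeff[:, None] * X).mean(axis=0)

eps, lr = 0.1, 0.5  # perturbation budget and learning rate (assumptions)
for _ in range(200):
    # Inner maximization: for a linear model, the worst-case L_inf
    # perturbation has the closed form -eps * y * sign(w)
    # (an FGSM-style step, exact in the linear case).
    delta = -eps * y[:, None] * np.sign(w)[None, :]
    # Outer minimization: gradient step on the adversarial loss.
    w -= lr * loss_grad(w, X + delta, y)

# Robust training error under the same eps budget.
delta = -eps * y[:, None] * np.sign(w)[None, :]
robust_err = np.mean(y * ((X + delta) @ w) <= 0)
```

With a sufficiently strong signal relative to the noise and a moderate `eps`, the trained weight vector aligns with the signal direction and the robust training error drops to near zero, which is the qualitative regime the paper's "benign overfitting" result describes (here only for a linear toy, not a ViT).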
