Analyzing the Role of Permutation Invariance in Linear Mode Connectivity

Summary

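Entezari et al. (2021) observed empirically that, once the permutation invariance of neural networks is taken into account, there is likely no loss barrier along the linear interpolation between two SGD solutions, a phenomenon known as linear mode connectivity (LMC) modulo permutation.
This phenomenon has attracted significant attention, both for its theoretical interest and for its practical relevance to applications such as model merging.
This paper provides a fine-grained analysis of the phenomenon for two-layer ReLU networks under a teacher-student setup.
It shows that, as the student network width $m$ increases, the LMC loss barrier modulo permutation exhibits double descent behavior.
In particular, when $m$ is sufficiently large, the barrier decreases to zero at a rate of $O(m^{-1/2})$.
Notably, this rate does not suffer from the curse of dimensionality, and it shows how much permutation can reduce the LMC loss barrier.
The authors also observe a sharp transition in the sparsity of GD/SGD solutions as the learning rate increases, and they investigate how this sparsity preference affects the LMC loss barrier modulo permutation.
Experiments on both synthetic and MNIST datasets corroborate the theoretical predictions and reveal a similar trend for more complex network architectures.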

Abstract (original)

It was empirically observed in Entezari et al. (2021) that when accounting for the permutation invariance of neural networks, there is likely no loss barrier along the linear interpolation between two SGD solutions — a phenomenon known as linear mode connectivity (LMC) modulo permutation. This phenomenon has sparked significant attention due to both its theoretical interest and practical relevance in applications such as model merging. In this paper, we provide a fine-grained analysis of this phenomenon for two-layer ReLU networks under a teacher-student setup. We show that as the student network width $m$ increases, the LMC loss barrier modulo permutation exhibits a double descent behavior. Particularly, when $m$ is sufficiently large, the barrier decreases to zero at a rate $O(m^{-1/2})$. Notably, this rate does not suffer from the curse of dimensionality and demonstrates how substantial permutation can reduce the LMC loss barrier. Moreover, we observe a sharp transition in the sparsity of GD/SGD solutions when increasing the learning rate and investigate how this sparsity preference affects the LMC loss barrier modulo permutation. Experiments on both synthetic and MNIST datasets corroborate our theoretical predictions and reveal a similar trend for more complex network architectures.
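The quantity discussed in the abstract can be illustrated with a small sketch. It assumes a bias-free two-layer ReLU network $f(x) = \sum_j a_j \, \mathrm{ReLU}(w_j^\top x)$ with squared loss, and it uses a commonly used definition of the barrier: the largest gap between the loss at an interpolated point and the same interpolation of the two endpoint losses. All function names (`forward`, `loss`, `barrier`, `barrier_modulo_permutation`) are hypothetical, and the brute-force search over permutations is a stand-in that is only feasible for very small widths. It is not the procedure used in the paper, which the abstract does not specify.

```python
import itertools
import numpy as np

def forward(a, W, X):
    """Two-layer ReLU network f(x) = sum_j a_j * relu(w_j . x) (assumed form, no biases)."""
    return np.maximum(X @ W.T, 0.0) @ a

def loss(a, W, X, y):
    """Mean squared error on (X, y)."""
    return float(np.mean((forward(a, W, X) - y) ** 2))

def barrier(theta1, theta2, X, y, num_alphas=21):
    """Largest gap between the loss on the linear path and the
    linear interpolation of the two endpoint losses (grid over alpha)."""
    (a1, W1), (a2, W2) = theta1, theta2
    l1, l2 = loss(a1, W1, X, y), loss(a2, W2, X, y)
    gaps = []
    for alpha in np.linspace(0.0, 1.0, num_alphas):
        a = alpha * a1 + (1 - alpha) * a2
        W = alpha * W1 + (1 - alpha) * W2
        gaps.append(loss(a, W, X, y) - (alpha * l1 + (1 - alpha) * l2))
    return max(gaps)

def barrier_modulo_permutation(theta1, theta2, X, y):
    """Minimum barrier over all reorderings of the hidden neurons of theta2.
    Brute force: m! candidates, so only usable for tiny widths m."""
    a2, W2 = theta2
    best = np.inf
    for perm in itertools.permutations(range(len(a2))):
        p = list(perm)
        # Permuting a_j and the rows w_j together leaves the function unchanged.
        best = min(best, barrier(theta1, (a2[p], W2[p]), X, y))
    return best
```

For realistic widths, the exhaustive search would have to be replaced by a matching heuristic, for example aligning neurons with a linear assignment solver such as `scipy.optimize.linear_sum_assignment`. That yields an upper bound on the minimum rather than the exact value.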

arXiv information

Authors	Keyao Zhan, Puheng Li, Lei Wu
Published	2025-03-12 16:22:51+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
