Remove Symmetries to Control Model Expressivity and Improve Optimization

Summary

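When the loss function has a symmetry, the model is likely to become trapped in a low-capacity state, sometimes called a "collapse".
Entrapment in these low-capacity states can be a major obstacle to training in many settings where deep learning is applied.
We first prove two concrete mechanisms by which symmetries reduce capacity and cause features to be ignored during training and inference.
We then propose syre, a simple and theoretically justified algorithm that removes almost all symmetry-induced low-capacity states in neural networks.
Where this kind of entrapment is a particular concern, removing symmetries with the proposed method is shown to correlate well with improved optimization or performance.
A notable merit of the proposed method is that it is model-agnostic and requires no knowledge of the symmetry.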
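
The kind of entrapment described above can be illustrated with a toy case. The sketch below is not the paper's syre algorithm, and it is not taken from the paper. It assumes a hypothetical two-unit tanh network, synthetic data, and plain gradient descent, all chosen only for illustration. Because the loss is invariant under swapping the two hidden units, a start in which the units are exact copies keeps them identical at every step, so the network behaves like a single-unit model.

```python
import numpy as np

def forward(W, a, X):
    # Hidden activations, one column per hidden unit, then the scalar output.
    H = np.stack([np.tanh(X @ w) for w in W], axis=1)
    return H, H @ a

def loss(W, a, X, y):
    _, pred = forward(W, a, X)
    return 0.5 * np.mean((pred - y) ** 2)

def step(W, a, X, y, lr):
    # One gradient-descent step on the half mean squared error.
    H, pred = forward(W, a, X)
    err = pred - y
    new_W, new_a = np.empty_like(W), np.empty_like(a)
    for j in range(len(a)):
        grad_a = np.mean(err * H[:, j])
        grad_w = X.T @ (err * a[j] * (1.0 - H[:, j] ** 2)) / len(y)
        new_a[j] = a[j] - lr * grad_a
        new_W[j] = W[j] - lr * grad_w
    return new_W, new_a

def train(W, a, X, y, lr=0.1, steps=300):
    for _ in range(steps):
        W, a = step(W, a, X, y, lr)
    return W, a

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))            # synthetic inputs (assumption)
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]     # synthetic target (assumption)

# Symmetric start: both hidden units are exact copies of each other.
w0 = rng.normal(size=3)
W_sym, a_sym = train(np.stack([w0, w0]), np.array([0.3, 0.3]), X, y)

# Generic start: the two hidden units differ.
W_gen, a_gen = train(rng.normal(size=(2, 3)), np.array([0.3, -0.2]), X, y)

units_stay_identical = bool(
    np.allclose(W_sym[0], W_sym[1], atol=1e-12)
    and np.isclose(a_sym[0], a_sym[1], atol=1e-12)
)
print("symmetric start keeps units identical:", units_stay_identical)
print("loss (symmetric start):", loss(W_sym, a_sym, X, y))
print("loss (generic start):  ", loss(W_gen, a_gen, X, y))
```

In the symmetric run, both units receive the same gradient at every step, so gradient descent alone never separates them. This toy case only shows that symmetric subspaces are invariant under training; it says nothing about how syre removes such states.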

Summary (original)

When symmetry is present in the loss function, the model is likely to be trapped in a low-capacity state that is sometimes known as a ‘collapse’. Being trapped in these low-capacity states can be a major obstacle to training across many scenarios where deep learning technology is applied. We first prove two concrete mechanisms through which symmetries lead to reduced capacities and ignored features during training and inference. We then propose a simple and theoretically justified algorithm, syre, to remove almost all symmetry-induced low-capacity states in neural networks. When this type of entrapment is especially a concern, removing symmetries with the proposed method is shown to correlate well with improved optimization or performance. A remarkable merit of the proposed method is that it is model-agnostic and does not require any knowledge of the symmetry.

arXiv information

Authors	Liu Ziyin, Yizhou Xu, Isaac Chuang
Published	2025-02-27 15:30:47+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
