Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize

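Summary

Learning representations that generalize under distribution shifts is essential for building robust machine learning models.
Yet despite considerable effort in recent years, algorithmic progress in this direction has been limited.
In this work, we seek to understand the fundamental difficulty of out-of-distribution generalization with deep neural networks.
We first show empirically that, perhaps surprisingly, even letting a student network explicitly fit the representations of a teacher network that can generalize out-of-distribution is not enough for the student network to generalize.
Next, through a theoretical study of two-layer ReLU networks optimized by stochastic gradient descent (SGD) under a structured feature model, we identify a fundamental yet unexplored feature learning tendency of neural networks, feature contamination: neural networks can learn uncorrelated features together with predictive features, which leads to generalization failure under distribution shifts.
Notably, this mechanism differs in essence from the prevailing account in the literature, which attributes generalization failure to spurious correlations.
Overall, our results offer new insight into the non-linear feature learning dynamics of neural networks and highlight the need to consider inductive biases in out-of-distribution generalization.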

Abstract (original)

Learning representations that generalize under distribution shifts is critical for building robust machine learning models. However, despite significant efforts in recent years, algorithmic advances in this direction have been limited. In this work, we seek to understand the fundamental difficulty of out-of-distribution generalization with deep neural networks. We first empirically show that perhaps surprisingly, even allowing a neural network to explicitly fit the representations obtained from a teacher network that can generalize out-of-distribution is insufficient for the generalization of the student network. Then, by a theoretical study of two-layer ReLU networks optimized by stochastic gradient descent (SGD) under a structured feature model, we identify a fundamental yet unexplored feature learning proclivity of neural networks, feature contamination: neural networks can learn uncorrelated features together with predictive features, resulting in generalization failure under distribution shifts. Notably, this mechanism essentially differs from the prevailing narrative in the literature that attributes the generalization failure to spurious correlations. Overall, our results offer new insights into the non-linear feature learning dynamics of neural networks and highlight the necessity of considering inductive biases in out-of-distribution generalization.

arXiv information

Authors	Tianren Zhang, Chujie Zhao, Guanyu Chen, Yizhou Jiang, Feng Chen
Published	2024-06-06 09:45:59+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
