Geometry and Optimization of Shallow Polynomial Networks

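Summary

We study shallow neural networks with polynomial activations.
The function space of these models can be identified with a set of symmetric tensors of bounded rank.
We describe general features of these networks, focusing on the relationship between width and optimization.
We then consider teacher-student problems, which can be viewed as low-rank tensor approximation problems with respect to a non-standard inner product induced by the data distribution.
In this setting, we introduce a teacher-metric discriminant, which encodes the qualitative behavior of the optimization as a function of the training data distribution.
Finally, we focus on networks with quadratic activations and present an in-depth analysis of the optimization landscape.
In particular, we present a variation of the Eckart-Young theorem that characterizes all critical points and their Hessian signatures for teacher-student problems with quadratic networks and Gaussian training data.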

Summary (original)

We study shallow neural networks with polynomial activations. The function space for these models can be identified with a set of symmetric tensors with bounded rank. We describe general features of these networks, focusing on the relationship between width and optimization. We then consider teacher-student problems, which can be viewed as problems of low-rank tensor approximation with respect to a non-standard inner product that is induced by the data distribution. In this setting, we introduce a teacher-metric discriminant which encodes the qualitative behavior of the optimization as a function of the training data distribution. Finally, we focus on networks with quadratic activations, presenting an in-depth analysis of the optimization landscape. In particular, we present a variation of the Eckart-Young Theorem characterizing all critical points and their Hessian signatures for teacher-student problems with quadratic networks and Gaussian training data.
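
As a small illustration of the tensor identification in the quadratic case, where symmetric tensors of degree 2 are just symmetric matrices, the sketch below checks numerically that a width-r network equals a quadratic form whose matrix has rank at most r. It assumes a network of the form f(x) = sum_i a_i (w_i . x)^2; this parameterization, the function names, and the chosen dimensions are illustrative assumptions and are not taken from the paper.

```python
import random


def network(x, W, a):
    """Shallow network with quadratic activation: sum_i a[i] * (w_i . x)^2."""
    return sum(
        a_i * sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) ** 2
        for a_i, w_i in zip(a, W)
    )


def symmetric_matrix(W, a):
    """M = sum_i a[i] * w_i w_i^T, a symmetric matrix of rank at most len(W)."""
    n = len(W[0])
    return [
        [sum(a_i * w_i[j] * w_i[k] for a_i, w_i in zip(a, W)) for k in range(n)]
        for j in range(n)
    ]


def quadratic_form(M, x):
    """Evaluate x^T M x."""
    n = len(x)
    return sum(x[j] * M[j][k] * x[k] for j in range(n) for k in range(n))


random.seed(0)
n, width = 4, 2  # input dimension and hidden width (hypothetical values)
W = [[random.gauss(0, 1) for _ in range(n)] for _ in range(width)]  # hidden weights
a = [random.gauss(0, 1) for _ in range(width)]  # output weights
x = [random.gauss(0, 1) for _ in range(n)]  # a test input

M = symmetric_matrix(W, a)
# The network output and the quadratic form agree up to floating-point error.
gap = abs(network(x, W, a) - quadratic_form(M, x))
```

Because M is a sum of `width` rank-one terms, increasing the width enlarges the set of reachable matrices until it fills the whole space of symmetric matrices, which is one way to read the abstract's link between width and function space.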

arXiv information

Authors	Yossi Arjevani, Joan Bruna, Joe Kileel, Elzbieta Polak, Matthew Trager
Published	2025-01-10 16:11:27+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
