Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

要約

機械学習方法は一般に逆の問題を解決するために使用されます。ここでは、既知の取得手順を介して生成された少数の測定から未知の信号を推定する必要があります。
特に、ニューラルネットワークは経験的に十分に機能しますが、理論的保証は限られています。
この作業では、いくつかの可能なソリューションマッピングを認める未定で決定された線形逆問題を研究します。
ソリューションマッピングの一意性を確立する標準的な治療法（たとえば、圧縮センシングなど）は、ソース信号の潜在的な低次元構造の知識を引き受けることです。
次の質問をします。深いニューラルネットワークは、体重減衰の正則化を伴う勾配降下によって訓練されたときに、この低次元構造に適応しますか？
この方法で訓練された軽度のオーバーパラメーター化された深い線形ネットワークは、潜在的なサブスペース構造を暗黙的にエンコードしながら、逆問題を正確に解決する近似ソリューションに収束することを証明します。
私たちの知る限り、これは、体重減衰で訓練された深い線形ネットワークが、実際の段階的および重量初期化スキームの下でデータの潜在サブスペース構造に自動的に適応することを厳密に示した最初の結果です。
私たちの仕事は、正則化とオーバーパラメーター化が一般化を改善し、オーバーパラメーター化がトレーニング中の収束も加速することを強調しています。

要約(オリジナル)

Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few measurements generated via a known acquisition procedure. In particular, neural networks perform well empirically but have limited theoretical guarantees. In this work, we study an underdetermined linear inverse problem that admits several possible solution mappings. A standard remedy (e.g., in compressed sensing) establishing uniqueness of the solution mapping is to assume knowledge of latent low-dimensional structure in the source signal. We ask the following question: do deep neural networks adapt to this low-dimensional structure when trained by gradient descent with weight decay regularization? We prove that mildly overparameterized deep linear networks trained in this manner converge to an approximate solution that accurately solves the inverse problem while implicitly encoding latent subspace structure. To our knowledge, this is the first result to rigorously show that deep linear networks trained with weight decay automatically adapt to latent subspace structure in the data under practical stepsize and weight initialization schemes. Our work highlights that regularization and overparameterization improve generalization, while overparameterization also accelerates convergence during training.

arxiv情報

著者	Hannah Laus,Suzanna Parkinson,Vasileios Charisopoulos,Felix Krahmer,Rebecca Willett
発行日	2025-05-28 17:25:56+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

最近の投稿

最近のコメント

アーカイブ

カテゴリー