Multi-layer random features and the approximation power of neural networks

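Summary

A neural architecture with randomly initialized weights is equivalent, in the infinite-width limit, to a Gaussian random field whose covariance function is the so-called neural network Gaussian process (NNGP) kernel.
We prove that the reproducing kernel Hilbert space (RKHS) defined by the NNGP contains only functions that the architecture can approximate.
The number of neurons required in each layer to achieve a given approximation error is determined by the RKHS norm of the target function.
Moreover, the approximation can be built from a supervised dataset by taking a random multi-layer representation of the input vector and training only the weights of the last layer.
For a 2-layer NN and a domain equal to the $(n-1)$-dimensional sphere in ${\mathbb R}^n$, we compare the number of neurons required by Barron's theorem with the number required by the multi-layer features construction.
We show that if the eigenvalues of the integral operator of the NNGP decay more slowly than $k^{-n-\frac{2}{3}}$, where $k$ is the order of an eigenvalue, then our theorem guarantees a more succinct neural network approximation than Barron's theorem.
We also run computational experiments to verify our theoretical findings.
Our experiments show that realistic neural networks easily learn target functions even when neither theorem gives any guarantee.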
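
The infinite-width statement above can be checked numerically in the simplest case. The sketch below is an illustration under assumptions that are not taken from the paper: a single hidden layer, ReLU activation, and standard normal weights. For that setting the limiting covariance has a known closed form (the first-order arc-cosine kernel), and the empirical average over a wide random layer should approach it as the width grows. The function names `relu_nngp_kernel` and `empirical_kernel` are hypothetical.

```python
import numpy as np

def relu_nngp_kernel(x, y):
    """Closed form of E_w[relu(w.x) * relu(w.y)] for w ~ N(0, I).

    This is the first-order arc-cosine kernel; using ReLU here is an
    assumption made for illustration only.
    """
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_t = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return nx * ny * (np.sin(theta) + (np.pi - theta) * cos_t) / (2.0 * np.pi)

def empirical_kernel(x, y, width, rng):
    """Average of relu(w.x) * relu(w.y) over `width` random neurons."""
    W = rng.standard_normal((width, x.shape[0]))
    return np.mean(np.maximum(W @ x, 0.0) * np.maximum(W @ y, 0.0))

rng = np.random.default_rng(0)
n = 5  # input dimension (arbitrary choice)

# Two points on the unit sphere in R^n.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

exact = relu_nngp_kernel(x, y)
errors = {
    width: abs(empirical_kernel(x, y, width, rng) - exact)
    for width in (100, 10_000, 200_000)
}
for width, err in errors.items():
    print(f"width={width:>7d}  |empirical - exact| = {err:.5f}")
```

The gap between the empirical and exact values is expected to shrink roughly like one over the square root of the width, which is the usual Monte Carlo rate.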

Abstract (original)

A neural architecture with randomly initialized weights, in the infinite width limit, is equivalent to a Gaussian Random Field whose covariance function is the so-called Neural Network Gaussian Process kernel (NNGP). We prove that a reproducing kernel Hilbert space (RKHS) defined by the NNGP contains only functions that can be approximated by the architecture. To achieve a certain approximation error the required number of neurons in each layer is defined by the RKHS norm of the target function. Moreover, the approximation can be constructed from a supervised dataset by a random multi-layer representation of an input vector, together with training of the last layer’s weights. For a 2-layer NN and a domain equal to an $n-1$-dimensional sphere in ${\mathbb R}^n$, we compare the number of neurons required by Barron’s theorem and by the multi-layer features construction. We show that if eigenvalues of the integral operator of the NNGP decay slower than $k^{-n-\frac{2}{3}}$ where $k$ is an order of an eigenvalue, then our theorem guarantees a more succinct neural network approximation than Barron’s theorem. We also make some computational experiments to verify our theoretical findings. Our experiments show that realistic neural networks easily learn target functions even when both theorems do not give any guarantees.
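
The construction mentioned in the abstract (a random multi-layer representation of the input, followed by training of the last layer's weights) can be sketched as follows. This is a minimal illustration, not the paper's experimental setup: the ReLU activation, the He-style weight scaling, the layer widths, the ridge penalty, and the target function are all assumptions chosen for the example, and the helper names are hypothetical.

```python
import numpy as np

def sample_layers(n_in, widths, rng):
    """Draw frozen random weight matrices, one per hidden layer."""
    layers, fan_in = [], n_in
    for width in widths:
        # He-style scaling keeps activations at a comparable scale (assumption).
        layers.append(rng.standard_normal((fan_in, width)) * np.sqrt(2.0 / fan_in))
        fan_in = width
    return layers

def features(X, layers):
    """Multi-layer random representation: the hidden weights are never trained."""
    H = X
    for W in layers:
        H = np.maximum(H @ W, 0.0)  # ReLU activation (assumption)
    return H

def fit_last_layer(H, y, ridge=1e-3):
    """Train only the output weights by ridge regression on the features."""
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

def sample_sphere(m, n, rng):
    """m points drawn uniformly from the unit sphere in R^n."""
    X = rng.standard_normal((m, n))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def target(X):
    """Hypothetical target function, used only for this demonstration."""
    return np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2]

rng = np.random.default_rng(0)
n = 3
X_train, X_test = sample_sphere(2000, n, rng), sample_sphere(500, n, rng)

layers = sample_layers(n, widths=(256, 256), rng=rng)
H_train, H_test = features(X_train, layers), features(X_test, layers)

w = fit_last_layer(H_train, target(X_train))
test_mse = np.mean((H_test @ w - target(X_test)) ** 2)
# Baseline: always predict the mean of the training targets.
baseline_mse = np.mean((target(X_test) - target(X_train).mean()) ** 2)
print(f"test MSE = {test_mse:.5f}, constant-predictor MSE = {baseline_mse:.5f}")
```

Because only the last layer is fitted, training reduces to a linear least-squares problem, so no gradient descent is needed in this sketch.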

arXiv information

Author	Rustem Takhanov
Published	2024-04-26 14:57:56+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
