Convex SGD: Generalization Without Early Stopping

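Summary

We consider the generalization error associated with stochastic gradient descent on a smooth convex function over a compact set.
We show the first bound on the generalization error that vanishes as the number of iterations $T$ and the dataset size $n$ go to infinity at arbitrary rates.
Our bound scales as $\tilde{O}(1/\sqrt{T} + 1/\sqrt{n})$ with step-size $\alpha_t = 1/\sqrt{t}$.
In particular, strong convexity is not needed for stochastic gradient descent to generalize well.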
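The setting in the abstract can be illustrated with a minimal sketch of projected SGD using the step-size $\alpha_t = 1/\sqrt{t}$. Everything beyond that step-size is an assumption made for illustration and is not taken from the paper: the least-squares loss, the Euclidean ball as the compact set, the synthetic data, returning the last iterate, and all function names are hypothetical.

```python
import math
import random


def project_to_ball(w, radius):
    """Euclidean projection onto the ball {w : ||w|| <= radius} (a compact set)."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= radius:
        return list(w)
    return [x * radius / norm for x in w]


def mean_squared_loss(w, data):
    """Average of 0.5 * (<w, x> - y)^2 over the data: smooth and convex in w."""
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += 0.5 * (pred - y) ** 2
    return total / len(data)


def projected_sgd(data, dim, num_iters, radius, rng):
    """Run T = num_iters steps of projected SGD with alpha_t = 1 / sqrt(t).

    Returns the last iterate (an assumption of this sketch).
    """
    w = [0.0] * dim
    for t in range(1, num_iters + 1):
        x, y = rng.choice(data)  # sample one training point uniformly
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        alpha = 1.0 / math.sqrt(t)
        # Gradient of 0.5 * (<w, x> - y)^2 with respect to w is residual * x.
        w = [wi - alpha * residual * xi for wi, xi in zip(w, x)]
        w = project_to_ball(w, radius)
    return w


def make_data(n, dim, rng):
    """Hypothetical synthetic data: y = <w_star, x> + Gaussian noise."""
    w_star = [1.0 / math.sqrt(dim)] * dim
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y = sum(a * b for a, b in zip(w_star, x)) + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data


rng = random.Random(0)
train = make_data(200, 5, rng)    # n = 200 training samples
test = make_data(2000, 5, rng)    # fresh samples as a proxy for the population
w = projected_sgd(train, dim=5, num_iters=2000, radius=2.0, rng=rng)

# Empirical stand-in for the generalization error: test loss minus training loss.
gap = mean_squared_loss(w, test) - mean_squared_loss(w, train)
print(f"train/test gap: {gap:.4f}")
```

Rerunning with larger `num_iters` and larger training sets is one informal way to watch how the gap behaves as $T$ and $n$ grow. A single run like this is only an illustration and does not verify the $\tilde{O}(1/\sqrt{T} + 1/\sqrt{n})$ rate.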

Summary (original)

We consider the generalization error associated with stochastic gradient descent on a smooth convex function over a compact set. We show the first bound on the generalization error that vanishes when the number of iterations $T$ and the dataset size $n$ go to infinity at arbitrary rates; our bound scales as $\tilde{O}(1/\sqrt{T} + 1/\sqrt{n})$ with step-size $\alpha_t = 1/\sqrt{t}$. In particular, strong convexity is not needed for stochastic gradient descent to generalize well.

arXiv information

Authors	Julien Hendrickx, Alex Olshevsky
Published	2024-01-08 18:10:25+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
