We Don’t Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond

Abstract

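In the rapidly advancing field of deep learning, optimising deep neural networks is of paramount importance.
This paper introduces Enhanced Velocity Estimation (EVE), a new method that applies different learning rates to distinct components of the gradient.
By bifurcating the learning rate, EVE enables finer-grained control and faster convergence, addressing the challenges of conventional single-learning-rate approaches.
The method uses a momentum term that adapts to the learning landscape to navigate complex loss surfaces more efficiently, which improves performance and stability.
Extensive experiments demonstrate that EVE significantly outperforms existing optimisation methods across a range of benchmark datasets and architectures.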

Abstract (original)

In the rapidly advancing field of deep learning, optimising deep neural networks is paramount. This paper introduces a novel method, Enhanced Velocity Estimation (EVE), which innovatively applies different learning rates to distinct components of the gradients. By bifurcating the learning rate, EVE enables more nuanced control and faster convergence, addressing the challenges associated with traditional single learning rate approaches. Utilising a momentum term that adapts to the learning landscape, the method achieves a more efficient navigation of the complex loss surface, resulting in enhanced performance and stability. Extensive experiments demonstrate that EVE significantly outperforms existing optimisation techniques across various benchmark datasets and architectures.

arXiv information

Author	Afshin Khadangi
Published	2023-08-21 14:08:42+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
