Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence

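Summary

Distributed optimization is the standard way to speed up machine learning training, and most research in the area focuses on distributed first-order, gradient-based methods.
However, there are settings in which some computationally bounded nodes cannot implement first-order, gradient-based optimization, even though they could still contribute to a joint optimization task.
This paper initiates the study of hybrid decentralized optimization: it considers settings in which nodes with zeroth-order and first-order optimization capabilities coexist in a distributed system and attempt to jointly solve an optimization task over some data distribution.
The authors essentially show that, under reasonable parameter settings, such a system can not only withstand noisier zeroth-order agents but can even benefit from integrating those agents into the optimization process rather than ignoring their information.
At the core of the approach is a new analysis of distributed optimization with noisy and possibly biased gradient estimators, which may be of independent interest.
The results hold for both convex and non-convex objectives.
Experimental results on standard optimization tasks confirm the analysis and show that hybrid first- and zeroth-order optimization can be practical, even when training deep neural networks.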

Summary (original)

Distributed optimization is the standard way of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods. Yet, there are settings where some computationally-bounded nodes may not be able to implement first-order, gradient-based optimization, while they could still contribute to joint optimization tasks. In this paper, we initiate the study of hybrid decentralized optimization, studying settings where nodes with zeroth-order and first-order optimization capabilities co-exist in a distributed system, and attempt to jointly solve an optimization task over some data distribution. We essentially show that, under reasonable parameter settings, such a system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their information. At the core of our approach is a new analysis of distributed optimization with noisy and possibly-biased gradient estimators, which may be of independent interest. Our results hold for both convex and non-convex objectives. Experimental results on standard optimization tasks confirm our analysis, showing that hybrid first-zeroth order optimization can be practical, even when training deep neural networks.
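
As a rough illustration of the setting the abstract describes, the sketch below mixes nodes that use a noisy first-order gradient with nodes that only query function values. It is not the paper's algorithm: the quadratic objective, the two-point Gaussian-smoothing estimator, the pairwise averaging step, and every name and parameter (`run_hybrid`, `mu`, `lr`, `noise`, and so on) are assumptions chosen to keep the example small and self-contained.

```python
import random

def loss(x, target):
    # Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2.
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))

def first_order_grad(x, target, rng, noise=0.1):
    # Exact gradient of the toy objective plus Gaussian noise,
    # standing in for a stochastic first-order oracle.
    return [(xi - ti) + rng.gauss(0.0, noise) for xi, ti in zip(x, target)]

def zeroth_order_grad(x, target, rng, mu=1e-3):
    # Two-point estimator: probe the loss along a random Gaussian
    # direction u and scale u by the finite-difference slope.
    # Only function values are used, never a gradient.
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    slope = (loss(x_plus, target) - loss(x_minus, target)) / (2.0 * mu)
    return [slope * ui for ui in u]

def run_hybrid(num_first=2, num_zeroth=2, dim=5, steps=2000, lr=0.05, seed=0):
    # Requires at least two nodes in total so that a pair can be sampled.
    rng = random.Random(seed)
    target = [1.0] * dim
    n = num_first + num_zeroth
    models = [[0.0] * dim for _ in range(n)]
    for _ in range(steps):
        # Each node takes one local step with whichever estimator it supports.
        for i in range(n):
            grad_fn = first_order_grad if i < num_first else zeroth_order_grad
            g = grad_fn(models[i], target, rng)
            models[i] = [xi - lr * gi for xi, gi in zip(models[i], g)]
        # Decentralized communication (assumed scheme): one random pair
        # of nodes averages its models; there is no central server.
        a, b = rng.sample(range(n), 2)
        avg = [(xa + xb) / 2.0 for xa, xb in zip(models[a], models[b])]
        models[a], models[b] = list(avg), list(avg)
    # Report the loss of the average model across all nodes.
    mean_model = [sum(m[j] for m in models) / n for j in range(dim)]
    return loss(mean_model, target)

if __name__ == "__main__":
    print("hybrid (2 first-order + 2 zeroth-order):", run_hybrid())
    print("first-order only (2 nodes):", run_hybrid(num_first=2, num_zeroth=0))
```

Running the script prints the final loss for a hybrid system and for a first-order-only system with the same number of first-order nodes. On this toy problem the comparison is only a sanity check; it says nothing about the convergence rates the paper analyzes.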

arXiv information

Authors	Matin Ansaripour, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh
Published	2024-09-04 17:45:51+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
