Scalable Generalized Bayesian Online Neural Network Training for Sequential Decision Making

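Summary

We introduce scalable algorithms for online learning and generalized Bayesian inference of neural network parameters, designed for sequential decision-making tasks.
Our methods combine the strengths of frequentist and Bayesian filtering: fast low-rank updates via a block-diagonal approximation of the parameter error covariance, and a well-defined posterior predictive distribution that is used for decision making.
More precisely, the main method updates a low-rank error covariance for the hidden-layer parameters and a full-rank error covariance for the final-layer parameters.
Although this characterizes an improper posterior, we show that the resulting posterior predictive distribution is well-defined.
Our methods update all network parameters online and require no replay buffers or offline retraining.
Empirically, our methods achieve a competitive tradeoff between speed and accuracy on (non-stationary) contextual bandit problems and Bayesian optimization problems.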

Summary (original)

We introduce scalable algorithms for online learning and generalized Bayesian inference of neural network parameters, designed for sequential decision making tasks. Our methods combine the strengths of frequentist and Bayesian filtering, which include fast low-rank updates via a block-diagonal approximation of the parameter error covariance, and a well-defined posterior predictive distribution that we use for decision making. More precisely, our main method updates a low-rank error covariance for the hidden layers parameters, and a full-rank error covariance for the final layer parameters. Although this characterizes an improper posterior, we show that the resulting posterior predictive distribution is well-defined. Our methods update all network parameters online, with no need for replay buffers or offline retraining. We show, empirically, that our methods achieve a competitive tradeoff between speed and accuracy on (non-stationary) contextual bandit problems and Bayesian optimization problems.
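The claim that an improper posterior can still give a well-defined posterior predictive can be illustrated with a small NumPy sketch. It is not the authors' algorithm: it assumes a tiny one-hidden-layer tanh network, a first-order linearization of the network around the current parameters, and a hidden-layer covariance stored as a hypothetical low-rank factor `W_lowrank` (so the covariance is `W_lowrank @ W_lowrank.T`, which is singular). All function and variable names are made up for illustration.

```python
import numpy as np

def forward_and_jacobians(x, W1, b1, w2, b2):
    """Scalar-output MLP f(x) = w2 . tanh(W1 x + b1) + b2 and its parameter Jacobians."""
    h = np.tanh(W1 @ x + b1)
    y = w2 @ h + b2
    dh = w2 * (1.0 - h**2)  # df / d(pre-activation)
    # Gradient w.r.t. hidden-layer parameters, ordered as (W1.ravel(), b1).
    J_hidden = np.concatenate([np.outer(dh, x).ravel(), dh])
    # Gradient w.r.t. final-layer parameters, ordered as (w2, b2).
    J_last = np.concatenate([h, [1.0]])
    return y, J_hidden, J_last

def predictive_moments(x, params, W_lowrank, Sigma_last, obs_var):
    """Linearized predictive mean and variance under a block-diagonal covariance.

    Assumption: hidden block is W_lowrank @ W_lowrank.T (low rank),
    final-layer block is the full-rank matrix Sigma_last.
    """
    y, J_h, J_l = forward_and_jacobians(x, *params)
    proj = W_lowrank.T @ J_h           # shape (rank,), avoids forming the dense block
    var_hidden = proj @ proj           # J_h (W W^T) J_h^T
    var_last = J_l @ Sigma_last @ J_l  # J_l Sigma_last J_l^T
    return y, var_hidden + var_last + obs_var

rng = np.random.default_rng(0)
d_in, d_hid, rank = 3, 4, 2
params = (rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid),
          rng.normal(size=d_hid), 0.1)
n_hidden = d_hid * d_in + d_hid
W_lowrank = 0.1 * rng.normal(size=(n_hidden, rank))
Sigma_last = 0.5 * np.eye(d_hid + 1)

mean, var = predictive_moments(rng.normal(size=d_in), params,
                               W_lowrank, Sigma_last, obs_var=0.01)
print(mean, var)
```

In this sketch the hidden-layer block has rank 2 in a 16-dimensional space, so the joint Gaussian over parameters has no density, yet the predictive variance is a finite positive scalar. A decision rule such as Thompson sampling or an upper confidence bound could then use `mean` and `var` directly.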

arXiv information

Authors	Gerardo Duran-Martin, Leandro Sánchez-Betancourt, Álvaro Cartea, Kevin Murphy
Published	2025-06-13 15:44:14+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
