Rethinking Approximate Gaussian Inference in Classification

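Summary

In classification tasks, the softmax function is ubiquitously used as the output activation to produce predictive probabilities.
Such outputs capture only aleatoric uncertainty.
To capture epistemic uncertainty, approximate Gaussian inference methods have been proposed that output Gaussian distributions over the logit space.
Predictives are then obtained as the expectations of these Gaussian distributions pushed forward through the softmax.
However, such softmax Gaussian integrals cannot be solved analytically, and Monte Carlo (MC) approximations can be costly and noisy.
We propose a simple change to the learning objective that allows exact computation of predictives and enjoys improved training dynamics, with no runtime or memory overhead.
This framework is compatible with a family of output activation functions that includes the softmax as well as the element-wise normCDF and sigmoid.
Moreover, it allows the Gaussian pushforwards to be approximated with Dirichlet distributions by analytic moment matching.
We evaluate the approach in combination with several approximate Gaussian inference methods (Laplace, HET, SNGP) on large- and small-scale datasets (ImageNet, CIFAR-10), demonstrating improved uncertainty quantification capabilities compared to softmax MC sampling.
Code is available at https://github.com/bmucsanyi/probit.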

Summary (original)

In classification tasks, softmax functions are ubiquitously used as output activations to produce predictive probabilities. Such outputs only capture aleatoric uncertainty. To capture epistemic uncertainty, approximate Gaussian inference methods have been proposed, which output Gaussian distributions over the logit space. Predictives are then obtained as the expectations of the Gaussian distributions pushed forward through the softmax. However, such softmax Gaussian integrals cannot be solved analytically, and Monte Carlo (MC) approximations can be costly and noisy. We propose a simple change in the learning objective which allows the exact computation of predictives and enjoys improved training dynamics, with no runtime or memory overhead. This framework is compatible with a family of output activation functions that includes the softmax, as well as element-wise normCDF and sigmoid. Moreover, it allows for approximating the Gaussian pushforwards with Dirichlet distributions by analytic moment matching. We evaluate our approach combined with several approximate Gaussian inference methods (Laplace, HET, SNGP) on large- and small-scale datasets (ImageNet, CIFAR-10), demonstrating improved uncertainty quantification capabilities compared to softmax MC sampling. Code is available at https://github.com/bmucsanyi/probit.
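
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a diagonal Gaussian over the logits (a simplification chosen here), and all function names are hypothetical. It estimates the softmax Gaussian integral by Monte Carlo sampling, and it evaluates the element-wise normCDF expectation in closed form using the standard identity E[Φ(z)] = Φ(μ / sqrt(1 + σ²)) for z ~ N(μ, σ²). How the paper turns such element-wise quantities into a normalized predictive is not stated in the abstract, so that step is left out.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def norm_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sample_logits(mean, var, rng):
    """Draw one logit vector from a diagonal Gaussian (assumption of this sketch)."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]


def mc_softmax_predictive(mean, var, num_samples=1000, seed=0):
    """Monte Carlo estimate of E[softmax(z)]; cost grows with num_samples."""
    rng = random.Random(seed)
    acc = [0.0] * len(mean)
    for _ in range(num_samples):
        probs = softmax(sample_logits(mean, var, rng))
        acc = [a + p for a, p in zip(acc, probs)]
    return [a / num_samples for a in acc]


def exact_normcdf_expectation(mean, var):
    """Closed form of E[Phi(z_k)] for z_k ~ N(mean_k, var_k); no sampling."""
    return [norm_cdf(m / math.sqrt(1.0 + v)) for m, v in zip(mean, var)]


def mc_normcdf_expectation(mean, var, num_samples=20000, seed=0):
    """Monte Carlo estimate of the same quantity, used only as a sanity check."""
    rng = random.Random(seed)
    acc = [0.0] * len(mean)
    for _ in range(num_samples):
        z = sample_logits(mean, var, rng)
        acc = [a + norm_cdf(x) for a, x in zip(acc, z)]
    return [a / num_samples for a in acc]


if __name__ == "__main__":
    mean = [2.0, 0.0, -1.0]  # illustrative logit means
    var = [0.5, 1.0, 2.0]    # illustrative logit variances
    print("MC softmax predictive:   ", mc_softmax_predictive(mean, var))
    print("Exact normCDF expectation:", exact_normcdf_expectation(mean, var))
    print("MC normCDF expectation:   ", mc_normcdf_expectation(mean, var))
```

The sampled softmax estimate changes with the seed and the number of samples, while the closed-form normCDF expectation is deterministic and needs a single pass over the means and variances.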

arXiv information

Authors	Bálint Mucsányi, Nathaël Da Costa, Philipp Hennig
Published	2025-02-05 17:03:49+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
