Revisiting Confidence Estimation: Towards Reliable Failure Prediction

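Summary

Reliable confidence estimation is a difficult but fundamental requirement in many risk-sensitive applications.
However, modern deep neural networks are often overconfident in their incorrect predictions, that is, misclassified samples from known classes and out-of-distribution (OOD) samples from unknown classes.
Many confidence calibration and OOD detection methods have been developed in recent years.
In this paper, we identify a general and widespread yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We investigate this problem and show that popular calibration and OOD detection methods often worsen the confidence separation between correctly classified and misclassified examples, which makes it difficult to decide whether to trust a prediction.
Finally, we propose enlarging the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings, including balanced, long-tailed, and covariate-shift classification scenarios.
Our study not only provides a strong baseline for reliable confidence estimation but also serves as a bridge between the understanding of calibration, OOD detection, and failure prediction.
The code is available at \url{https://github.com/Impression2805/FMFP}.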

Summary (original)

Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident for their incorrect predictions, i.e., misclassified samples from known classes, and out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been developed. In this paper, we find a general, widely existing but actually-neglected phenomenon that most confidence estimation methods are harmful for detecting misclassification errors. We investigate this problem and reveal that popular calibration and OOD detection methods often lead to worse confidence separation between correctly classified and misclassified examples, making it difficult to decide whether to trust a prediction or not. Finally, we propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings including balanced, long-tailed, and covariate-shift classification scenarios. Our study not only provides a strong baseline for reliable confidence estimation but also acts as a bridge between understanding calibration, OOD detection, and failure prediction. The code is available at \url{https://github.com/Impression2805/FMFP}.
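
To make the notion of "confidence separation" concrete, here is a minimal sketch of how failure prediction might be scored. It assumes the maximum softmax probability as the confidence score and AUROC as the metric (correct predictions treated as the positive class). Both choices are common in this area but are assumptions here, not details stated in the abstract. The function names and the toy logits are hypothetical and are not taken from the FMFP repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_confidence(logits):
    """Confidence score: the largest softmax probability (an assumed choice)."""
    return max(softmax(logits))


def failure_prediction_auroc(confidences, correct):
    """AUROC for separating correct from misclassified predictions.

    Computed as the probability that a randomly chosen correct prediction
    has higher confidence than a randomly chosen misclassified one
    (ties count as 0.5).
    """
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect prediction")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four samples, three classes.
logits = [
    [4.0, 0.5, 0.1],
    [0.2, 3.0, 0.1],
    [1.0, 1.2, 0.9],
    [2.5, 2.4, 0.0],
]
labels = [0, 1, 0, 1]

predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
correct = [p == y for p, y in zip(predictions, labels)]
confidences = [max_softmax_confidence(row) for row in logits]
auroc = failure_prediction_auroc(confidences, correct)
print(f"failure-prediction AUROC: {auroc:.3f}")
```

An AUROC near 1 means correct predictions receive consistently higher confidence than misclassified ones, while a value near 0.5 means the confidence score carries little information about which predictions to trust. The pairwise loop is quadratic and is meant only for illustration on small inputs.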

arXiv information

Authors	Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu
Published	2024-03-05 11:44:14+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
