Bag of Tricks for In-Distribution Calibration of Pretrained Transformers

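Abstract

Pre-trained language models (PLMs) have become the de facto standard for improving accuracy on text classification tasks, but recent studies show that PLMs often make over-confident predictions.
Various calibration methods, such as ensemble learning and data augmentation, have been proposed, but most have been validated on computer vision benchmarks rather than on PLM-based text classification tasks.
This paper presents an empirical study of confidence calibration for PLMs covering three categories of methods: confidence penalty losses, data augmentation, and ensemble methods.
We find that ensemble models overfitted to the training set show sub-par calibration performance, and that PLMs trained with a confidence penalty loss exhibit a trade-off between calibration and accuracy.
Building on these observations, we propose Calibrated PLM (CALL), a combination of calibration techniques.
CALL compensates for the drawbacks that can arise when a calibration method is used on its own and improves both classification accuracy and calibration.
We study the design choices in CALL's training procedure extensively and provide a detailed analysis of how calibration techniques affect the calibration performance of PLMs.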

Abstract (original)

While pre-trained language models (PLMs) have become a de-facto standard promoting the accuracy of text classification tasks, recent studies find that PLMs often predict over-confidently. Although various calibration methods have been proposed, such as ensemble learning and data augmentation, most of the methods have been verified in computer vision benchmarks rather than in PLM-based text classification tasks. In this paper, we present an empirical study on confidence calibration for PLMs, addressing three categories, including confidence penalty losses, data augmentations, and ensemble methods. We find that the ensemble model overfitted to the training set shows sub-par calibration performance and also observe that PLMs trained with confidence penalty loss have a trade-off between calibration and accuracy. Building on these observations, we propose the Calibrated PLM (CALL), a combination of calibration techniques. The CALL complements the drawbacks that may occur when utilizing a calibration method individually and boosts both classification and calibration accuracy. Design choices in CALL’s training procedures are extensively studied, and we provide a detailed analysis of how calibration techniques affect the calibration performance of PLMs.
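
To make "over-confident" and "calibration performance" concrete, the sketch below computes a binned expected calibration error (ECE), a commonly used calibration metric. The abstract does not say which metric the authors report, so the choice of ECE, the function name `expected_calibration_error`, the ten equal-width bins, and the toy data are all assumptions made for illustration. As a simplification, bins are half-open on the right, with confidence 1.0 placed in the last bin.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| over bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # bins[b] = [count, sum of confidences, number of correct predictions]
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; confidence 1.0 goes into the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += int(ok)

    ece = 0.0
    for count, conf_sum, n_correct in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / n) * abs(accuracy - avg_conf)
    return ece


# Toy data (made up for illustration): high confidence, only half correct.
confidences = [0.95, 0.95, 0.95, 0.95, 0.65, 0.65]
correct = [True, True, False, False, True, False]
print(expected_calibration_error(confidences, correct))
```

A model whose confidence matches its accuracy in every bin scores 0. The toy classifier above is confident far more often than it is right, so its score is well above 0.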

arXiv information

Authors	Jaeyoung Kim, Dongbin Na, Sungchul Choi, Sungbin Lim
Published	2023-02-13 21:11:52+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
