Classical Out-of-Distribution Detection Methods Benchmark in Text Classification Tasks

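Summary

State-of-the-art models perform well in controlled environments but often struggle when presented with out-of-distribution (OOD) examples, which makes OOD detection a critical component of NLP systems.
This paper focuses on highlighting the limitations of existing approaches to OOD detection in NLP.
Specifically, we evaluated eight OOD detection methods that can be easily integrated into existing NLP systems and that require no additional OOD data or model modifications.
One of our contributions is a well-structured research environment that allows full reproducibility of the results.
In addition, our analysis shows that existing OOD detection methods for NLP tasks are not yet sensitive enough to capture all samples characterized by various types of distributional shift.
Particularly challenging test scenarios arise with background shift and with randomly shuffled word order within in-domain texts.
This highlights the need for future work on more effective OOD detection approaches for NLP problems, and our work provides a well-defined foundation for further research in this area.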

Abstract (original)

State-of-the-art models can perform well in controlled environments, but they often struggle when presented with out-of-distribution (OOD) examples, making OOD detection a critical component of NLP systems. In this paper, we focus on highlighting the limitations of existing approaches to OOD detection in NLP. Specifically, we evaluated eight OOD detection methods that are easily integrable into existing NLP systems and require no additional OOD data or model modifications. One of our contributions is providing a well-structured research environment that allows for full reproducibility of the results. Additionally, our analysis shows that existing OOD detection methods for NLP tasks are not yet sufficiently sensitive to capture all samples characterized by various types of distributional shifts. Particularly challenging testing scenarios arise in cases of background shift and randomly shuffled word order within in-domain texts. This highlights the need for future work to develop more effective OOD detection approaches for NLP problems, and our work provides a well-defined foundation for further research in this area.
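
The shuffled-word-order scenario mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' exact procedure, so everything here is an assumption: the function name `shuffle_word_order` is hypothetical, tokens are split on whitespace, and a fixed seed is used only to make the example repeatable.

```python
import random
from typing import Optional


def shuffle_word_order(text: str, seed: Optional[int] = None) -> str:
    """Return `text` with its whitespace-separated tokens in random order.

    Assumption: whitespace tokenization; the paper may tokenize differently.
    """
    words = text.split()
    rng = random.Random(seed)  # local generator, leaves global random state alone
    rng.shuffle(words)         # in-place shuffle of the token list
    return " ".join(words)


# Hypothetical in-domain sentence, used only for illustration.
in_domain = "the service at this restaurant was surprisingly good"
shuffled = shuffle_word_order(in_domain, seed=0)
print(shuffled)
```

The shuffled text keeps exactly the same vocabulary as the in-domain sentence, which suggests why a detector that relies mostly on word-level signals may find this kind of shift hard to flag.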

arXiv information

Authors	Mateusz Baran, Joanna Baran, Mateusz Wójcik, Maciej Zięba, Adam Gonczarek
Published	2023-07-13 18:06:12+00:00

Source and services used

arxiv.jp, Google
