Training Ensembles with Inliers and Outliers for Semi-supervised Active Learning

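Abstract

Deep active learning in the presence of outlier examples is a realistic yet challenging scenario.
Acquiring unlabeled data for annotation requires a delicate balance between avoiding outliers, to conserve the annotation budget, and prioritizing useful inlier examples for effective training.
This work presents an approach that leverages three highly synergistic components, identified as its key ingredients: joint classifier training with inliers and outliers, semi-supervised learning through pseudo-labeling, and model ensembling.
The work shows that ensembling significantly improves both the accuracy of pseudo-labeling and the quality of data acquisition.
Enabling semi-supervision through a joint training process in which outliers are properly handled yields a substantial boost in classifier accuracy, because all available unlabeled examples can be used.
Notably, integrating joint training makes explicit outlier detection, a conventional component of acquisition in prior work, unnecessary.
The three key components are compatible with numerous existing approaches.
Empirical evaluation shows that using them in combination improves performance.
Despite its simplicity, the proposed approach outperforms all other methods.
Code: https://github.com/vladan-stojnic/active-outliers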

Abstract (original)

Deep active learning in the presence of outlier examples poses a realistic yet challenging scenario. Acquiring unlabeled data for annotation requires a delicate balance between avoiding outliers to conserve the annotation budget and prioritizing useful inlier examples for effective training. In this work, we present an approach that leverages three highly synergistic components, which are identified as key ingredients: joint classifier training with inliers and outliers, semi-supervised learning through pseudo-labeling, and model ensembling. Our work demonstrates that ensembling significantly enhances the accuracy of pseudo-labeling and improves the quality of data acquisition. By enabling semi-supervision through the joint training process, where outliers are properly handled, we observe a substantial boost in classifier accuracy through the use of all available unlabeled examples. Notably, we reveal that the integration of joint training renders explicit outlier detection unnecessary; a conventional component for acquisition in prior work. The three key components align seamlessly with numerous existing approaches. Through empirical evaluations, we showcase that their combined use leads to a performance increase. Remarkably, despite its simplicity, our proposed approach outperforms all other methods in terms of performance. Code: https://github.com/vladan-stojnic/active-outliers
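To make the interplay of the three components more concrete, the following is a minimal sketch of ensemble-based pseudo-labeling, not the authors' implementation. It assumes each ensemble member is a classifier with K inlier classes plus one extra outlier class (one possible reading of "joint classifier training with inliers and outliers"), that member predictions are combined by averaging softmax probabilities, and that a pseudo-label is accepted only above a confidence threshold. The function name `ensemble_pseudo_labels`, the threshold value, and the averaging rule are all illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_pseudo_labels(member_logits, threshold=0.9):
    """Pseudo-label unlabeled examples with an ensemble (hypothetical helper).

    member_logits: array of shape (M, N, K + 1) holding the logits of
        M ensemble members for N unlabeled examples; the last class
        index is assumed to be the outlier class.
    threshold: assumed confidence cutoff for accepting a pseudo-label.

    Returns (labels, confident, probs), where probs is the averaged
    class distribution of shape (N, K + 1).
    """
    # Average the per-member class probabilities.
    probs = softmax(member_logits).mean(axis=0)
    labels = probs.argmax(axis=1)
    # Keep only predictions the ensemble agrees on with high confidence.
    confident = probs.max(axis=1) >= threshold
    return labels, confident, probs


if __name__ == "__main__":
    # Two members, three unlabeled examples, K = 2 inlier classes + outlier.
    logits = np.array([
        [[10.0, 0.0, 0.0], [0.0, 0.0, 10.0], [10.0, 0.0, 0.0]],
        [[10.0, 0.0, 0.0], [0.0, 0.0, 10.0], [0.0, 10.0, 0.0]],
    ])
    labels, confident, _ = ensemble_pseudo_labels(logits)
    outlier_class = logits.shape[-1] - 1
    for i, (label, ok) in enumerate(zip(labels, confident)):
        kind = "outlier" if label == outlier_class else f"inlier class {label}"
        print(i, kind if ok else "no pseudo-label (members disagree)")
```

In this sketch, an example on which the members disagree receives no pseudo-label, which illustrates why ensembling could make pseudo-labels more reliable. Because outliers are just another predicted class here, no separate outlier detector appears in the code.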

arXiv information

Authors	Vladan Stojnić, Zakaria Laskar, Giorgos Tolias
Published	2023-07-07 17:50:07+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
