IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification

Abstract

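Language models such as Bidirectional Encoder Representations from Transformers (BERT) are highly effective across a range of natural language processing (NLP) and text mining tasks, including text classification.
However, some tasks still challenge these models, among them text classification with limited labels.
This can lead to a cold-start problem.
Some approaches have tried to address it with single-stage clustering as an intermediate training step, combined with a pre-trained language model, to generate pseudo-labels that improve classification; these methods are often error-prone because of the limitations of the clustering algorithms.
To overcome this, we developed a new two-stage intermediate clustering with subsequent fine-tuning that models the pseudo-labels reliably and reduces prediction errors.
The main novelty of our model, IDoFew, is that combining two-stage clustering with two different clustering algorithms exploits the strengths of complementary algorithms, reducing errors when generating reliable pseudo-labels for fine-tuning.
Our approach shows significant improvements over strong comparison models.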

Abstract (original)

Language models such as Bidirectional Encoder Representations from Transformers (BERT) have been very effective in various Natural Language Processing (NLP) and text mining tasks including text classification. However, some tasks still pose challenges for these models, including text classification with limited labels. This can result in a cold-start problem. Although some approaches have attempted to address this problem through single-stage clustering as an intermediate training step coupled with a pre-trained language model, which generates pseudo-labels to improve classification, these methods are often error-prone due to the limitations of the clustering algorithms. To overcome this, we have developed a novel two-stage intermediate clustering with subsequent fine-tuning that models the pseudo-labels reliably, resulting in reduced prediction errors. The key novelty in our model, IDoFew, is that the two-stage clustering coupled with two different clustering algorithms helps exploit the advantages of the complementary algorithms that reduce the errors in generating reliable pseudo-labels for fine-tuning. Our approach has shown significant improvements compared to strong comparative models.
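
The two-stage idea in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not name the two clustering algorithms, so k-means and single-linkage agglomerative clustering are stand-ins, the 2-D points stand in for language-model sentence embeddings, and the hypothetical `adapt` helper only mimics the effect of training the model on the stage-one pseudo-labels. The final fine-tuning on the few real labels is omitted.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def single_linkage(points, k):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a is unaffected
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def adapt(points, labels):
    """Hypothetical stand-in for intermediate training on pseudo-labels:
    pull each point halfway toward the centroid of its pseudo-class."""
    centroids = {}
    for l in set(labels):
        members = [p for p, m in zip(points, labels) if m == l]
        centroids[l] = tuple(sum(x) / len(members) for x in zip(*members))
    return [tuple((a + c) / 2 for a, c in zip(p, centroids[l]))
            for p, l in zip(points, labels)]

def two_stage_pseudo_labels(embeddings, k):
    """Stage 1 clusters the raw embeddings; stage 2 re-clusters the adapted
    embeddings with a different algorithm."""
    stage1 = kmeans(embeddings, k)
    adapted = adapt(embeddings, stage1)
    stage2 = single_linkage(adapted, k)
    return stage1, stage2

# Toy "embeddings": two well-separated groups of three points each.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
stage1, stage2 = two_stage_pseudo_labels(toy, k=2)
print(stage1, stage2)
```

In a real pipeline the stage-two pseudo-labels would feed another round of intermediate training before the model is fine-tuned on the small labelled set; how the paper arranges those steps in detail is not stated in the abstract.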

arXiv information

Authors	Abdullah Alsuhaibani, Hamad Zogan, Imran Razzak, Shoaib Jameel, Guandong Xu
Published	2024-01-08 17:07:37+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
