Don’t Just Pay Attention, PLANT It: Transfer L2R Models to Fine-tune Attention in Extreme Multi-Label Text Classification

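Summary

State-of-the-art Extreme Multi-Label Text Classification (XMTC) models rely heavily on multi-label attention layers to focus on key tokens in the input text, but obtaining optimal attention weights is difficult and resource-intensive.
To address this, the authors introduce PLANT (Pretrained and Leveraged AtteNTion), a new transfer learning strategy for fine-tuning XMTC decoders.
PLANT outperforms existing state-of-the-art methods across all metrics on the mimicfull, mimicfifty, mimicfour, eurlex, and wikiten datasets.
It does especially well in few-shot scenarios, beating earlier models designed specifically for few-shot settings by more than 50 percentage points in F1 score on mimicrare and by more than 36 percentage points on mimicfew, which shows its strength in handling rare codes.
PLANT is also markedly data-efficient in few-shot scenarios, reaching precision comparable to traditional models with significantly less data.
These results come from four key technical innovations: using a pretrained Learning-to-Rank (L2R) model as the planted attention layer, integrating mutual-information gain to enhance attention, introducing an inattention mechanism, and implementing a stateful decoder to maintain context.
Comprehensive ablation studies confirm the importance of each of these contributions to the performance gains.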

Summary (original)

State-of-the-art Extreme Multi-Label Text Classification (XMTC) models rely heavily on multi-label attention layers to focus on key tokens in input text, but obtaining optimal attention weights is challenging and resource-intensive. To address this, we introduce PLANT — Pretrained and Leveraged AtteNTion — a novel transfer learning strategy for fine-tuning XMTC decoders. PLANT surpasses existing state-of-the-art methods across all metrics on mimicfull, mimicfifty, mimicfour, eurlex, and wikiten datasets. It particularly excels in few-shot scenarios, outperforming previous models specifically designed for few-shot scenarios by over 50 percentage points in F1 scores on mimicrare and by over 36 percentage points on mimicfew, demonstrating its superior capability in handling rare codes. PLANT also shows remarkable data efficiency in few-shot scenarios, achieving precision comparable to traditional models with significantly less data. These results are achieved through key technical innovations: leveraging a pretrained Learning-to-Rank model as the planted attention layer, integrating mutual-information gain to enhance attention, introducing an inattention mechanism, and implementing a stateful-decoder to maintain context. Comprehensive ablation studies validate the importance of these contributions in realizing the performance gains.
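
The abstract does not spell out how the attention layer is computed, so the following is only a rough illustration of what a generic multi-label (label-wise) attention layer does: each label has its own query vector, scores every token against it, and pools the token vectors with the resulting softmax weights. This is a minimal sketch under stated assumptions, not PLANT's L2R-initialized layer; the function names, the dot-product scoring, and the toy values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_vecs, label_queries):
    """Generic label-wise attention (an illustration, not PLANT's layer).

    token_vecs:    list of T token vectors, each of dimension d
    label_queries: list of L label query vectors, each of dimension d
    Returns (weights, contexts), where weights[l][t] is how strongly
    label l attends to token t and contexts[l] is the attention-weighted
    sum of the token vectors for label l.
    """
    dim = len(token_vecs[0])
    weights, contexts = [], []
    for q in label_queries:
        # Dot-product relevance of every token to this label (assumed scoring).
        scores = [sum(qi * hi for qi, hi in zip(q, h)) for h in token_vecs]
        alpha = softmax(scores)
        # Label-specific document representation.
        ctx = [sum(a * h[k] for a, h in zip(alpha, token_vecs)) for k in range(dim)]
        weights.append(alpha)
        contexts.append(ctx)
    return weights, contexts


# Toy input: 3 tokens, 2 labels, hidden size 2 (values are made up).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[2.0, 0.0], [0.0, 2.0]]
weights, contexts = label_attention(tokens, queries)
```

In this toy setup each label puts more weight on the tokens aligned with its query vector. In a transfer-learning setting such as the one the abstract describes, the per-label weights would be initialized from a pretrained model rather than learned from scratch; how PLANT does this is not shown here.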

arXiv information

Authors	Debjyoti Saharoy, Javed A. Aslam, Virgil Pavlu
Published	2024-10-30 14:41:23+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
