Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning

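Summary

In NLP, zero-shot classification (ZSC) has become essential for letting models classify text into categories not seen during training.
Pretrained language models (PLMs) show promise for ZSC, but they often depend on large training datasets or external knowledge, which limits their applicability in multilingual and low-resource scenarios.
Recent approaches that use natural language prompts reduce this dependence on large training datasets, but they struggle to make effective use of labeled data available from related classification tasks, especially when those datasets come from different languages or distributions.
In addition, existing prompt-based methods usually rely on prompts written by hand in one specific language, which limits their adaptability and effectiveness in cross-lingual settings.
To address these challenges, we introduce RoSPrompt, a lightweight, data-efficient approach for training soft prompts that strengthen cross-lingual ZSC while keeping generalization robust under data distribution shifts.
RoSPrompt is designed for small multilingual PLMs and lets them leverage high-resource languages to improve performance in low-resource settings, without extensive fine-tuning or high computational cost.
We evaluate the approach on multiple multilingual PLMs across datasets covering 106 languages, showing strong cross-lingual transfer performance and robust generalization to unseen classes.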

Abstract (original)

In NLP, Zero-Shot Classification (ZSC) has become essential for enabling models to classify text into categories unseen during training, particularly in low-resource languages and domains where labeled data is scarce. While pretrained language models (PLMs) have shown promise in ZSC, they often rely on large training datasets or external knowledge, limiting their applicability in multilingual and low-resource scenarios. Recent approaches leveraging natural language prompts reduce the dependence on large training datasets but struggle to effectively incorporate available labeled data from related classification tasks, especially when these datasets originate from different languages or distributions. Moreover, existing prompt-based methods typically rely on manually crafted prompts in a specific language, limiting their adaptability and effectiveness in cross-lingual settings. To address these challenges, we introduce RoSPrompt, a lightweight and data-efficient approach for training soft prompts that enhance cross-lingual ZSC while ensuring robust generalization across data distribution shifts. RoSPrompt is designed for small multilingual PLMs, enabling them to leverage high-resource languages to improve performance in low-resource settings without requiring extensive fine-tuning or high computational costs. We evaluate our approach on multiple multilingual PLMs across datasets covering 106 languages, demonstrating strong cross-lingual transfer performance and robust generalization capabilities over unseen classes.

arXiv information

Authors	Fred Philippy, Siwen Guo, Cedric Lothritz, Jacques Klein, Tegawendé F. Bissyandé
Published	2025-03-28 09:23:44+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

