SLIMER-IT: Zero-Shot NER on Italian Language

Summary

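Traditional approaches to Named Entity Recognition (NER) frame the task as a BIO sequence labeling problem.
These systems often excel at the downstream task at hand, but they require large amounts of annotated data and struggle to generalize to out-of-distribution input domains and unseen entity types.
In contrast, Large Language Models (LLMs) have demonstrated strong zero-shot capabilities.
Several works address zero-shot NER in English, but little has been done for other languages.
This paper defines an evaluation framework for zero-shot NER and applies it to Italian.
It also introduces SLIMER-IT, the Italian version of SLIMER, an instruction-tuning approach for zero-shot NER that leverages prompts enriched with definitions and guidelines.
Comparisons with other state-of-the-art models demonstrate the superiority of SLIMER-IT on never-before-seen entity tags.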
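
For readers unfamiliar with the BIO framing mentioned in the summary, the following is a minimal sketch of how entity spans map to per-token BIO tags. It is not taken from the paper: the function name `spans_to_bio`, the example sentence, and the `PER`/`LOC` labels are all illustrative assumptions.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans to BIO tags.

    Each span is (start, end_exclusive, label) over token indices.
    Spans are assumed to be non-overlapping and within bounds.
    """
    tags = ["O"] * len(tokens)          # every token starts as Outside
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the entity: Begin
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens: Inside
    return tags


# Hypothetical Italian sentence; the entity labels are illustrative only.
tokens = ["Dante", "Alighieri", "nacque", "a", "Firenze", "."]
spans = [(0, 2, "PER"), (4, 5, "LOC")]
bio_tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

A sequence labeler trained on such tags can only predict the label set it saw during training, which is the limitation that zero-shot approaches aim to remove.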

Summary (original)

Traditional approaches to Named Entity Recognition (NER) frame the task into a BIO sequence labeling problem. Although these systems often excel in the downstream task at hand, they require extensive annotated data and struggle to generalize to out-of-distribution input domains and unseen entity types. On the contrary, Large Language Models (LLMs) have demonstrated strong zero-shot capabilities. While several works address Zero-Shot NER in English, little has been done in other languages. In this paper, we define an evaluation framework for Zero-Shot NER, applying it to the Italian language. Furthermore, we introduce SLIMER-IT, the Italian version of SLIMER, an instruction-tuning approach for zero-shot NER leveraging prompts enriched with definition and guidelines. Comparisons with other state-of-the-art models demonstrate the superiority of SLIMER-IT on never-seen-before entity tags.

arXiv information

Authors	Andrew Zamai, Leonardo Rigutini, Marco Maggini, Andrea Zugarini
Published	2024-11-14 13:59:15+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
