LLMs as Data Annotators: How Close Are We to Human Performance

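Summary

In NLP, fine-tuning LLMs is effective for a wide range of applications but requires high-quality annotated data.
However, annotating data manually is labor-intensive, time-consuming, and costly.
LLMs are therefore increasingly used to automate annotation, often through in-context learning (ICL), in which a few task-related examples are included in the prompt to improve performance.
However, selecting those context examples by hand can be inefficient and can lead to suboptimal model performance.
This paper presents comprehensive experiments comparing several LLMs, paired with different embedding models, across multiple datasets for the Named Entity Recognition (NER) task.
The evaluation covers models with approximately 7B and 70B parameters, including both proprietary and non-proprietary models.
Building on the success of Retrieval-Augmented Generation (RAG), it also considers a method that addresses the limitations of ICL by automatically retrieving context examples to improve performance.
The results highlight the importance of choosing an appropriate LLM and embedding model, of understanding the trade-off between LLM size and desired performance, and of directing research effort toward more challenging datasets.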

Abstract (original)

In NLP, fine-tuning LLMs is effective for various applications but requires high-quality annotated data. However, manual annotation of data is labor-intensive, time-consuming, and costly. Therefore, LLMs are increasingly used to automate the process, often employing in-context learning (ICL) in which some examples related to the task are given in the prompt for better performance. However, manually selecting context examples can lead to inefficiencies and suboptimal model performance. This paper presents comprehensive experiments comparing several LLMs, considering different embedding models, across various datasets for the Named Entity Recognition (NER) task. The evaluation encompasses models with approximately $7$B and $70$B parameters, including both proprietary and non-proprietary models. Furthermore, leveraging the success of Retrieval-Augmented Generation (RAG), it also considers a method that addresses the limitations of ICL by automatically retrieving contextual examples, thereby enhancing performance. The results highlight the importance of selecting the appropriate LLM and embedding model, understanding the trade-offs between LLM sizes and desired performance, and the necessity to direct research efforts towards more challenging datasets.
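The retrieval step described in the abstract, picking context examples automatically instead of by hand, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the annotated pool, the label set (PER, ORG, LOC), the prompt wording, and the helper names `embed`, `retrieve`, and `build_prompt` are all assumptions made here, and a bag-of-words counter stands in for the real embedding models the paper compares.

```python
import math
import re
from collections import Counter

# Hypothetical annotated pool (sentence, gold entities); invented for illustration.
POOL = [
    ("Barack Obama visited Berlin in 2013.", [("Barack Obama", "PER"), ("Berlin", "LOC")]),
    ("Apple opened a new office in Austin.", [("Apple", "ORG"), ("Austin", "LOC")]),
    ("The match ended in a draw after extra time.", []),
    ("Angela Merkel met officials from Siemens.", [("Angela Merkel", "PER"), ("Siemens", "ORG")]),
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=2):
    """Return the k pool examples whose sentences are most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, examples):
    """Assemble a few-shot NER prompt from the retrieved examples (assumed format)."""
    lines = ["Label the named entities (PER, ORG, LOC) in the sentence."]
    for sentence, entities in examples:
        labels = "; ".join(f"{text} [{label}]" for text, label in entities) or "none"
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Entities: {labels}")
    lines.append(f"Sentence: {query}")
    lines.append("Entities:")
    return "\n".join(lines)


query = "Angela Merkel visited Austin."
examples = retrieve(query, POOL, k=2)
prompt = build_prompt(query, examples)
print(prompt)
```

In a real setup, `embed` would call whichever embedding model is under evaluation and the resulting prompt would be sent to the LLM. Swapping only those two pieces is what makes it possible to compare LLM and embedding model choices independently.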

arXiv information

Authors	Muhammad Uzair Ul Haq, Davide Rigoni, Alessandro Sperduti
Published	2025-04-21 11:11:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
