Aggregation Artifacts in Subjective Tasks Collapse Large Language Models’ Posteriors

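Summary

In-context learning (ICL) has become the primary method for performing natural language tasks with large language models (LLMs).
The knowledge acquired during pre-training is crucial for this few-shot capability, because it provides the model with task priors.
However, recent studies have shown that ICL relies predominantly on retrieving task priors rather than "learning" to perform the task.
This limitation is particularly evident in complex subjective domains such as emotion and morality, where priors strongly influence posterior predictions.
This work examines whether this is a result of the aggregation used in the corresponding datasets: trying to combine low-agreement, disparate annotations may produce annotation artifacts that introduce detrimental noise into the prompt.
In addition, the posterior bias towards certain annotators is evaluated by grounding the study in appropriate, quantitative measures of LLM priors.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and we advocate focusing on modeling individuals instead.
However, aggregation alone does not explain the entire gap between ICL and the state of the art, which means that other factors in such tasks also contribute to the observed phenomena.
Finally, by rigorously studying annotator-level labels, we find that minority annotators can both align better with LLMs and have their perspectives further amplified.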

Abstract (original)

In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs). The knowledge acquired during pre-training is crucial for this few-shot capability, providing the model with task priors. However, recent studies have shown that ICL predominantly relies on retrieving task priors rather than ‘learning’ to perform tasks. This limitation is particularly evident in complex subjective domains such as emotion and morality, where priors significantly influence posterior predictions. In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt. Moreover, we evaluate the posterior bias towards certain annotators by grounding our study in appropriate, quantitative measures of LLM priors. Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead. However, aggregation does not explain the entire gap between ICL and the state of the art, meaning other factors in such tasks also account for the observed phenomena. Finally, by rigorously studying annotator-level labels, we find that it is possible for minority annotators to both better align with LLMs and have their perspectives further amplified.
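
To make the idea of an aggregation artifact concrete, here is a minimal sketch, not the authors' procedure. It assumes a simple per-label majority vote over multi-label emotion annotations; the function `majority_vote`, the `threshold` parameter, and the example labels are all hypothetical. It shows how aggregating low-agreement annotations can yield a label set that no individual annotator actually chose.

```python
from collections import Counter


def majority_vote(annotations, threshold=0.5):
    """Keep each label chosen by more than `threshold` of the annotators.

    Hypothetical aggregation rule, used here for illustration only.
    """
    counts = Counter(label for labels in annotations for label in set(labels))
    n = len(annotations)
    return {label for label, count in counts.items() if count / n > threshold}


# Hypothetical annotations of a single text by three annotators.
annotations = [
    {"anger", "disgust"},
    {"anger", "sadness"},
    {"disgust", "sadness"},
]

aggregated = majority_vote(annotations)

# Check whether the aggregated label set equals any one annotator's labels.
matches_someone = any(aggregated == labels for labels in annotations)

print("aggregated:", sorted(aggregated))
print("matches an individual annotator:", matches_someone)
```

In this toy case every label clears the majority threshold, so the aggregate contains all three emotions even though each annotator selected only two. Modeling each annotator's labels separately avoids this kind of artifact.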

arXiv information

Authors	Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan
Published	2024-10-17 17:16:00+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
