Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference

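Summary

We test whether replacing crowdsourced workers with LLMs to write Natural Language Inference (NLI) hypotheses produces annotation artifacts in the same way.
We recreate a portion of the Stanford NLI corpus using GPT-4, Llama-2, and Mistral 7b, then train hypothesis-only classifiers to determine whether the LLM-elicited hypotheses contain annotation artifacts.
On the LLM-elicited NLI datasets, BERT-based hypothesis-only classifiers reach 86-96% accuracy, which indicates that these datasets contain hypothesis-only artifacts.
LLM-generated hypotheses also contain frequent "give-aways".
For example, the phrase "swimming in a pool" appears in more than 10,000 contradictions generated by GPT-4.
Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data.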

Abstract (original)

We test whether replacing crowdsource workers with LLMs to write Natural Language Inference (NLI) hypotheses similarly results in annotation artifacts. We recreate a portion of the Stanford NLI corpus using GPT-4, Llama-2 and Mistral 7b, and train hypothesis-only classifiers to determine whether LLM-elicited hypotheses contain annotation artifacts. On our LLM-elicited NLI datasets, BERT-based hypothesis-only classifiers achieve between 86-96% accuracy, indicating these datasets contain hypothesis-only artifacts. We also find frequent ‘give-aways’ in LLM-generated hypotheses, e.g. the phrase ‘swimming in a pool’ appears in more than 10,000 contradictions generated by GPT-4. Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data.
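
The idea of a hypothesis-only classifier can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' setup: it swaps the BERT-based classifier for a small bag-of-words Naive Bayes model so that it runs without external libraries, and the `TRAIN` examples, function names, and the `min_count` threshold are all invented for illustration. The model never sees a premise, so any accuracy it achieves must come from cues in the hypothesis text alone.

```python
import math
from collections import Counter, defaultdict

# Toy, invented examples for illustration only; not taken from the paper's data.
# Each item is (hypothesis, label); the premise is deliberately left out.
TRAIN = [
    ("a man is sleeping", "contradiction"),
    ("nobody is outside", "contradiction"),
    ("the woman is sleeping at home", "contradiction"),
    ("a person is outdoors", "entailment"),
    ("there are people outside", "entailment"),
    ("a person is near an animal", "entailment"),
    ("the man is waiting for his friend", "neutral"),
    ("a tall woman is going to work", "neutral"),
    ("the dog is waiting for its owner", "neutral"),
]


def train_hypothesis_only(examples):
    """Collect per-label word counts (multinomial Naive Bayes statistics)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for hypothesis, label in examples:
        label_counts[label] += 1
        for word in hypothesis.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def predict(model, hypothesis):
    """Return the most probable label given only the hypothesis text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in hypothesis.split():
            if word in vocab:  # words never seen in training are skipped
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def giveaway_words(model, min_count=2):
    """Map each word seen at least min_count times under exactly one label to that label."""
    word_counts, _, vocab = model
    result = {}
    for word in vocab:
        per_label = {lab: c[word] for lab, c in word_counts.items() if c[word] > 0}
        if len(per_label) == 1:
            (label, count), = per_label.items()
            if count >= min_count:
                result[word] = label
    return result


model = train_hypothesis_only(TRAIN)
print(predict(model, "a child is sleeping"))
print(giveaway_words(model))
```

On real data one would hold out a test split and compare the hypothesis-only accuracy against the majority-class baseline; accuracy well above that baseline suggests the labels leak through the hypotheses.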

arXiv information

Authors	Grace Proebsting, Adam Poliak
Published	2024-10-11 17:09:22+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
