Large Language Models are Null-Shot Learners

Summary

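This document describes null-shot prompting.
Null-shot prompting exploits hallucination in LLMs by instructing the model to use information from an "Examples" section that does not exist in the provided context when performing a task.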
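Reducing hallucination is crucial and cannot be neglected when LLMs are used for everyday and critical purposes; however, given that current LLMs still hallucinate, we propose that hallucination can in fact be exploited to improve task performance compared with standard zero-shot prompting.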
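In experiments with six LLMs, performance improved on the majority of eight datasets, covering tasks such as reading comprehension, arithmetic reasoning, and closed-book question answering.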
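The inconsistency observed in the relative performance gains across LLMs potentially indicates that the degree of inherent hallucination differs from model to model.
These differences show that null-shot prompting can be used with existing benchmark datasets as a way to detect the degree of hallucination in LLMs.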
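We also conduct ablation studies, including experiments with a modified version of null-shot prompting that incorporates ideas from zero-shot chain-of-thought prompting, which shows different trends in the results.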
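
To make the idea concrete, the following is a minimal sketch of how a null-shot prompt might be assembled next to a plain zero-shot prompt. The instruction wording, the step-by-step suffix, and the `build_prompt` helper are illustrative assumptions, not text or code taken from the paper. The key point is that the prompt refers to an "Examples" section but never actually includes one.

```python
# Hypothetical wording: an illustrative null-shot instruction, not a quote from the paper.
NULL_SHOT_INSTRUCTION = (
    'Look at examples in the "Examples" section and utilize examples and '
    "information from that section to perform the following task."
)

# Hypothetical suffix for a variant that borrows from zero-shot chain-of-thought prompting.
STEP_BY_STEP_SUFFIX = " Work through the task step by step."


def build_prompt(task: str, mode: str = "zero-shot") -> str:
    """Return the prompt text for one task under the chosen prompting mode."""
    if mode == "zero-shot":
        # Baseline: the task is sent on its own.
        return task
    if mode == "null-shot":
        # The instruction points at an "Examples" section that is never supplied.
        return f"{NULL_SHOT_INSTRUCTION}\n\n{task}"
    if mode == "null-shot-cot":
        # Same instruction, plus a request for step-by-step reasoning.
        return f"{NULL_SHOT_INSTRUCTION}{STEP_BY_STEP_SUFFIX}\n\n{task}"
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(build_prompt("What is 17 + 25?", mode="null-shot"))
```

The resulting string would then be sent to whichever LLM is being evaluated; the model call itself is left out because it depends on the provider.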

Summary (original)

This paper presents null-shot prompting. Null-shot prompting exploits hallucination in large language models (LLMs) by instructing LLMs to utilize information from the ‘Examples’ section that never exists within the provided context to perform a task. While reducing hallucination is crucial and non-negligible for daily and critical uses of LLMs, we propose that in the current landscape in which these LLMs still hallucinate, it is possible, in fact, to exploit hallucination to increase performance in performing tasks compared to standard zero-shot prompting. Experiments with six LLMs show improvements in performance across the majority of eight datasets, including reading comprehension, arithmetic reasoning, and closed-book question answering. The observed inconsistency in increased relative performance across LLMs also potentially indicates a different degree of inherent hallucination in each model. These differences show that it is possible to utilize null-shot prompting as a way to detect degrees of hallucination in LLMs using existing benchmarking datasets. We also perform ablation studies, including experimenting with a modified version of null-shot prompting that incorporates ideas from zero-shot chain-of-thought prompting, which shows different trends of results.
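
The abstract compares models by their relative performance change between zero-shot and null-shot prompting. The sketch below shows one common way to compute such a change; the formula, the function name, and the scores are assumptions for illustration only, and none of the numbers are results from the paper.

```python
def relative_change(zero_shot_score: float, null_shot_score: float) -> float:
    """Percentage change of the null-shot score relative to the zero-shot baseline."""
    if zero_shot_score == 0:
        raise ValueError("zero-shot baseline must be non-zero")
    return 100.0 * (null_shot_score - zero_shot_score) / zero_shot_score


# Hypothetical accuracies (zero-shot, null-shot) for two made-up models.
hypothetical_scores = {
    "model_a": (50.0, 55.0),
    "model_b": (80.0, 78.0),
}

changes = {
    name: relative_change(zero, null)
    for name, (zero, null) in hypothetical_scores.items()
}

if __name__ == "__main__":
    for name, change in changes.items():
        print(f"{name}: {change:+.1f}%")
```

Under the abstract's interpretation, a model whose score rises more when told to use a nonexistent "Examples" section may be hallucinating more, which is why the size and sign of this change differ from model to model.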

arXiv information

Authors	Pittawat Taveekitworachai, Febri Abdullah, Ruck Thawonmas
Published	2024-01-16 10:53:11+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
