Intuitive or Dependent? Investigating LLMs’ Robustness to Conflicting Prompts

Summary

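This paper investigates how robustly LLMs prefer their internal memory or the given prompt, which in real-world applications may contain conflicting information because of noise or task settings.
To this end, we establish a quantitative benchmarking framework and conduct a role-playing intervention to control LLMs’ preferences.
Specifically, we define two types of robustness: factual robustness, which targets the ability to identify the correct fact from the prompt or from memory, and decision style, which categorizes an LLM’s behavior in making consistent choices (assuming there is no definitive “right” answer) as intuitive, dependent, or rational, based on cognitive theory.
Our findings, derived from extensive experiments on seven open-source and closed-source LLMs, reveal that these models are highly susceptible to misleading prompts, especially when the prompt instructs commonsense knowledge.
Detailed instructions can reduce the selection of misleading answers, but they also increase the incidence of invalid responses.
After uncovering these preferences, we intervene on LLMs of different sizes through role instructions in a specific style, showing that their upper bounds of robustness and adaptivity vary.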

Summary (original)

This paper explores the robustness of LLMs’ preference to their internal memory or the given prompt, which may contain contrasting information in real-world applications due to noise or task settings. To this end, we establish a quantitative benchmarking framework and conduct the role playing intervention to control LLMs’ preference. In specific, we define two types of robustness, factual robustness targeting the ability to identify the correct fact from prompts or memory, and decision style to categorize LLMs’ behavior in making consistent choices — assuming there is no definitive ‘right’ answer — intuitive, dependent, or rational based on cognitive theory. Our findings, derived from extensive experiments on seven open-source and closed-source LLMs, reveal that these models are highly susceptible to misleading prompts, especially for instructing commonsense knowledge. While detailed instructions can mitigate the selection of misleading answers, they also increase the incidence of invalid responses. After Unraveling the preference, we intervene different sized LLMs through specific style of role instruction, showing their varying upper bound of robustness and adaptivity.

arXiv information

Authors	Jiahao Ying, Yixin Cao, Kai Xiong, Yidong He, Long Cui, Yongbin Liu
Published	2023-09-29 17:26:03+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

