Larger Language Models Don’t Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks

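Summary

In-context learning (ICL) in large language models (LLMs) has emerged as the dominant technique for performing natural language tasks because it does not require updating model parameters with gradient-based methods.
ICL promises to "adapt" an LLM so that it performs the task at hand at a competitive or state-of-the-art level at a fraction of the computational cost.
ICL can be augmented by explicitly including in the prompt the reasoning process that leads to the final label, a technique called Chain-of-Thought (CoT) prompting.
However, recent work has found that ICL relies mostly on retrieving task priors and less on "learning" to perform the task, especially in complex subjective domains such as emotion and morality, where priors ossify posterior predictions.
This work examines whether "enabling" reasoning produces the same behavior in LLMs, that is, whether the CoT format retrieves reasoning priors that remain relatively unchanged despite the evidence in the prompt.
Surprisingly, CoT does suffer from the same posterior collapse as ICL for larger language models.
Code is available at https://github.com/gchochla/cot-priors.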

Abstract (original)

In-Context Learning (ICL) in Large Language Models (LLM) has emerged as the dominant technique for performing natural language tasks, as it does not require updating the model parameters with gradient-based methods. ICL promises to ‘adapt’ the LLM to perform the present task at a competitive or state-of-the-art level at a fraction of the computational cost. ICL can be augmented by incorporating the reasoning process to arrive at the final label explicitly in the prompt, a technique called Chain-of-Thought (CoT) prompting. However, recent work has found that ICL relies mostly on the retrieval of task priors and less so on ‘learning’ to perform tasks, especially for complex subjective domains like emotion and morality, where priors ossify posterior predictions. In this work, we examine whether ‘enabling’ reasoning also creates the same behavior in LLMs, wherein the format of CoT retrieves reasoning priors that remain relatively unchanged despite the evidence in the prompt. We find that, surprisingly, CoT indeed suffers from the same posterior collapse as ICL for larger language models. Code is available at https://github.com/gchochla/cot-priors.
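
To make the difference between plain ICL and CoT prompting concrete, the following is a minimal sketch of how the two few-shot prompts might be assembled. It is not taken from the authors' repository: the `Demo` class, the `build_prompt` helper, the `Input:` / `Reasoning:` / `Label:` field names, and the example texts and labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    text: str       # input to classify
    reasoning: str  # hypothetical rationale, shown only in CoT prompts
    label: str      # label shown to the model


def build_prompt(demos, query, use_cot):
    """Assemble a few-shot prompt.

    With use_cot=True, each demonstration shows its reasoning before the label.
    """
    blocks = []
    for d in demos:
        lines = [f"Input: {d.text}"]
        if use_cot:
            lines.append(f"Reasoning: {d.reasoning}")
        lines.append(f"Label: {d.label}")
        blocks.append("\n".join(lines))
    # The query ends on the field the model is expected to generate next.
    cue = "Reasoning:" if use_cot else "Label:"
    blocks.append(f"Input: {query}\n{cue}")
    return "\n\n".join(blocks)


# Illustrative demonstrations (made up, not from any dataset).
demos = [
    Demo("I can't believe they cancelled the show.",
         "The speaker is upset about losing something they liked.",
         "sadness"),
    Demo("We finally got the keys to our new home!",
         "The speaker is celebrating a milestone.",
         "joy"),
]
query = "Why would anyone say that to me?"

icl_prompt = build_prompt(demos, query, use_cot=False)
cot_prompt = build_prompt(demos, query, use_cot=True)
```

The only difference between the two prompts is the extra reasoning line in each demonstration and the field on which the query ends, which is what lets the two settings be compared on the same demonstrations.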

arXiv information

Authors	Georgios Chochlakis, Niyantha Maruthu Pandiyan, Kristina Lerman, Shrikanth Narayanan
Published	2024-09-17 17:42:26+00:00

Sources and services used

arxiv.jp, Google
