Toward Adaptive Reasoning in Large Language Models with Thought Rollback

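Abstract

Large language models (LLMs) are routinely used to solve a wide range of tasks through step-by-step reasoning.
However, the structure of the intermediate reasoning steps, or thoughts, is rigid and unidirectional: a chain, a tree, or a directed acyclic graph.
As a result, this inflexible, forward-only reasoning may fail on challenging tasks, particularly when the LLM frequently returns false responses, i.e., "hallucinations".
This paper proposes a new reasoning framework called Thought Rollback (TR), which lets LLMs build the thought structure adaptively while maintaining effective reasoning toward a solution even in the presence of "hallucinations".
The core mechanism of TR is rolling back thoughts: the LLM performs error analysis on its thoughts and can therefore roll back to any previously mistaken thought to revise it.
This trial and error is then included in the prompt that guides the LLM, so each rollback leads to one more reliable reasoning path.
Consequently, starting from a simple prompt with no human annotations, an LLM with TR adaptively and gradually explores thoughts toward a correct solution.
Comprehensive experiments on mathematical problems and multi-task reasoning demonstrate state-of-the-art performance for TR in terms of problem-solving rate and interaction cost.
For instance, the solving rate of GPT-4 with TR outperforms the current best by $9\%$ on the MATH dataset.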
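
The rollback mechanism described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (solve_with_rollback, toy_generate, toy_analyze, toy_is_solution), their signatures, and the toy arithmetic problem are all assumptions made for illustration, and a deterministic stub stands in for the LLM. The sketch also keeps only a single reasoning path and truncates it on rollback, which simplifies the thought structure the abstract describes.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces; the source does not specify any of these signatures.
Generate = Callable[[str, List[str], List[str]], str]
Analyze = Callable[[str, List[str]], Optional[Tuple[int, str]]]


def solve_with_rollback(
    problem: str,
    generate: Generate,
    analyze: Analyze,
    is_solution: Callable[[str], bool],
    max_steps: int = 20,
) -> Tuple[Optional[List[str]], List[str]]:
    """Grow a list of thoughts, rolling back when error analysis flags one."""
    thoughts: List[str] = []
    lessons: List[str] = []  # trial-and-error notes fed back to the generator
    for _ in range(max_steps):
        thoughts.append(generate(problem, thoughts, lessons))
        verdict = analyze(problem, thoughts)
        if verdict is not None:
            bad_index, lesson = verdict
            lessons.append(lesson)   # remember why the thought was wrong
            del thoughts[bad_index:]  # roll back to just before the bad thought
            continue
        if is_solution(thoughts[-1]):
            return thoughts, lessons
    return None, lessons  # step budget exhausted without a solution


# Toy stand-ins for the LLM calls (assumptions, for demonstration only).
PROBLEM = "2 + 3 * 4"


def toy_generate(problem: str, thoughts: List[str], lessons: List[str]) -> str:
    if not thoughts:
        # Without a lesson, the stub makes a precedence mistake first.
        return "3 * 4 = 12" if lessons else "2 + 3 = 5"
    if thoughts[-1] == "3 * 4 = 12":
        return "answer: 2 + 12 = 14"
    return "5 * 4 = 20"


def toy_analyze(problem: str, thoughts: List[str]) -> Optional[Tuple[int, str]]:
    for index, thought in enumerate(thoughts):
        if thought == "2 + 3 = 5":
            return index, "multiply before adding"
    return None


def toy_is_solution(thought: str) -> bool:
    return thought.startswith("answer:")
```

Calling solve_with_rollback(PROBLEM, toy_generate, toy_analyze, toy_is_solution) first produces the mistaken thought, rolls it back while recording the lesson, and then reaches the answer along a revised path. In a real setting the three toy functions would be replaced by prompts to an LLM, with the recorded lessons included in the prompt text.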

Abstract (original)

Large language models (LLMs) have been routinely used to solve various tasks using step-by-step reasoning. However, the structure of intermediate reasoning steps, or thoughts, is rigid and unidirectional, such as chains, trees, or acyclic-directed graphs. Consequently, the resulting inflexible and forward-only reasoning may not address challenging tasks and fail when the LLM frequently gives false responses, i.e., “hallucinations”. This paper proposes a new reasoning framework, called Thought Rollback (TR), allowing LLMs to adaptively build thought structure while maintaining effective reasoning toward problem-solving under “hallucinations”. The core mechanism of TR is rolling back thoughts, which allows LLMs to perform error analysis on thoughts, and thus roll back to any previously mistaken thought for revision. Subsequently, by including such trial-and-error in the prompt to guide the LLM, each rollback leads to one more reliable reasoning path. Therefore, starting with a simple prompt without human annotations, LLM with TR adaptively and gradually explores thoughts for a correct solution. Comprehensive experiments on mathematical problems and multi-task reasoning demonstrate the state-of-the-art performance of TR in terms of problem-solving rate and interaction cost. For instance, the solving rate of GPT-4 with TR outperforms the current best by $9\%$ on the MATH dataset.

arXiv information

Authors	Sijia Chen, Baochun Li
Published	2024-12-27 16:02:34+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
