Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions

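Summary

Language models usually use a left-to-right (L2R) autoregressive factorization.
However, the L2R factorization may not always be the best inductive bias.
We therefore investigate whether alternative factorizations of the text distribution could be beneficial for some tasks.
We study right-to-left (R2L) training as a compelling alternative, focusing on multiple-choice questions (MCQs) as a test bed for knowledge extraction and reasoning.
Through extensive experiments across model sizes (2B-8B parameters) and training datasets, we find that R2L models can significantly outperform L2R models on several MCQ benchmarks, including logical reasoning, commonsense understanding, and truthfulness assessment tasks.
Our analysis reveals that this performance difference may be fundamentally linked to multiple factors, including calibration, computability, and directional conditional entropy.
We ablate the impact of these factors through controlled simulation studies using arithmetic tasks, where the contributing factors can be better disentangled.
Our work demonstrates that exploring alternative factorizations of the text distribution can lead to improvements in LLM capabilities, and it provides theoretical insights into the optimal factorization for approximating the human language distribution.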

Abstract (original)

Language models usually use left-to-right (L2R) autoregressive factorization. However, L2R factorization may not always be the best inductive bias. Therefore, we investigate whether alternative factorizations of the text distribution could be beneficial in some tasks. We investigate right-to-left (R2L) training as a compelling alternative, focusing on multiple-choice questions (MCQs) as a test bed for knowledge extraction and reasoning. Through extensive experiments across various model sizes (2B-8B parameters) and training datasets, we find that R2L models can significantly outperform L2R models on several MCQ benchmarks, including logical reasoning, commonsense understanding, and truthfulness assessment tasks. Our analysis reveals that this performance difference may be fundamentally linked to multiple factors including calibration, computability and directional conditional entropy. We ablate the impact of these factors through controlled simulation studies using arithmetic tasks, where the impacting factors can be better disentangled. Our work demonstrates that exploring alternative factorizations of the text distribution can lead to improvements in LLM capabilities and provides theoretical insights into optimal factorization towards approximating human language distribution, and when each reasoning order might be more advantageous.
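
For reference, the two factorizations the abstract contrasts can be written out with the chain rule. This is a sketch in notation assumed here rather than taken from the source: $x_1, \dots, x_T$ denotes a token sequence of length $T$.

```latex
\begin{align*}
p(x_1, \dots, x_T)
  &= \prod_{t=1}^{T} p\left(x_t \mid x_1, \dots, x_{t-1}\right)
  && \text{(L2R: each token conditioned on its prefix)} \\
  &= \prod_{t=1}^{T} p\left(x_t \mid x_{t+1}, \dots, x_T\right)
  && \text{(R2L: each token conditioned on its suffix)}
\end{align*}
```

Both products equal the same joint distribution exactly. A difference can only arise once each conditional is approximated by a trained model, which is the setting in which the abstract compares the two directions.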

arXiv information

Authors	Yizhe Zhang,Richard Bai,Zijin Gu,Ruixiang Zhang,Jiatao Gu,Emmanuel Abbe,Samy Bengio,Navdeep Jaitly
Published	2025-02-25 18:30:25+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
