Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning

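Abstract

Large reasoning models (LRMs) have demonstrated impressive capabilities in complex problem-solving, yet their internal reasoning mechanisms remain poorly understood.
This paper investigates the reasoning trajectories of LRMs from an information-theoretic perspective.
By tracking how the mutual information (MI) between intermediate representations and the correct answer evolves during LRM reasoning, the authors observe an MI peaks phenomenon: at specific generation steps, the MI increases suddenly and significantly.
They analyze this phenomenon theoretically and show that as MI increases, the probability of the model's prediction error decreases.
Furthermore, these MI peaks often correspond to tokens that express reflection or transition, such as "Hmm", "Wait", and "Therefore", which the authors call thinking tokens.
They then demonstrate that these thinking tokens are crucial to an LRM's reasoning performance, whereas other tokens have minimal impact.
Building on these analyses, they propose two simple yet effective methods that improve LRM reasoning performance by carefully leveraging these thinking tokens.
Overall, the work provides new insights into the reasoning mechanisms of LRMs and offers practical ways to improve their reasoning capabilities.
The code is available at https://github.com/ChnQ/MI-Peaks.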
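The link between MI and prediction error mentioned in the abstract can be illustrated with the classical Fano inequality. This is a sketch only and is not necessarily the bound the paper proves. The symbols are assumptions introduced here: $Y$ is the correct answer, taken to be uniform over a finite set $\mathcal{Y}$; $h_t$ is the intermediate representation at generation step $t$; $P_e = \Pr(\hat{Y} \neq Y)$ is the error probability of any predictor $\hat{Y}$ computed from $h_t$; and $H_b$ is the binary entropy function.

```latex
% Fano's inequality for a predictor \hat{Y} that depends on Y only through h_t
\[
  H(Y \mid h_t) \;\le\; H_b(P_e) + P_e \log\bigl(|\mathcal{Y}| - 1\bigr)
\]
% Substitute H(Y \mid h_t) = H(Y) - I(Y; h_t) with H(Y) = \log|\mathcal{Y}|,
% then use H_b(P_e) \le \log 2 and \log(|\mathcal{Y}| - 1) \le \log|\mathcal{Y}|:
\[
  P_e \;\ge\; 1 - \frac{I(Y; h_t) + \log 2}{\log |\mathcal{Y}|}
\]
```

Under these assumptions, a small $I(Y; h_t)$ forces a large error probability, and the floor on $P_e$ drops as the MI grows. The inequality gives only a lower bound, so it shows that high MI is necessary for low error, not that it is sufficient.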

Abstract (original)

Large reasoning models (LRMs) have demonstrated impressive capabilities in complex problem-solving, yet their internal reasoning mechanisms remain poorly understood. In this paper, we investigate the reasoning trajectories of LRMs from an information-theoretic perspective. By tracking how mutual information (MI) between intermediate representations and the correct answer evolves during LRM reasoning, we observe an interesting MI peaks phenomenon: the MI at specific generative steps exhibits a sudden and significant increase during the LRM’s reasoning process. We theoretically analyze this phenomenon and show that as MI increases, the probability of the model’s prediction error decreases. Furthermore, these MI peaks often correspond to tokens expressing reflection or transition, such as “Hmm,” “Wait,” and “Therefore,” which we term thinking tokens. We then demonstrate that these thinking tokens are crucial for the LRM’s reasoning performance, while other tokens have minimal impact. Building on these analyses, we propose two simple yet effective methods to improve the LRM’s reasoning performance by delicately leveraging these thinking tokens. Overall, our work provides novel insights into the reasoning mechanisms of LRMs and offers practical ways to improve their reasoning capabilities. The code is available at https://github.com/ChnQ/MI-Peaks.
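To make the idea of tracking MI across generation steps and spotting peaks concrete, here is a minimal toy sketch. It is not the authors' implementation: the plug-in estimator below only works for discrete samples, whereas real intermediate representations are continuous and high-dimensional and need a different estimator. The function names and the `mean + k * std` peak rule are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def plugin_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def find_mi_peaks(mi_per_step, k=2.0):
    """Return step indices whose MI exceeds mean + k * std.

    The threshold rule is an assumption for illustration only.
    """
    n = len(mi_per_step)
    mean = sum(mi_per_step) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in mi_per_step) / n)
    return [i for i, v in enumerate(mi_per_step) if v > mean + k * std]
```

In use, one would compute one MI value per generation step over a batch of problems, pass the resulting list to `find_mi_peaks`, and then inspect which tokens were generated at the flagged steps.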

arXiv information

Authors	Chen Qian, Dongrui Liu, Haochen Wen, Zhen Bai, Yong Liu, Jing Shao
Published	2025-06-04 15:00:58+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
