Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning

Summary

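Long chain-of-thought (Long-CoT) reasoning improves the accuracy of LLMs, but its verbose, self-reflective style often hinders effective distillation into small language models (SLMs).
We revisit Long-CoT compression through the lens of capability alignment and ask: can pruning improve reasoning?
We propose Prune-on-Logic, a structure-aware framework that converts Long-CoT into logic graphs and selectively prunes low-utility reasoning steps under self-verification constraints.
A systematic analysis of three pruning strategies, targeting the entire chain, core reasoning, and verification, shows that pruning verification steps yields consistent accuracy gains while reducing inference cost, outperforming both token-level baselines and uncompressed fine-tuning.
In contrast, pruning reasoning steps or all-chain steps degrades performance, revealing that small models benefit not from shorter CoTs but from semantically leaner ones.
Our findings highlight pruning as a structural optimization strategy for aligning CoT reasoning with SLM capacity.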

Abstract (original)

Long chain-of-thought (Long-CoT) reasoning improves accuracy in LLMs, yet its verbose, self-reflective style often hinders effective distillation into small language models (SLMs). We revisit Long-CoT compression through the lens of capability alignment and ask: Can pruning improve reasoning? We propose Prune-on-Logic, a structure-aware framework that transforms Long-CoT into logic graphs and selectively prunes low-utility reasoning steps under self-verification constraints. Through systematic analysis across three pruning strategies — targeting entire chains, core reasoning, and verification — we find that pruning verification steps yields consistent accuracy gains while reducing inference cost, outperforming token-level baselines and uncompressed fine-tuning. In contrast, pruning reasoning or all-chain steps degrades performance, revealing that small models benefit not from shorter CoTs, but from semantically leaner ones. Our findings highlight pruning as a structural optimization strategy for aligning CoT reasoning with SLM capacity.
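
The abstract does not spell out how the logic graph is built or how step utility is scored, so the following is only a toy sketch of the general idea of pruning one kind of step from a graph of reasoning steps. Everything in it is an assumption made for illustration: the `Step` schema, the `"reasoning"` / `"verification"` labels, the hand-assigned `utility` scores, the threshold, and the `prune_steps` helper. It is not the Prune-on-Logic implementation, and it does not model the self-verification constraints mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of a toy logic graph (hypothetical schema, not from the paper)."""
    text: str
    kind: str        # assumed labels: "reasoning" or "verification"
    utility: float   # assumed score in [0, 1], assigned by hand here
    parents: list = field(default_factory=list)  # ids of steps this one depends on


def prune_steps(graph, target_kind, threshold):
    """Drop steps of `target_kind` whose utility is below `threshold`.

    Steps that depended on a dropped step are re-attached to its nearest
    surviving ancestors, so the remaining chain stays connected.
    """
    dropped = {sid for sid, step in graph.items()
               if step.kind == target_kind and step.utility < threshold}

    def surviving_parents(sid, seen):
        result = []
        for parent in graph[sid].parents:
            if parent in seen:
                continue
            seen.add(parent)
            if parent in dropped:
                # Skip over the dropped step and look further up the graph.
                result.extend(surviving_parents(parent, seen))
            else:
                result.append(parent)
        return result

    return {sid: Step(step.text, step.kind, step.utility,
                      surviving_parents(sid, set()))
            for sid, step in graph.items() if sid not in dropped}


# A four-step chain with one low-utility verification step (made-up example).
graph = {
    "s1": Step("Set up the equation 2x + 3 = 11.", "reasoning", 0.9),
    "s2": Step("Solve it: x = 4.", "reasoning", 0.9, ["s1"]),
    "s3": Step("Double-check: 2*4 + 3 = 11.", "verification", 0.2, ["s2"]),
    "s4": Step("The answer is 4.", "reasoning", 0.8, ["s3"]),
}

pruned = prune_steps(graph, target_kind="verification", threshold=0.5)
compressed_cot = "\n".join(step.text for step in pruned.values())
```

In this toy chain, only the verification step `s3` is removed and `s4` is re-linked to `s2`. Passing `target_kind="reasoning"` instead would mimic, very loosely, the core-reasoning pruning strategy that the abstract reports as harmful.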

arXiv information

Authors	Shangziqi Zhao, Jiahao Yuan, Guisong Yang, Usman Naseem
Published	2025-05-20 16:38:32+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
