SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning

Abstract

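Test-time scaling (TTS) refers to approaches that improve reasoning performance by allocating additional computation during inference, without changing the model's parameters.
Existing TTS methods operate in discrete token space by generating more intermediate steps, but recent work on Coconut and SoftCoT has shown that thinking in a continuous latent space can further improve reasoning performance.
Such latent thoughts encode informative thinking without the information loss associated with autoregressive token generation, which has increased interest in continuous-space reasoning.
Unlike discrete decoding, where repeated sampling can explore diverse reasoning paths, latent representations in continuous space are fixed for a given input. Because every decoded path originates from the same latent thought, diverse exploration is limited.
To overcome this limitation, we introduce SoftCoT++, which extends SoftCoT to the test-time scaling paradigm by enabling diverse exploration of thinking paths.
Specifically, we perturb latent thoughts via multiple specialized initial tokens and apply contrastive learning to promote diversity among the soft thought representations.
Experiments across five reasoning benchmarks and two distinct LLM architectures show that SoftCoT++ significantly improves on SoftCoT and also outperforms SoftCoT with self-consistency scaling.
It also shows strong compatibility with conventional scaling techniques such as self-consistency.
Source code is available at https://github.com/xuyige/SoftCoT.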
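
The two ideas named above, promoting diversity among several soft thoughts and combining the resulting answers by self-consistency, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the contrastive objective, so mean pairwise cosine similarity is used here as a stand-in diversity penalty, and the names `cosine`, `diversity_penalty`, and `majority_vote` are hypothetical. The toy vectors stand in for soft thought representations produced from different initial tokens.

```python
import math
from collections import Counter
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(thoughts: List[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity over all distinct pairs.

    ASSUMPTION: a stand-in for the paper's contrastive loss, which the
    abstract does not define. Lower values mean more diverse thoughts.
    """
    k = len(thoughts)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(cosine(thoughts[i], thoughts[j]) for i, j in pairs) / len(pairs)


def majority_vote(answers: List[str]) -> str:
    """Self-consistency: return the most frequent final answer."""
    return Counter(answers).most_common(1)[0][0]


# Toy soft thoughts (hypothetical values), one per specialized initial token.
identical = [[1.0, 0.0], [1.0, 0.0]]
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
print(diversity_penalty(identical), diversity_penalty(orthogonal))

# Final answers decoded from each thought path (hypothetical values).
print(majority_vote(["42", "41", "42"]))
```

In a training setup, a penalty of this kind would be minimized so that the thoughts derived from different initial tokens spread apart, while the vote aggregates the answers decoded from each thought at inference time.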

Abstract (original)

Test-Time Scaling (TTS) refers to approaches that improve reasoning performance by allocating extra computation during inference, without altering the model’s parameters. While existing TTS methods operate in a discrete token space by generating more intermediate steps, recent studies in Coconut and SoftCoT have demonstrated that thinking in the continuous latent space can further enhance the reasoning performance. Such latent thoughts encode informative thinking without the information loss associated with autoregressive token generation, sparking increased interest in continuous-space reasoning. Unlike discrete decoding, where repeated sampling enables exploring diverse reasoning paths, latent representations in continuous space are fixed for a given input, which limits diverse exploration, as all decoded paths originate from the same latent thought. To overcome this limitation, we introduce SoftCoT++ to extend SoftCoT to the Test-Time Scaling paradigm by enabling diverse exploration of thinking paths. Specifically, we perturb latent thoughts via multiple specialized initial tokens and apply contrastive learning to promote diversity among soft thought representations. Experiments across five reasoning benchmarks and two distinct LLM architectures demonstrate that SoftCoT++ significantly boosts SoftCoT and also outperforms SoftCoT with self-consistency scaling. Moreover, it shows strong compatibility with conventional scaling techniques such as self-consistency. Source code is available at https://github.com/xuyige/SoftCoT.

arXiv information

Authors	Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao
Published	2025-05-16 17:47:50+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
