Reducing Reasoning Costs — The Path of Optimization for Chain of Thought via Sparse Attention Mechanism

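Summary

To address the sharp rise in inference cost that chain-of-thought reasoning causes in large language models, this study proposes a sparse attention mechanism that attends to only a few relevant tokens.
The researcher built a new attention mechanism and used GiantRabbit, trained with custom GPTs, as the experimental tool.
The experiment tested and compared this model with o1 Preview on reasoning time, correctness score, and chain-of-thought length when solving linear algebra exam questions from MIT OpenCourseWare.
The results show that GiantRabbit's reasoning time and chain-of-thought length are significantly lower than those of o1 Preview.
This verifies the feasibility of a sparse attention mechanism for optimizing chain-of-thought reasoning.
The architectural details and the experimental process have been uploaded to GitHub: https://github.com/brucewang123456789/GeniusTrail.git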

Abstract (original)

In order to address the chain of thought in the large language model inference cost surge, this research proposes to use a sparse attention mechanism that only focuses on a few relevant tokens. The researcher constructed a new attention mechanism and used GiantRabbit trained with custom GPTs as an experimental tool. The experiment tested and compared the reasoning time, correctness score and chain of thought length of this model and o1 Preview in solving the linear algebra test questions of MIT OpenCourseWare. The results show that GiantRabbit’s reasoning time and chain of thought length are significantly lower than o1 Preview. It verifies the feasibility of sparse attention mechanism for optimizing chain of thought reasoning. Detailed architectural details and experimental process have been uploaded to Github, the link is:https://github.com/brucewang123456789/GeniusTrail.git.
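
The abstract does not spell out how the sparse attention is computed, so the following is only a minimal sketch of one common variant, top-k sparse attention, in which a query attends to just the k highest-scoring tokens. It is not the paper's mechanism: the function name `topk_sparse_attention`, its parameters, and the choice of top-k selection are all assumptions made for illustration.

```python
import math


def topk_sparse_attention(query, keys, values, k):
    """Attend from one query vector to only the k highest-scoring keys.

    Hypothetical illustration of top-k sparsity; not taken from the paper.
    """
    d = len(query)
    # Scaled dot-product score between the query and every key.
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d) for key in keys]
    # Indices of the k largest scores; every other token is ignored.
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept scores only (subtract the max for numerical stability).
    top = max(scores[i] for i in kept)
    exps = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(exps.values())
    weights = [exps[i] / total if i in exps else 0.0 for i in range(len(keys))]
    # Weighted sum of the value vectors of the kept tokens.
    dim = len(values[0])
    output = [sum(weights[i] * values[i][j] for i in kept) for j in range(dim)]
    return output, weights


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [2.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [0.0, 2.0]]
    output, weights = topk_sparse_attention([1.0, 0.0], keys, values, k=2)
    print(weights)  # only two of the four tokens receive non-zero weight
    print(output)
```

With `k` equal to the number of tokens, the function reduces to ordinary dense attention for a single query. A smaller `k` shrinks the softmax and the weighted sum, although this sketch still scores every key before discarding most of them.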

arXiv information

Author	Libo Wang
Published	2024-12-11 18:50:30+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
