Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures

Abstract

Large language models (LLMs) have demonstrated remarkable mathematical capabilities, largely driven by chain-of-thought (CoT) prompting, which decomposes complex reasoning into step-by-step solutions. This approach has enabled significant advancements, as evidenced by performance on benchmarks like GSM8K and MATH. However, the mechanisms underlying LLMs’ ability to perform arithmetic in a single step of CoT remain poorly understood. Existing studies debate whether LLMs encode numerical values or rely on symbolic reasoning, while others explore attention and multi-layered processing in arithmetic tasks. In this work, we propose that LLMs learn arithmetic by capturing algebraic structures, such as \emph{Commutativity} and \emph{Identity} properties. Since these structures are observable through input-output relationships, they can generalize to unseen data. We empirically demonstrate that LLMs can learn algebraic structures using a custom dataset of arithmetic problems. Our findings indicate that leveraging algebraic structures can enhance the LLMs’ arithmetic capabilities, offering insights into improving their arithmetic performance.
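
The claim that algebraic structures are observable through input-output relationships can be illustrated with a small sketch. The abstract does not describe the authors' dataset, so everything here is an assumption made for illustration: the toy operation (addition modulo 7), the `a+b=` prompt format, and the helper names `is_commutative`, `find_identity`, and `commutative_split`. The split trains on pairs with a <= b and holds out the mirrored pairs, so a learner that has captured commutativity could in principle answer every held-out prompt.

```python
from itertools import product


def is_commutative(op, elements):
    """Return True if op(a, b) == op(b, a) for every pair of elements."""
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))


def find_identity(op, elements):
    """Return e such that op(e, a) == op(a, e) == a for all a, or None."""
    for e in elements:
        if all(op(e, a) == a and op(a, e) == a for a in elements):
            return e
    return None


def commutative_split(op, elements):
    """Hypothetical split: train on pairs with a <= b, hold out pairs with a > b.

    Each example is a (prompt, answer) pair of strings.
    """
    train, test = [], []
    for a, b in product(elements, repeat=2):
        example = (f"{a}+{b}=", str(op(a, b)))
        if a <= b:
            train.append(example)
        else:
            test.append(example)
    return train, test


MODULUS = 7  # assumed toy setting, not taken from the paper
elements = list(range(MODULUS))


def add_mod(a, b):
    """Addition modulo MODULUS, used as the toy operation."""
    return (a + b) % MODULUS


train, test = commutative_split(add_mod, elements)
```

Both properties are checked purely from the operation's input-output table, without reference to what the symbols mean as numbers. Every held-out prompt `b+a=` has a mirrored prompt `a+b=` in the training set with the same answer, which is the sense in which such a structure could carry over to unseen inputs.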

arXiv information

Authors	Fu-Chieh Chang, Pei-Yuan Wu
Published	2024-11-25 10:23:11+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
