Understanding In-context Learning of Addition via Activation Subspaces

Summary

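To perform in-context learning, a language model must extract signals from individual few-shot examples, aggregate them into a learned prediction rule, and apply that rule to new examples.
How is this implemented in the forward pass of modern transformer models?
To study this, we consider a structured family of few-shot learning tasks in which the true prediction rule is to add an integer $k$ to the input.
We find that Llama-3-8B attains high accuracy on this task across a range of $k$, and we localize its few-shot ability to just three attention heads using a novel optimization approach.
We further show that the extracted signals lie in a six-dimensional subspace, in which four dimensions track the unit digit and the other two track overall magnitude.
Finally, we examine how these heads extract information from individual few-shot examples and identify a self-correction mechanism in which mistakes from earlier examples are suppressed by later examples.
Our results demonstrate how tracking low-dimensional subspaces across a forward pass can provide insight into fine-grained computational structures.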

Summary (original)

To perform in-context learning, language models must extract signals from individual few-shot examples, aggregate these into a learned prediction rule, and then apply this rule to new examples. How is this implemented in the forward pass of modern transformer models? To study this, we consider a structured family of few-shot learning tasks for which the true prediction rule is to add an integer $k$ to the input. We find that Llama-3-8B attains high accuracy on this task for a range of $k$, and localize its few-shot ability to just three attention heads via a novel optimization approach. We further show the extracted signals lie in a six-dimensional subspace, where four of the dimensions track the unit digit and the other two dimensions track overall magnitude. We finally examine how these heads extract information from individual few-shot examples, identifying a self-correction mechanism in which mistakes from earlier examples are suppressed by later examples. Our results demonstrate how tracking low-dimensional subspaces across a forward pass can provide insight into fine-grained computational structures.
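
The task family described in the abstract (few-shot examples whose hidden rule is "add an integer $k$ to the input") can be illustrated with a minimal sketch. The `x -> y` prompt layout, the input range, and the helper name `make_addition_prompt` are illustrative assumptions and are not taken from the paper.

```python
import random


def make_addition_prompt(k, n_shots=5, lo=0, hi=100, seed=0):
    """Build a few-shot prompt whose hidden rule is y = x + k.

    The "x -> y" layout and the [lo, hi) input range are assumptions
    made for illustration, not the format used in the paper.
    """
    rng = random.Random(seed)
    # Draw distinct inputs: n_shots demonstrations plus one query.
    xs = rng.sample(range(lo, hi), n_shots + 1)
    shots = [f"{x} -> {x + k}" for x in xs[:-1]]
    query = xs[-1]
    # The final line is left open for the model to complete.
    prompt = "\n".join(shots + [f"{query} ->"])
    return prompt, query + k


prompt, answer = make_addition_prompt(k=7)
print(prompt)
print("expected completion:", answer)
```

A model that has inferred the rule from the demonstrations should complete the last line with the query plus $k$; varying `k` gives the family of tasks the abstract refers to.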

arXiv information

Authors	Xinyan Hu, Kayo Yin, Michael I. Jordan, Jacob Steinhardt, Lijie Chen
Published	2025-05-08 11:32:46+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
