Transformer Alignment in Large Language Models

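Summary

Large language models (LLMs) have made major advances in natural language processing, and a precise understanding of the internal mechanisms behind their success is essential.
We view an LLM as transforming embeddings through a discrete, coupled, nonlinear dynamical system in high dimensions.
This view motivates tracing the trajectory of each individual token as it passes through the transformer blocks, and linearizing the system along these trajectories through their Jacobian matrices.
An analysis of 38 openly available LLMs reveals alignment between the top left and top right singular vectors of the Residual Jacobians, as well as the emergence of linearity and layer-wise exponential growth.
In particular, increased alignment $\textit{positively correlates}$ with model performance.
Metrics evaluated after training improve significantly over the same measurements made with randomly initialized weights, which highlights the pronounced effect of training in transformers.
These findings reveal a remarkable level of regularity that had previously been overlooked, reinforce the dynamical interpretation, and pave the way for deeper understanding and optimization of LLM architectures.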

Summary (original)

Large Language Models (LLMs) have made significant strides in natural language processing, and a precise understanding of the internal mechanisms driving their success is essential. We regard LLMs as transforming embeddings via a discrete, coupled, nonlinear, dynamical system in high dimensions. This perspective motivates tracing the trajectories of individual tokens as they pass through transformer blocks, and linearizing the system along these trajectories through their Jacobian matrices. In our analysis of 38 openly available LLMs, we uncover the alignment of top left and right singular vectors of Residual Jacobians, as well as the emergence of linearity and layer-wise exponential growth. Notably, we discover that increased alignment $\textit{positively correlates}$ with model performance. Metrics evaluated post-training show significant improvement in comparison to measurements made with randomly initialized weights, highlighting the significant effects of training in transformers. These findings reveal a remarkable level of regularity that has previously been overlooked, reinforcing the dynamical interpretation and paving the way for deeper understanding and optimization of LLM architectures.
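
The linearization described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the toy residual branch `residual_branch`, the finite-difference helper `jacobian_fd`, the random weights, and in particular the score `|u1 . v1|` used for alignment, which is only one plausible reading of "alignment of top left and right singular vectors" and is not taken from the paper. It assumes NumPy is available.

```python
import numpy as np

def residual_branch(x, W1, W2):
    """Toy residual branch F(x); the full block is x_next = x + F(x)."""
    return W2 @ np.tanh(W1 @ x)

def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x (column i = df/dx_i)."""
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2.0 * eps))
    return np.stack(cols, axis=1)

def top_singular_alignment(J):
    """|u1 . v1| for the top left/right singular vectors of J (illustrative metric)."""
    U, _, Vt = np.linalg.svd(J)
    return abs(float(U[:, 0] @ Vt[0]))

def alignment_along_trajectory(x0, layers):
    """Follow one token embedding through the blocks, scoring each layer's Jacobian."""
    x, scores = x0, []
    for W1, W2 in layers:
        f = lambda v, W1=W1, W2=W2: residual_branch(v, W1, W2)
        scores.append(top_singular_alignment(jacobian_fd(f, x)))
        x = x + f(x)  # residual update moves the token to the next layer
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, depth = 8, 4  # hypothetical embedding size and number of blocks
    layers = [(rng.standard_normal((d, d)) / np.sqrt(d),
               rng.standard_normal((d, d)) / np.sqrt(d)) for _ in range(depth)]
    print(alignment_along_trajectory(rng.standard_normal(d), layers))
```

With randomly initialized weights, as here, the scores play the role of the untrained baseline the abstract compares against. For a real model one would replace `residual_branch` with an actual transformer block and would likely use automatic differentiation rather than finite differences.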

arXiv information

Authors	Murdock Aubry, Haoming Meng, Anton Sugolov, Vardan Papyan
Published	2024-07-10 16:30:27+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
