Associative Transformer Is A Sparse Representation Learner

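Summary

There is growing interest in moving beyond the monolithic pairwise attention of conventional Transformer models toward sparse interactions that align more closely with biological principles.
Approaches such as the Set Transformer and the Perceiver use cross-attention combined with a latent space, which forms an attention bottleneck of limited capacity.
Building on recent neuroscience studies of Global Workspace Theory and associative memory, we propose the Associative Transformer (AiT).
AiT induces low-rank explicit memory that serves both as priors guiding bottleneck attention in the shared workspace and as attractors within the associative memory of a Hopfield network.
Through joint end-to-end training, these priors naturally develop module specialization, each contributing a distinct inductive bias to form attention bottlenecks.
A bottleneck can foster competition among inputs for writing information into the memory.
We show that AiT is a sparse representation learner that learns distinct priors through bottlenecks whose complexity is invariant to the number and dimensionality of inputs.
AiT outperforms methods such as the Set Transformer, Vision Transformer, and Coordination on a range of vision tasks.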

Summary (original)

Emerging from the monolithic pairwise attention mechanism in conventional Transformer models, there is a growing interest in leveraging sparse interactions that align more closely with biological principles. Approaches including the Set Transformer and the Perceiver employ cross-attention consolidated with a latent space that forms an attention bottleneck with limited capacity. Building upon recent neuroscience studies of Global Workspace Theory and associative memory, we propose the Associative Transformer (AiT). AiT induces low-rank explicit memory that serves as both priors to guide bottleneck attention in the shared workspace and attractors within associative memory of a Hopfield network. Through joint end-to-end training, these priors naturally develop module specialization, each contributing a distinct inductive bias to form attention bottlenecks. A bottleneck can foster competition among inputs for writing information into the memory. We show that AiT is a sparse representation learner, learning distinct priors through the bottlenecks that are complexity-invariant to input quantities and dimensions. AiT demonstrates its superiority over methods such as the Set Transformer, Vision Transformer, and Coordination in various vision tasks.
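The two roles the abstract assigns to the explicit memory (priors that gate a bottleneck write, and attractors used for retrieval) can be illustrated with a small sketch. This is not the authors' implementation: the function names `bottleneck_write` and `hopfield_retrieve`, the top-k selection rule, the dot-product scaling, and the softmax form of the retrieval step (borrowed from continuous modern Hopfield networks) are all assumptions made for illustration, and nothing is trained here.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bottleneck_write(priors, inputs, top_k):
    """Hypothetical bottleneck write: each prior (memory slot) scores every
    input, but only the top_k highest-scoring inputs may write to that slot.
    The top-k rule is an assumed way to model competition among inputs."""
    scale = 1.0 / math.sqrt(len(priors[0]))
    memory = []
    for p in priors:
        scores = [dot(p, x) * scale for x in inputs]
        winners = sorted(range(len(inputs)), key=lambda i: scores[i],
                         reverse=True)[:top_k]
        weights = softmax([scores[i] for i in winners])
        slot = [sum(w * inputs[i][d] for w, i in zip(weights, winners))
                for d in range(len(p))]
        memory.append(slot)
    return memory


def hopfield_retrieve(query, memory, beta=1.0):
    """One softmax retrieval step over stored patterns (assumed form):
    the query is pulled toward the memory slots it is most similar to."""
    weights = softmax([beta * dot(query, m) for m in memory])
    return [sum(w * m[d] for w, m in zip(weights, memory))
            for d in range(len(query))]


random.seed(0)
dim, n_inputs, n_slots = 8, 50, 4  # illustrative sizes, not from the paper
inputs = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_inputs)]
priors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_slots)]

memory = bottleneck_write(priors, inputs, top_k=5)
outputs = [hopfield_retrieve(x, memory, beta=2.0) for x in inputs]

# The memory stays n_slots x dim no matter how many inputs compete to write.
print(len(memory), len(memory[0]))
print(len(outputs), len(outputs[0]))
```

In this sketch the memory has a fixed number of slots, so its size does not grow with the number of inputs, which is one way to read the abstract's claim that the bottleneck is invariant to input quantity.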

arXiv information

Authors	Yuwei Sun, Hideya Ochiai, Zhirong Wu, Stephen Lin, Ryota Kanai
Published	2023-09-22 13:37:10+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
