MLPs Learn In-Context on Regression and Classification Tasks

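Summary

In-context learning (ICL), the remarkable ability to solve a task from input exemplars alone, is often assumed to be a unique hallmark of Transformer models.
By examining commonly used synthetic ICL tasks, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context.
Moreover, MLPs and the closely related MLP-Mixer models learn in-context comparably to Transformers under the same compute budget in this setting.
We further show that MLPs outperform Transformers on a series of classical tasks from psychology that are designed to test relational reasoning and are closely related to in-context classification.
These results underscore the need to study in-context learning beyond attention-based architectures, and they also challenge prior arguments against the ability of MLPs to solve relational tasks.
Altogether, our results highlight the unexpected competence of MLPs in a synthetic setting and support the growing interest in all-MLP alternatives to Transformer architectures.
It remains unclear how MLPs perform against Transformers at scale on real-world tasks, and where a performance gap may originate.
We encourage further exploration of these architectures in more complex settings to better understand the potential comparative advantage of attention-based schemes.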

Summary (original)

In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, is often assumed to be a unique hallmark of Transformer models. By examining commonly employed synthetic ICL tasks, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, MLPs, and the closely related MLP-Mixer models, learn in-context comparably with Transformers under the same compute budget in this setting. We further show that MLPs outperform Transformers on a series of classical tasks from psychology designed to test relational reasoning, which are closely related to in-context classification. These results underscore a need for studying in-context learning beyond attention-based architectures, while also challenging prior arguments against MLPs’ ability to solve relational tasks. Altogether, our results highlight the unexpected competence of MLPs in a synthetic setting, and support the growing interest in all-MLP alternatives to Transformer architectures. It remains unclear how MLPs perform against Transformers at scale on real-world tasks, and where a performance gap may originate. We encourage further exploration of these architectures in more complex settings to better understand the potential comparative advantage of attention-based schemes.

arXiv information

Authors	William L. Tong, Cengiz Pehlevan
Published	2025-02-25 16:27:38+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
