TreeCoders: Trees of Transformers

Abstract

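This paper introduces TreeCoders, a new family of transformer trees.
Instead of the conventional linear stack of transformer blocks, we arrange the blocks as a complete k-ary tree.
Each transformer block acts as a node, and generic classifiers (selectors) learn to pick the best child, routing a sequence of tokens to a specific leaf.
Because the selectors sit outside the transformer blocks, a variety of architectures can be used without further modification.
The proposed architecture also supports sparse node activation, thanks to the logarithmic complexity of a tree search.
We validate the idea by testing a series of decoder-only tree transformers, which achieve competitive results across a diverse range of language datasets.
Across a wide range of tree architectures, the proposed tree transformer model outperforms a size-equivalent linear transformer model 76% of the time.
The proposed model also lends itself naturally to distributed implementation.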
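
The routing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the heap-style node numbering, the toy `make_block` and `make_selector` stand-ins (plain Python functions instead of transformer blocks and learned classifiers), and the choice to feed each selector the output of its node are all assumptions made for illustration. The sketch only shows that a sequence visits one node per level, so the number of active nodes equals the height of the tree rather than its total size.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Block = Callable[[Vector], Vector]   # hypothetical stand-in for a transformer block
Selector = Callable[[Vector], int]   # hypothetical stand-in for a learned classifier


def num_nodes(k: int, height: int) -> int:
    """Number of nodes in a complete k-ary tree with `height` levels."""
    return sum(k ** level for level in range(height))


def route(x: Vector, blocks: List[Block], selectors: List[Selector],
          k: int, height: int) -> Tuple[Vector, List[int]]:
    """Send x down a single root-to-leaf path.

    Nodes are numbered in heap order (an assumption of this sketch):
    the children of node i are k*i + 1, ..., k*i + k.
    Returns the leaf output and the list of visited node ids.
    """
    node = 0
    path = []
    for level in range(height):
        x = blocks[node](x)           # only the node on the path is evaluated
        path.append(node)
        if level < height - 1:        # leaves have no selector
            child = selectors[node](x)
            if not 0 <= child < k:
                raise ValueError("selector must return an index in [0, k)")
            node = k * node + 1 + child
    return x, path


def make_block(node_id: int) -> Block:
    # Toy block: shifts every value by the node id so the path is visible.
    return lambda x: [v + node_id for v in x]


def make_selector(k: int) -> Selector:
    # Toy selector: a fixed rule in place of a trained classifier.
    return lambda x: int(sum(x)) % k


k, height = 2, 3
blocks = [make_block(i) for i in range(num_nodes(k, height))]
# In heap order, the internal nodes are exactly the first num_nodes(k, height - 1) ids.
selectors = [make_selector(k) for _ in range(num_nodes(k, height - 1))]

output, path = route([1.0, 2.0], blocks, selectors, k, height)
print(path, f"{len(path)} of {num_nodes(k, height)} nodes active")
```

With `k = 2` and `height = 3` the tree holds 7 nodes, but a single input touches only 3 of them. In a real model the unvisited subtrees would simply not be computed for that sequence.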

Abstract (original)

In this paper, we introduce TreeCoders, a novel family of transformer trees. We moved away from traditional linear transformers to complete k-ary trees. Transformer blocks serve as nodes, and generic classifiers learn to select the best child and route the sequence of tokens to a specific leaf. The selectors, moved outside the transformer blocks, allow for the use of a variety of architecture without further modifications. Furthermore, our proposed architecture supports sparse node activation due to the logarithmic complexity of a tree search. We validate our idea by testing a series of decoder-only tree transformers, achieving competitive results across a diverse range of language datasets. Our study demonstrates that the proposed tree transformer model outperforms a size-equivalent linear transformer model 76% of the time over a wide range of tree architectures. Furthermore, our proposed model naturally lends itself to distributed implementation.

arXiv information

Authors	Pierre Colonna D’Istria, Abdulrahman Altahhan
Published	2024-11-11 18:40:04+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
