Forming Trees with Treeformers

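Summary

Human language is known to have a nested, hierarchical structure that lets complex sentences be built from smaller pieces.
However, many state-of-the-art neural network models, such as the Transformer, have no explicit hierarchical structure in their architecture; that is, they lack an inductive bias toward hierarchical structure.
In addition, Transformers are known to perform poorly on compositional generalization tasks that require such structure.
This paper introduces Treeformer, a general-purpose encoder module inspired by the CKY algorithm. It learns a composition operator and a pooling function to build hierarchical encodings of phrases and sentences.
Our extensive experiments demonstrate the benefit of incorporating hierarchical structure into the Transformer, showing significant improvements not only in compositional generalization but also in downstream tasks such as machine translation, abstractive summarization, and a range of natural language understanding tasks.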
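
As a rough illustration of what "inspired by the CKY algorithm" can mean for an encoder, the sketch below fills a CKY-style chart bottom-up: each span's vector is obtained by composing the vectors of its two sub-spans at every split point and then pooling over the splits. This is not the authors' implementation. The names `compose`, `pool`, and `build_chart`, the elementwise tanh composition, the softmax pooling, and the fixed weights are all assumptions made for the example; in the paper the composition operator and pooling function are learned.

```python
import math
import random


def compose(left, right, w_left, w_right):
    """Hypothetical composition operator: elementwise tanh of a weighted sum
    of the two child vectors (a stand-in for a learned operator)."""
    return [
        math.tanh(wl * l + wr * r)
        for l, r, wl, wr in zip(left, right, w_left, w_right)
    ]


def pool(candidates, score_vec):
    """Hypothetical pooling function: softmax-weighted average of the
    candidate vectors, scored by a dot product with score_vec."""
    scores = [sum(c * s for c, s in zip(cand, score_vec)) for cand in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(candidates[0])
    return [
        sum(w * cand[d] for w, cand in zip(weights, candidates))
        for d in range(dim)
    ]


def build_chart(tokens, w_left, w_right, score_vec):
    """Fill a CKY-style chart: chart[(i, j)] encodes the span tokens[i:j]."""
    n = len(tokens)
    # Length-1 spans are the token vectors themselves.
    chart = {(i, i + 1): list(tokens[i]) for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # One candidate per split point k, as in CKY parsing.
            candidates = [
                compose(chart[(i, k)], chart[(k, j)], w_left, w_right)
                for k in range(i + 1, j)
            ]
            chart[(i, j)] = pool(candidates, score_vec)
    return chart


# Toy usage with random "token embeddings" and fixed (not learned) weights.
random.seed(0)
dim = 4
tokens = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
w_left = [0.5] * dim
w_right = [0.5] * dim
score_vec = [1.0] * dim

chart = build_chart(tokens, w_left, w_right, score_vec)
sentence_vec = chart[(0, len(tokens))]  # encoding of the whole sequence
```

The chart holds one vector per contiguous span, so a 5-token input yields 15 entries, and the entry for the full span serves as a sentence-level encoding. Filling the chart this way takes a cubic number of composition calls in the sequence length, the usual cost of CKY-style dynamic programming.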

Abstract (original)

Human language is known to exhibit a nested, hierarchical structure, allowing us to form complex sentences out of smaller pieces. However, many state-of-the-art neural networks models such as Transformers have no explicit hierarchical structure in its architecture — that is, they don’t have an inductive bias toward hierarchical structure. Additionally, Transformers are known to perform poorly on compositional generalization tasks which require such structures. In this paper, we introduce Treeformer, a general-purpose encoder module inspired by the CKY algorithm which learns a composition operator and pooling function to construct hierarchical encodings for phrases and sentences. Our extensive experiments demonstrate the benefits of incorporating hierarchical structure into the Transformer and show significant improvements in compositional generalization as well as in downstream tasks such as machine translation, abstractive summarization, and various natural language understanding tasks.

arXiv information

Authors	Nilay Patel, Jeffrey Flanigan
Published	2023-07-10 21:02:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
