Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures

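Summary

This paper proposes an extension to the Longformer Encoder-Decoder, a popular sparse transformer architecture.
A common challenge for sparse transformers is that they can struggle to encode long-range context, such as connections between topics discussed at the beginning and at the end of a document.
A method for selectively increasing global attention is proposed and demonstrated on abstractive summarization tasks across several benchmark datasets.
Prefixing the transcript with additional keywords and encoding global attention on those keywords improves results in zero-shot, few-shot, and fine-tuned settings on some benchmark datasets.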

Abstract (original)

In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. One common challenge with sparse transformers is that they can struggle with encoding of long range context, such as connections between topics discussed at a beginning and end of a document. A method to selectively increase global attention is proposed and demonstrated for abstractive summarization tasks on several benchmark data sets. By prefixing the transcript with additional keywords and encoding global attention on these keywords, improvement in zero-shot, few-shot, and fine-tuned cases is demonstrated for some benchmark data sets.
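The keyword-prefix idea described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the frequency-based `pick_keywords` helper, the stopword list, the `<s>` and `</s>` marker strings, whitespace tokenization, and the choice to give the entire prefix global attention are all hypothetical stand-ins, since the abstract does not specify the keyword detector or the exact mask layout.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "we", "on", "for", "it", "that"}


def pick_keywords(tokens, k=3):
    """Naive stand-in for keyword detection: the k most frequent non-stopword tokens."""
    counts = Counter(t.lower() for t in tokens if t.lower() not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


def prefix_with_global_mask(tokens, keywords, bos="<s>", sep="</s>"):
    """Prefix the input with keywords and mark the prefix for global attention.

    Returns (sequence, mask), where mask[i] == 1 means position i is given
    global attention and 0 means it keeps local (windowed) attention only.
    Marking the whole prefix, including the markers, is an assumption.
    """
    prefix = [bos] + list(keywords) + [sep]
    sequence = prefix + list(tokens)
    mask = [1] * len(prefix) + [0] * len(tokens)
    return sequence, mask


# Toy transcript, split on whitespace instead of using a real tokenizer.
transcript = "the budget meeting covered the budget for hiring and the hiring plan".split()
keywords = pick_keywords(transcript, k=2)
sequence, mask = prefix_with_global_mask(transcript, keywords)
```

In a real pipeline the mask would be built over subword token ids rather than whitespace tokens. The Hugging Face LED implementation, for example, accepts such a 0/1 mask through its `global_attention_mask` argument.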

arXiv information

Authors	Evan Lucas, Dylan Kangas, Timothy C Havens
Published	2024-10-11 16:41:11+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
