Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

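Summary

Humans process language incrementally, but the best language encoders currently used in NLP do not.
Both bidirectional LSTMs and Transformers assume that the sequence to be encoded is available in full, so that it can be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers).
We investigate how they behave under an incremental interface, where partial output must be produced from the partial input seen up to a given time step, as can happen in interactive systems.
We test five models on a range of NLU datasets and compare their performance using three incremental evaluation metrics.
The results support the possibility of using bidirectional encoders in incremental mode while retaining most of their non-incremental quality.
The "omni-directional" BERT model, which achieves better non-incremental performance, is affected more by incremental access.
This can be alleviated by adapting the training regime (truncated training) or the testing procedure, either by delaying the output until some right context is available or by incorporating hypothetical right contexts generated by a language model such as GPT-2.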
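
To make the incremental interface concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a sequence tagger that can only encode a complete input, and simulates incremental use by re-running it on every growing prefix, optionally withholding labels until some right context has been seen. The names `restart_incremental`, `count_revisions`, and `toy_tagger` are hypothetical. The toy tagger is a hand-written stand-in for a bidirectional encoder, and the revision count is a simple illustration of output instability, not one of the paper's three metrics.

```python
from typing import Callable, List, Optional, Sequence

# A tagger maps a full token sequence to one label per token (assumed interface).
Tagger = Callable[[Sequence[str]], List[str]]


def restart_incremental(tagger: Tagger, tokens: Sequence[str], delay: int = 0) -> List[List[str]]:
    """Re-run a non-incremental tagger on every prefix of `tokens`.

    At time step t the tagger sees tokens[:t]. With delay d, only the labels
    for the first t - d tokens are emitted, so each emitted label has had d
    tokens of right context. At the final step all labels are emitted.
    """
    outputs = []
    n = len(tokens)
    for t in range(1, n + 1):
        labels = tagger(tokens[:t])
        visible = t if t == n else max(t - delay, 0)
        outputs.append(labels[:visible])
    return outputs


def count_revisions(outputs: List[List[str]]) -> int:
    """Count labels that change between consecutive partial outputs."""
    revisions = 0
    for prev, curr in zip(outputs, outputs[1:]):
        revisions += sum(1 for a, b in zip(prev, curr) if a != b)
    return revisions


def toy_tagger(prefix: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for a bidirectional model: the label of "new"
    depends on the token to its right, the label of "york" on the token to its left."""
    labels = []
    for i, tok in enumerate(prefix):
        nxt: Optional[str] = prefix[i + 1] if i + 1 < len(prefix) else None
        prev: Optional[str] = prefix[i - 1] if i > 0 else None
        if (tok == "new" and nxt == "york") or (tok == "york" and prev == "new"):
            labels.append("LOC")
        else:
            labels.append("O")
    return labels


tokens = ["fly", "to", "new", "york"]
no_delay = restart_incremental(toy_tagger, tokens, delay=0)
with_delay = restart_incremental(toy_tagger, tokens, delay=1)
print(count_revisions(no_delay), count_revisions(with_delay))
```

In this toy setting, the label for "new" is emitted as "O" and then revised once "york" arrives. With a delay of one token, that label is withheld until its right neighbor has been seen, so no emitted label has to change.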

Summary (original)

While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence that is to be encoded is available in full, to be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers). We investigate how they behave under incremental interfaces, when partial output must be provided based on partial input seen up to a certain time step, which may happen in interactive systems. We test five models on various NLU datasets and compare their performance using three incremental evaluation metrics. The results support the possibility of using bidirectional encoders in incremental mode while retaining most of their non-incremental quality. The ‘omni-directional’ BERT model, which achieves better non-incremental performance, is impacted more by the incremental access. This can be alleviated by adapting the training regime (truncated training), or the testing procedure, by delaying the output until some right context is available or by incorporating hypothetical right contexts generated by a language model like GPT-2.

arXiv information

Authors	Brielen Madureira, David Schlangen
Published	2024-03-28 11:26:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
