NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders

Summary

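Neural document rerankers are highly accurate.
However, the best models require dedicated hardware for serving, which is costly and often infeasible.
To avoid this serving-time requirement, we present a method that captures up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that requires only 10^-6% of the Transformer's FLOPs per document and can be served on commodity CPUs.
When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever, which still requires an accelerator for query encoding.
We introduce NAIL (Non-Autoregressive Indexing with Language models), a model architecture compatible with recent encoder-decoder and decoder-only large language models such as T5, GPT-3, and PaLM.
This architecture can leverage existing pre-trained checkpoints and can be fine-tuned to efficiently construct document representations that do not require neural processing of queries.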
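
To make the idea of a lexicalized scoring function concrete, here is a minimal sketch of CPU-only query-time scoring. It assumes each document has already been mapped offline to a dictionary of per-token weights and that a query's score is the sum of the weights of its tokens; the abstract does not spell out the exact formulation, so the data layout, the whitespace tokenizer, and the names `tokenize`, `score`, `rerank`, and `index` are all hypothetical.

```python
# Illustrative only: the weights below are made up, not produced by any model.
# Assumption: a document representation is a mapping token -> weight,
# computed offline, so no neural network runs at query time.

def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

def score(query, doc_weights):
    """Sum the precomputed weights of the query's tokens for one document."""
    return sum(doc_weights.get(token, 0.0) for token in tokenize(query))

def rerank(query, candidate_ids, index):
    """Order candidates (e.g. from a first-stage retriever) by lexical score."""
    scored = [(doc_id, score(query, index[doc_id])) for doc_id in candidate_ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical precomputed index with two documents.
index = {
    "doc_a": {"neural": 2.0, "reranker": 1.0, "cpu": 0.5},
    "doc_b": {"bm25": 2.0, "retriever": 1.0, "cpu": 0.5},
}

ranking = rerank("bm25 retriever", ["doc_a", "doc_b"], index)
```

Under these assumptions, query-time work reduces to dictionary lookups and additions, which is why a scheme of this kind can run on commodity CPUs.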

Summary (original)

Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires 10^-6% of the Transformer's FLOPs per document and can be served using commodity CPUs. When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever that still requires an accelerator for query encoding. We introduce NAIL (Non-Autoregressive Indexing with Language models) as a model architecture that is compatible with recent encoder-decoder and decoder-only large language models, such as T5, GPT-3 and PaLM. This model architecture can leverage existing pre-trained checkpoints and can be fine-tuned for efficiently constructing document representations that do not require neural processing of queries.

arXiv information

Authors	Livio Baldini Soares, Daniel Gillick, Jeremy R. Cole, Tom Kwiatkowski
Published	2023-10-23 14:46:34+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
