Improving Neural Ranking Models with Traditional IR Methods

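Summary

Neural ranking methods based on large transformer models have recently attracted significant attention in the information retrieval community and have been adopted in major commercial solutions.
However, they are computationally expensive to build, and specialized corpora require large amounts of labeled data.
This paper explores a low-resource alternative, a bag-of-embedding model for document retrieval, and finds that it is competitive with large transformer models fine-tuned on information retrieval tasks.
The results show that simply combining TF-IDF, a traditional keyword-matching method, with a shallow embedding model offers a low-cost path to performance competitive with complex neural ranking models on 3 datasets.
Furthermore, adding TF-IDF measures improves the performance of large-scale fine-tuned models on these tasks.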
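
To make the idea of pairing TF-IDF with a shallow embedding model concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how the two signals are combined, so the linear interpolation weight `alpha`, the smoothed IDF formula, the toy two-dimensional vectors in `TOY_EMB`, and every function name below are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
TOY_EMB = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "engine": [0.8, 0.2],
    "banana": [0.0, 1.0],
    "fruit": [0.1, 0.9],
}
DOCS = ["automobile engine", "banana fruit", "car engine"]


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs_tokens):
    """Return sparse TF-IDF vectors (dicts) and the IDF table (smoothed IDF, an assumption)."""
    n = len(docs_tokens)
    df = Counter(t for toks in docs_tokens for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in docs_tokens]
    return vecs, idf


def cosine_sparse(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_embedding(tokens, emb, dim):
    """Bag-of-embeddings: average the vectors of all known tokens."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine_dense(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, docs, emb, dim, alpha=0.5):
    """Rank document indices by alpha * TF-IDF cosine + (1 - alpha) * embedding cosine."""
    docs_tokens = [tokenize(d) for d in docs]
    doc_vecs, idf = tfidf_vectors(docs_tokens)
    q_tokens = tokenize(query)
    # Query terms that never occur in the corpus carry no TF-IDF weight.
    q_vec = {t: tf * idf[t] for t, tf in Counter(q_tokens).items() if t in idf}
    q_emb = mean_embedding(q_tokens, emb, dim)
    scored = []
    for i, toks in enumerate(docs_tokens):
        lexical = cosine_sparse(q_vec, doc_vecs[i])
        semantic = cosine_dense(q_emb, mean_embedding(toks, emb, dim))
        scored.append((alpha * lexical + (1 - alpha) * semantic, i))
    # Sort by descending score; ties are broken by document index.
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

Calling `rank("automobile", DOCS, TOY_EMB, 2, alpha=1.0)` scores documents by keyword overlap alone, while a lower `alpha` lets the averaged embeddings pull in documents that share no query term but use related words such as "car".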

Summary (original)

Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions. Nevertheless, they are computationally expensive to create, and require a great deal of labeled data for specialized corpora. In this paper, we explore a low resource alternative which is a bag-of-embedding model for document retrieval and find that it is competitive with large transformer models fine tuned on information retrieval tasks. Our results show that a simple combination of TF-IDF, a traditional keyword matching method, with a shallow embedding model provides a low cost path to compete well with the performance of complex neural ranking models on 3 datasets. Furthermore, adding TF-IDF measures improves the performance of large-scale fine tuned models on these tasks.

arXiv information

Authors	Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener
Published	2023-08-29 05:18:47+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
