Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

Abstract

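Dual-encoder models are ubiquitous in modern classification and retrieval.
Training such dual encoders depends on accurately estimating gradients from the partition function of the softmax over a large output space.
This requires finding the negative targets that contribute most to it ("hard negatives").
Because the parameters of a dual-encoder model change during training, traditional static nearest-neighbor indexes can be sub-optimal.
These static indexes (1) require periodic, expensive rebuilding of the index, which in turn (2) requires expensive re-encoding of all targets with the updated model parameters.
This paper addresses both challenges.
First, it introduces an algorithm that uses a tree structure to approximate the softmax with provable bounds and that maintains the tree dynamically.
Second, it approximates the effect of a gradient update on the target encodings with an efficient Nystrom low-rank approximation.
In an empirical study on datasets with over twenty million targets, the approach cuts error by half in relation to oracle brute-force negative mining.
Furthermore, the method surpasses the prior state of the art while using 150x less accelerator memory.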
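
To make the role of hard negatives concrete, here is a minimal sketch and not the paper's algorithm. It assumes dot-product scoring between a query encoding and target encodings, and it selects the k highest-scoring negatives by sorting all scores, whereas the abstract describes a dynamically maintained tree structure for this step. The function names `full_softmax_loss` and `hard_negative_loss`, the parameter `k`, and the random data are illustrative assumptions.

```python
import numpy as np


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = values.max()
    return m + np.log(np.exp(values - m).sum())


def full_softmax_loss(query, targets, positive_idx):
    """Exact cross-entropy loss: the partition function sums over every target."""
    logits = targets @ query
    return _log_sum_exp(logits) - logits[positive_idx]


def hard_negative_loss(query, targets, positive_idx, k):
    """Approximate loss: the partition function keeps only the positive
    and the k highest-scoring negatives (the "hard negatives")."""
    logits = targets @ query
    # Brute-force selection (assumption for illustration only).
    negatives = np.delete(np.arange(len(targets)), positive_idx)
    hardest = negatives[np.argsort(logits[negatives])[-k:]]
    kept = np.concatenate(([positive_idx], hardest))
    return _log_sum_exp(logits[kept]) - logits[positive_idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    targets = rng.normal(size=(1000, 16))  # hypothetical target encodings
    query = rng.normal(size=16)            # hypothetical query encoding
    print("exact loss:      ", full_softmax_loss(query, targets, 3))
    print("50 hard negatives:", hard_negative_loss(query, targets, 3, 50))
```

Because the retained terms are a subset of the full sum, the truncated loss never exceeds the exact one, and keeping the highest-scoring negatives makes the gap as small as any subset of size k allows.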

Abstract (original)

Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for training such dual encoders is an accurate estimation of gradients from the partition function of the softmax over the large output space; this requires finding negative targets that contribute most significantly (‘hard negatives’). Since dual encoder model parameters change during training, the use of traditional static nearest neighbor indexes can be sub-optimal. These static indexes (1) periodically require expensive re-building of the index, which in turn requires (2) expensive re-encoding of all targets using updated model parameters. This paper addresses both of these challenges. First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree. Second, we approximate the effect of a gradient update on target encodings with an efficient Nystrom low-rank approximation. In our empirical study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining. Furthermore, our method surpasses prior state-of-the-art while using 150x less accelerator memory.
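
The Nystrom low-rank approximation named in the abstract can be illustrated in its generic textbook form. The sketch below is not the paper's procedure for updating target encodings: it only shows how a large similarity matrix can be approximated from a small set of landmark rows. The RBF kernel, the `gamma` value, the choice of landmarks, and the function names `rbf_kernel` and `nystrom` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def nystrom(X, landmark_idx, gamma=0.5):
    """Approximate the n x n kernel matrix of X as C @ pinv(W) @ C.T,
    where C is n x m (all points vs. landmarks) and W is m x m
    (landmarks vs. landmarks)."""
    landmarks = X[landmark_idx]
    C = rbf_kernel(X, landmarks, gamma)
    W = rbf_kernel(landmarks, landmarks, gamma)
    return C @ np.linalg.pinv(W) @ C.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))        # hypothetical encodings
    K_hat = nystrom(X, np.arange(20))    # 20 landmarks instead of 200 rows
    K = rbf_kernel(X, X)
    print("relative error:", np.linalg.norm(K - K_hat) / np.linalg.norm(K))
```

Only the n x m and m x m blocks are ever computed from the data, which is where the savings over forming the full n x n matrix come from.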

arXiv information

Authors	Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum
Published	2023-03-27 15:18:32+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
