Faster Approx. Top-K: Harnessing the Full Power of Two Stages

Summary

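We consider the Top-$K$ selection problem, which aims to identify the $K$ largest elements of an array.
Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for dense matrix multiplications.
To address this problem, \citet{chern2022tpuknnknearestneighbor} proposed a fast two-stage \textit{approximate} Top-$K$ algorithm: (i) partition the input array and select the top-$1$ element from each partition, then (ii) sort this smaller subset and return the top $K$ elements.
In this paper, we consider a generalized version of this algorithm in which the first stage selects the top-$K'$ elements from each partition, for some $1 \leq K' \leq K$.
Our contributions are as follows: (i) we derive an expression for the expected recall of this generalized algorithm and show that choosing $K' > 1$ with fewer partitions in the first stage reduces the input size to the second stage more effectively while maintaining the same expected recall as the original algorithm,
(ii) we derive a bound on the expected recall of the original algorithm in \citet{chern2022tpuknnknearestneighbor} that is provably tighter by a factor of $2$ than the one in that paper, and (iii) we implement our algorithm on Cloud TPUv5e and achieve roughly an order-of-magnitude speedup over the original algorithm without sacrificing recall on real-world tasks.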

Summary (original)

We consider the Top-$K$ selection problem, which aims to identify the largest-$K$ elements from an array. Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for dense matrix multiplications. To address this problem, \citet{chern2022tpuknnknearestneighbor} proposed a fast two-stage \textit{approximate} Top-$K$ algorithm: (i) partition the input array and select the top-$1$ element from each partition, (ii) sort this \textit{smaller subset} and return the top $K$ elements. In this paper, we consider a generalized version of this algorithm, where the first stage selects top-$K'$ elements, for some $1 \leq K' \leq K$, from each partition. Our contributions are as follows: (i) we derive an expression for the expected recall of this generalized algorithm and show that choosing $K' > 1$ with fewer partitions in the first stage reduces the input size to the second stage more effectively while maintaining the same expected recall as the original algorithm, (ii) we derive a bound on the expected recall for the original algorithm in \citet{chern2022tpuknnknearestneighbor} that is provably tighter by a factor of $2$ than the one in that paper, and (iii) we implement our algorithm on Cloud TPUv5e and achieve around an order of magnitude speedups over the original algorithm without sacrificing recall on real-world tasks.
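
The two-stage scheme described in the abstract can be sketched in plain Python. This is only an illustrative CPU sketch, not the authors' TPU implementation: the function names (`two_stage_approx_top_k`, `mean_recall`), the strided partitioning, and the Monte Carlo recall estimate are all assumptions made here for illustration. The abstract does not specify how partitions are formed or give the recall expression.

```python
import heapq
import random


def two_stage_approx_top_k(values, k, num_partitions, k_prime=1):
    """Approximate top-k of `values` (hypothetical helper, not from the paper).

    Stage 1 keeps the top-k_prime elements of each partition.
    Stage 2 takes the exact top-k of the surviving candidates.
    k_prime=1 corresponds to the original two-stage algorithm.
    """
    if not 1 <= k_prime <= k:
        raise ValueError("expected 1 <= k_prime <= k")
    if num_partitions * k_prime < k:
        raise ValueError("stage 1 must keep at least k candidates")
    candidates = []
    for p in range(num_partitions):
        # Assumption: strided partitioning; the paper may partition differently.
        partition = values[p::num_partitions]
        candidates.extend(heapq.nlargest(k_prime, partition))
    # Stage 2 only sees num_partitions * k_prime elements (or fewer).
    return heapq.nlargest(k, candidates)


def mean_recall(n, k, num_partitions, k_prime, trials=50, seed=0):
    """Monte Carlo estimate of expected recall on random distinct inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        values = rng.sample(range(10 * n), n)  # distinct values, random order
        exact = set(heapq.nlargest(k, values))
        approx = set(two_stage_approx_top_k(values, k, num_partitions, k_prime))
        total += len(exact & approx) / k
    return total / trials


if __name__ == "__main__":
    # Original setting (top-1 per partition) vs. a generalized setting with
    # fewer partitions and k_prime > 1; the second feeds half as many
    # candidates (128 * 4 = 512 vs. 1024) into stage 2.
    print(mean_recall(n=16384, k=32, num_partitions=1024, k_prime=1))
    print(mean_recall(n=16384, k=32, num_partitions=128, k_prime=4))
```

Setting `k_prime = k` makes the result exact regardless of the number of partitions, because any element of the global top-$K$ has fewer than $K$ larger elements in its own partition. Smaller values of `k_prime` trade recall for a smaller stage-2 input, which is the trade-off the paper analyzes.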

arXiv information

Authors	Yashas Samaga, Varun Yerram, Spandana Raj Babbula, Prateek Jain, Praneeth Netrapalli
Published	2025-06-04 17:04:09+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
