OneBatchPAM: A Fast and Frugal K-Medoids Algorithm

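Summary

This paper proposes a new k-medoids approximation algorithm for handling large-scale datasets with reasonable computation time and memory complexity.
We develop a local-search algorithm that iteratively improves the medoid selection based on an estimate of the k-medoids objective.
A single batch of size m << n provides the estimate, which reduces the required memory and the number of pairwise dissimilarity computations to O(mn), rather than the O(n^2) of most k-medoids baselines. We obtain theoretical results showing that a batch of size m = O(log(n)) is sufficient to guarantee, with high probability, the same performance as the original local-search algorithm. Multiple experiments on real datasets of various sizes and dimensions show that our algorithm delivers performance similar to state-of-the-art methods such as FasterPAM and BanditPAM++ with a drastically reduced running time.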

Summary (original)

This paper proposes a novel k-medoids approximation algorithm to handle large-scale datasets with reasonable computational time and memory complexity. We develop a local-search algorithm that iteratively improves the medoid selection based on the estimation of the k-medoids objective. A single batch of size m << n provides the estimation, which reduces the required memory size and the number of pairwise dissimilarities computations to O(mn), instead of O(n^2) compared to most k-medoids baselines. We obtain theoretical results highlighting that a batch of size m = O(log(n)) is sufficient to guarantee, with strong probability, the same performance as the original local-search algorithm. Multiple experiments conducted on real datasets of various sizes and dimensions show that our algorithm provides similar performances as state-of-the-art methods such as FasterPAM and BanditPAM++ with a drastically reduced running time.
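To make the single-batch idea concrete, here is a minimal Python sketch of a swap-based local search whose objective is estimated on one fixed batch. It is not the authors' implementation: the function name `one_batch_local_search`, the uniform batch sampling, the Euclidean dissimilarity, the random initialization, and the naive steepest-swap sweep are all assumptions made for illustration. The only property it is meant to show is that the search stores an n x m dissimilarity matrix and never an n x n one.

```python
import numpy as np


def one_batch_local_search(X, k, m, max_sweeps=50, seed=0):
    """Swap-based k-medoids local search scored on a single batch (illustrative)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Draw ONE batch of m reference points (uniform sampling is an assumption)
    # and reuse it for the whole search.
    batch = rng.choice(n, size=m, replace=False)

    # n x m dissimilarity matrix (Euclidean here): O(mn) memory, never n x n.
    D = np.stack([np.linalg.norm(X - X[j], axis=1) for j in batch], axis=1)

    # Random initial medoids (hypothetical choice, any initialization works).
    medoids = rng.choice(n, size=k, replace=False)
    cost = D[medoids].min(axis=0).mean()  # batch estimate of the objective

    for _ in range(max_sweeps):
        improved = False
        for i in range(k):
            others = np.delete(medoids, i)
            # Distance from each batch point to its nearest *other* medoid.
            rest = D[others].min(axis=0) if k > 1 else np.full(m, np.inf)
            # Estimated objective if medoid i were replaced by each candidate.
            cand_cost = np.minimum(D, rest).mean(axis=1)
            best = int(np.argmin(cand_cost))
            if cand_cost[best] < cost - 1e-12:
                medoids[i] = best
                cost = cand_cost[best]
                improved = True
        if not improved:  # no swap lowers the estimate: local optimum
            break
    return medoids, cost
```

Calling `one_batch_local_search(X, k=3, m=30)` on an `(n, d)` array returns the indices of the selected medoids and the batch estimate of the objective. Each sweep of this naive version costs O(kmn) time; the point of the sketch is the memory footprint, not the swap-evaluation speed.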

arXiv information

Authors	Antoine de Mathelin,Nicolas Enrique Cecchi,François Deheeger,Mathilde Mougeot,Nicolas Vayatis
Published	2025-01-31 16:48:16+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
