Deep Clustering Using the Soft Silhouette Score: Towards Compact and Well-Separated Clusters

Summary

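Unsupervised learning has gained prominence in the big data era, providing a means of extracting valuable insights from unlabeled datasets.
Deep clustering has emerged as an important category of unsupervised learning; it aims to exploit the non-linear mapping capabilities of neural networks to improve clustering performance.
Most of the deep clustering literature focuses on minimizing the within-cluster variability in some embedded space while keeping the learned representation consistent with the original high-dimensional dataset.
This work proposes soft silhouette, a probabilistic formulation of the silhouette coefficient.
Like the conventional silhouette coefficient, soft silhouette rewards clustering solutions that are compact and distinctly separated.
When optimized within a deep clustering framework, soft silhouette guides the learned representations towards forming compact and well-separated clusters.
In addition, the authors introduce an autoencoder-based deep learning architecture suited to optimizing the soft silhouette objective function.
The proposed deep clustering method was tested on various benchmark datasets and compared with several well-studied deep clustering methods, yielding very satisfactory clustering results.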

Summary (original)

Unsupervised learning has gained prominence in the big data era, offering a means to extract valuable insights from unlabeled datasets. Deep clustering has emerged as an important unsupervised category, aiming to exploit the non-linear mapping capabilities of neural networks in order to enhance clustering performance. The majority of deep clustering literature focuses on minimizing the inner-cluster variability in some embedded space while keeping the learned representation consistent with the original high-dimensional dataset. In this work, we propose soft silhouette, a probabilistic formulation of the silhouette coefficient. Soft silhouette rewards compact and distinctly separated clustering solutions like the conventional silhouette coefficient. When optimized within a deep clustering framework, soft silhouette guides the learned representations towards forming compact and well-separated clusters. In addition, we introduce an autoencoder-based deep learning architecture that is suitable for optimizing the soft silhouette objective function. The proposed deep clustering method has been tested and compared with several well-studied deep clustering methods on various benchmark datasets, yielding very satisfactory clustering results.
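
The abstract does not state the soft silhouette formula, so the following is only a minimal sketch of one plausible reading: the usual silhouette terms a (mean distance to the point's own cluster) and b (smallest mean distance to another cluster) are computed with membership probabilities as weights, and each point's score is the probability-weighted sum over clusters. The function name `soft_silhouette`, its arguments, and the weighting scheme are assumptions made for illustration, not the authors' definition. With hard (0/1) memberships the sketch reduces to the conventional silhouette coefficient.

```python
import math


def soft_silhouette(points, probs):
    """Hypothetical soft silhouette (an assumption, not the paper's formula).

    points: list of coordinate tuples.
    probs:  probs[i][c] is the probability that point i belongs to cluster c;
            each row should sum to 1 and there must be at least 2 clusters.
    Returns the mean per-point score, in the range [-1, 1].
    """
    n = len(points)
    k = len(probs[0])
    total = 0.0
    for i in range(n):
        # Probability-weighted mean distance from point i to every cluster,
        # excluding point i itself.
        mean_dist = []
        for c in range(k):
            num = sum(probs[j][c] * math.dist(points[i], points[j])
                      for j in range(n) if j != i)
            den = sum(probs[j][c] for j in range(n) if j != i)
            mean_dist.append(num / den if den > 0 else 0.0)

        score_i = 0.0
        for c in range(k):
            a = mean_dist[c]  # cohesion: distance to cluster c
            b = min(mean_dist[m] for m in range(k) if m != c)  # separation
            largest = max(a, b)
            s = (b - a) / largest if largest > 0 else 0.0
            # Weight the silhouette "as if in cluster c" by the membership.
            score_i += probs[i][c] * s
        total += score_i
    return total / n


# Usage: two tight, well-separated groups with confident memberships.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(soft_silhouette(points, probs))
```

In a deep clustering setting the same quantity would be computed on the embedded points with a differentiable tensor library and maximized by gradient ascent; this plain-Python version only illustrates the scoring.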

arXiv information

Authors	Georgios Vardakas, Ioannis Papakostas, Aristidis Likas
Published	2024-02-01 14:02:06+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
