Maximum Likelihood Estimation on Stochastic Blockmodels for Directed Graph Clustering

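Summary

In this paper, we study the directed graph clustering problem through the lens of statistics, formulating clustering as the estimation of the communities underlying a directed stochastic block model (DSBM).
We perform maximum likelihood estimation (MLE) on the DSBM to identify the most probable community assignment given the observed graph structure.
Beyond the statistical perspective, we further establish an equivalence between this MLE formulation and a novel flow optimization heuristic that jointly considers two important directed graph statistics: edge density and edge orientation.
Building on this new formulation of directed clustering, we introduce two efficient and interpretable directed clustering algorithms: a spectral clustering algorithm and a semidefinite programming based clustering algorithm.
Using tools from matrix perturbation theory, we provide a theoretical upper bound on the number of misclustered vertices for the spectral clustering algorithm.
We compare the proposed algorithms with existing directed clustering methods, both quantitatively and qualitatively, on synthetic and real-world data, which lends further support to our theoretical contributions.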

Summary (original)

This paper studies the directed graph clustering problem through the lens of statistics, where we formulate clustering as estimating underlying communities in the directed stochastic block model (DSBM). We conduct the maximum likelihood estimation (MLE) on the DSBM and thereby ascertain the most probable community assignment given the observed graph structure. In addition to the statistical point of view, we further establish the equivalence between this MLE formulation and a novel flow optimization heuristic, which jointly considers two important directed graph statistics: edge density and edge orientation. Building on this new formulation of directed clustering, we introduce two efficient and interpretable directed clustering algorithms, a spectral clustering algorithm and a semidefinite programming based clustering algorithm. We provide a theoretical upper bound on the number of misclustered vertices of the spectral clustering algorithm using tools from matrix perturbation theory. We compare, both quantitatively and qualitatively, our proposed algorithms with existing directed clustering methods on both synthetic and real-world data, thus providing further ground to our theoretical contributions.
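To make the two statistics named in the abstract concrete, the following is a minimal sketch of a two-community directed stochastic block model and of the edge density and net flow between the communities. It is not the authors' code or their algorithm. The parameterization is an assumption for illustration only: `p` and `q` are hypothetical within- and cross-community edge probabilities, and `eta` is a hypothetical probability that a cross-community edge points from the lower-labelled to the higher-labelled community. The function names `sample_dsbm` and `edge_statistics` are likewise invented here.

```python
import numpy as np


def sample_dsbm(labels, p, q, eta, rng):
    """Sample a directed adjacency matrix A, where A[i, j] = 1 means i -> j.

    Assumed model (illustrative, not taken from the paper):
    each unordered pair gets at most one directed edge, present with
    probability p (same community) or q (different communities).
    """
    labels = np.asarray(labels)
    n = labels.size
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if rng.random() >= (p if same else q):
                continue  # no edge between i and j
            if same:
                # Within a community the direction is a fair coin flip.
                forward = rng.random() < 0.5
            else:
                # Across communities, favour lower label -> higher label.
                low_to_high = rng.random() < eta
                forward = low_to_high if labels[i] < labels[j] else not low_to_high
            if forward:
                A[i, j] = 1
            else:
                A[j, i] = 1
    return A


def edge_statistics(A, labels):
    """Return (cross-community edge density, net flow from community 0 to 1)."""
    labels = np.asarray(labels)
    in0, in1 = labels == 0, labels == 1
    flow_01 = A[np.ix_(in0, in1)].sum()  # edges pointing 0 -> 1
    flow_10 = A[np.ix_(in1, in0)].sum()  # edges pointing 1 -> 0
    cross_density = (flow_01 + flow_10) / (in0.sum() * in1.sum())
    net_flow = flow_01 - flow_10
    return cross_density, net_flow


# Example with equal densities (p == q): the communities differ only in
# edge orientation, so density alone cannot separate them.
labels = np.repeat([0, 1], 50)
rng = np.random.default_rng(0)
A = sample_dsbm(labels, p=0.3, q=0.3, eta=0.9, rng=rng)
cross_density, net_flow = edge_statistics(A, labels)
print(cross_density, net_flow)
```

With `p` equal to `q`, the cross-community density carries no signal and only the net flow distinguishes the two groups, which is one way to see why a formulation that considers density and orientation jointly can be useful.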

arXiv information

Authors	Mihai Cucuringu, Xiaowen Dong, Ning Zhang
Published	2024-03-28 15:47:13+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
