Tackling Provably Hard Representative Selection via Graph Neural Networks

Summary

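Representative selection (RS) is the problem of finding a small subset of samples that represents a dataset.
This paper studies RS on attributed graphs, focusing on finding representative nodes that optimize the accuracy of a model trained on the selected representatives.
On the theory side, it establishes a new hardness result for RS in the absence of a graph structure: a particular, highly practical variant (RS for Learning) is proved hard to approximate in polynomial time within any reasonable factor. This implies a potentially significant gap between the optimal solution of widely used surrogate functions and the model's actual accuracy.
The paper then studies the setting where a (homophilous) graph structure between the data points is available or can be constructed, and shows that, with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be solved effectively.
To this end, the authors develop RS-GNN, a representation-learning-based RS model built on graph neural networks.
Empirically, RS-GNN is shown to be effective both on problems with a predefined graph structure and on problems where the graph is induced from node feature similarities, achieving significant improvements over established baselines on a suite of eight benchmarks.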

Summary (original)

Representative Selection (RS) is the problem of finding a small subset of exemplars from a dataset that is representative of the dataset. In this paper, we study RS for attributed graphs, and focus on finding representative nodes that optimize the accuracy of a model trained on the selected representatives. Theoretically, we establish a new hardness result for RS (in the absence of a graph structure) by proving that a particular, highly practical variant of it (RS for Learning) is hard to approximate in polynomial time within any reasonable factor, which implies a significant potential gap between the optimum solution of widely-used surrogate functions and the actual accuracy of the model. We then study the setting where a (homophilous) graph structure is available, or can be constructed, between the data points. We show that with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be effectively solved. To this end, we develop RS-GNN, a representation learning-based RS model based on Graph Neural Networks. Empirically, we demonstrate the effectiveness of RS-GNN on problems with predefined graph structures as well as problems with graphs induced from node feature similarities, by showing that RS-GNN achieves significant improvements over established baselines on a suite of eight benchmarks.
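
To make the notion of a surrogate function concrete, here is a minimal sketch of one common surrogate-style selector: greedy k-center (farthest-first traversal), which picks representatives so that every point lies close to some selected one. This is an illustrative assumption, not the paper's method and not RS-GNN; the function name `greedy_k_center`, the use of Euclidean distance, and the `first` parameter are hypothetical choices made for the example. The abstract's hardness result suggests that a good value of a coverage objective like this one need not translate into good accuracy for a model trained on the selected points.

```python
import numpy as np


def greedy_k_center(features, k, first=0):
    """Pick k row indices by farthest-first traversal (illustrative surrogate).

    Returns the selected indices and the covering radius, i.e. the largest
    distance from any point to its nearest selected representative.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    selected = [first]
    # Distance from every point to its nearest selected representative.
    min_dist = np.linalg.norm(features - features[first], axis=1)

    while len(selected) < k:
        # The point that is currently covered worst becomes a representative.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)

    return selected, float(min_dist.max())
```

Calling `greedy_k_center(X, k)` on a feature matrix `X` with one row per data point returns the chosen row indices together with the covering radius, which is the quantity this particular surrogate tries to keep small.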

arXiv information

Authors	Mehran Kazemi, Anton Tsitsulin, Hossein Esfandiari, MohammadHossein Bateni, Deepak Ramachandran, Bryan Perozzi, Vahab Mirrokni
Published	2023-07-19 14:23:17+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
