Modality-Independent Graph Neural Networks with Global Transformers for Multimodal Recommendation

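Summary

Multimodal recommendation systems can learn user preferences from existing user-item interactions as well as from the semantics of the multimodal data associated with items.
Many existing methods model this with a multimodal user-item graph and treat multimodal recommendation as a graph learning task.
Graph Neural Networks (GNNs) have shown promising performance in this area.
Prior work has relied on the ability of GNNs to capture neighborhood information within a given receptive field (usually expressed as the number of hops, $K$) to enrich user and item semantics.
The authors observe that the optimal receptive field of a GNN can differ from one modality to another.
The paper proposes GNNs with Modality-Independent Receptive Fields: a separate GNN, with its own receptive field, is used for each modality to improve performance.
The results indicate that the optimal $K$ for certain modalities on specific datasets can be as low as 1 or 2, which may limit the GNNs' ability to capture global information.
To address this, the paper introduces a Sampling-based Global Transformer, which uses uniform global sampling to integrate global information into the GNNs effectively.
Comprehensive experiments are reported to demonstrate the superiority of the approach over existing methods.
The code is publicly available at https://github.com/CrawlScript/MIG-GT.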

Abstract (original)

Multimodal recommendation systems can learn users’ preferences from existing user-item interactions as well as the semantics of multimodal data associated with items. Many existing methods model this through a multimodal user-item graph, approaching multimodal recommendation as a graph learning task. Graph Neural Networks (GNNs) have shown promising performance in this domain. Prior research has capitalized on GNNs’ capability to capture neighborhood information within certain receptive fields (typically denoted by the number of hops, $K$) to enrich user and item semantics. We observe that the optimal receptive fields for GNNs can vary across different modalities. In this paper, we propose GNNs with Modality-Independent Receptive Fields, which employ separate GNNs with independent receptive fields for different modalities to enhance performance. Our results indicate that the optimal $K$ for certain modalities on specific datasets can be as low as 1 or 2, which may restrict the GNNs’ capacity to capture global information. To address this, we introduce a Sampling-based Global Transformer, which utilizes uniform global sampling to effectively integrate global information for GNNs. We conduct comprehensive experiments that demonstrate the superiority of our approach over existing methods. Our code is publicly available at https://github.com/CrawlScript/MIG-GT.
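
The two ideas in the abstract, a separate hop count $K$ per modality and attention over uniformly sampled nodes, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the LightGCN-style layer averaging, the mean fusion of modalities, the residual combination, the random stand-in features, and the hop values are all assumptions made for illustration only.

```python
import numpy as np

def normalized_adjacency(num_users, num_items, interactions):
    """Symmetric-normalized adjacency of the bipartite user-item graph."""
    n = num_users + num_items
    adj = np.zeros((n, n))
    for u, i in interactions:
        adj[u, num_users + i] = 1.0
        adj[num_users + i, u] = 1.0
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def propagate(adj, x, k):
    """Average of the 0..k hop representations (assumed LightGCN-style rule)."""
    out, h = x.copy(), x
    for _ in range(k):
        h = adj @ h
        out = out + h
    return out / (k + 1)

def sampled_global_attention(x, num_samples, rng):
    """Each node attends to itself plus a uniform random sample of all nodes."""
    n, d = x.shape
    idx = rng.integers(0, n, size=(n, num_samples))         # uniform global sampling
    keys = np.concatenate([x[:, None, :], x[idx]], axis=1)  # shape (n, 1 + S, d)
    scores = np.einsum("nd,nsd->ns", x, keys) / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)             # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("ns,nsd->nd", weights, keys)

rng = np.random.default_rng(0)
num_users, num_items, dim = 4, 5, 8
interactions = [(0, 0), (0, 1), (1, 1), (2, 2), (3, 3), (3, 4)]  # toy data
adj = normalized_adjacency(num_users, num_items, interactions)
n = num_users + num_items

# Random stand-ins for per-modality node features (hypothetical).
features = {m: rng.normal(size=(n, dim)) for m in ("visual", "textual")}
hops = {"visual": 1, "textual": 3}  # hypothetical per-modality K

# One propagation per modality, each with its own receptive field.
local = {m: propagate(adj, features[m], hops[m]) for m in features}
fused = sum(local.values()) / len(local)            # assumed fusion: simple mean
global_ctx = sampled_global_attention(fused, num_samples=3, rng=rng)
final = fused + global_ctx                          # assumed residual combination
```

Because each node only scores itself and a fixed number of sampled nodes, the attention cost in this sketch grows linearly with the number of nodes rather than quadratically, which is the usual motivation for sampling instead of full global attention.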

arXiv information

Authors	Jun Hu, Bryan Hooi, Bingsheng He, Yinwei Wei
Published	2024-12-18 16:12:26+00:00

Source and services used

arxiv.jp, Google
