GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models

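Abstract

Multi-modal models have successfully integrated information from the image, video, and audio modalities, but integrating the graph modality into large language models (LLMs) remains unexplored.
This discrepancy stems largely from the inherent divergence between structured graph data and unstructured text data.
Incorporating graph knowledge provides a reliable source of information and opens up potential solutions to text generation problems such as hallucination and a lack of domain knowledge.
Evaluating how well graph knowledge is integrated into language models requires a dedicated dataset.
At present, however, no benchmark dataset is designed specifically for multimodal graph-language models.
To address this gap, the authors propose GraphextQA, a question answering dataset with paired subgraphs retrieved from Wikidata, intended to facilitate the evaluation and future development of graph-language models.
They also introduce a baseline model called CrossGNN, which conditions answer generation on the paired graphs by cross-attending to question-aware graph features at decoding time.
The proposed dataset is designed to evaluate the ability of graph-language models to understand graphs and to use them for answer generation.
Experiments with language-only models and the proposed graph-language model are run to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.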

Abstract (original)

While multi-modal models have successfully integrated information from image, video, and audio modalities, integrating graph modality into large language models (LLMs) remains unexplored. This discrepancy largely stems from the inherent divergence between structured graph data and unstructured text data. Incorporating graph knowledge provides a reliable source of information, enabling potential solutions to address issues in text generation, e.g., hallucination, and lack of domain knowledge. To evaluate the integration of graph knowledge into language models, a dedicated dataset is needed. However, there is currently no benchmark dataset specifically designed for multimodal graph-language models. To address this gap, we propose GraphextQA, a question answering dataset with paired subgraphs, retrieved from Wikidata, to facilitate the evaluation and future development of graph-language models. Additionally, we introduce a baseline model called CrossGNN, which conditions answer generation on the paired graphs by cross-attending question-aware graph features at decoding. The proposed dataset is designed to evaluate graph-language models’ ability to understand graphs and make use of it for answer generation. We perform experiments with language-only models and the proposed graph-language model to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
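The abstract describes CrossGNN as cross-attending to question-aware graph features at decoding. The sketch below illustrates only the general idea of cross-attention over graph node features, using single-head scaled dot-product attention in plain Python. It is not the authors' implementation: the function names, the toy vectors, and the absence of learned projections are all assumptions made for illustration, and the node features are taken as given rather than computed by a graph neural network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(decoder_state, node_features):
    """Single-head scaled dot-product attention (illustrative only).

    decoder_state: list[float], the query for one decoding step.
    node_features: list[list[float]], one vector per graph node,
        used as both keys and values (no learned projections here).
    Returns (context_vector, attention_weights).
    """
    dim = len(decoder_state)
    # One similarity score per node, scaled by sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(decoder_state, node)) / math.sqrt(dim)
        for node in node_features
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the node vectors.
    context = [
        sum(w * node[i] for w, node in zip(weights, node_features))
        for i in range(dim)
    ]
    return context, weights


# Hypothetical toy inputs: three graph nodes with 2-d features.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
context, weights = cross_attend(state, nodes)
```

In this toy case the first and third nodes align equally with the decoder state, so they receive equal weight and the second node receives less. A real model would typically apply learned query, key, and value projections and use multiple heads.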

arXiv information

Authors	Yuanchun Shen, Ruotong Liao, Zhen Han, Yunpu Ma, Volker Tresp
Published	2023-10-12 16:46:58+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
