CodeKGC: Code Language Model for Generative Knowledge Graph Construction

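Summary

Current generative knowledge graph construction approaches usually fail to capture structural knowledge, because they simply flatten natural language into serialized text or a specification language.
Large generative language models trained on structured data such as code, however, have shown a strong ability to understand natural language in structural prediction and reasoning tasks.
Building on this, the authors tackle generative knowledge graph construction with a code language model: given natural language input in code format, the goal is to generate triples, which can be framed as a code completion task.
Specifically, they develop schema-aware prompts that make effective use of the semantic structure within the knowledge graph.
Because code inherently carries structure, such as class and function definitions, it serves as a useful model of prior semantic structural knowledge.
They also employ a rationale-enhanced generation method to improve performance.
Rationales supply intermediate steps, which improves the model's knowledge extraction ability.
Experimental results indicate that the proposed approach performs better than the baselines on benchmark datasets.
Code and datasets are available at https://github.com/zjunlp/DeepKE/tree/main/example/llm.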
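
To make the idea of a schema-aware, code-format prompt more concrete, here is a minimal sketch in Python. It is not taken from the authors' repository: the class names (`Entity`, `Person`, `Organization`, `Rel`, `Triple`), the helpers `build_prompt` and `parse_triples`, the example sentence, and the choice to render the rationale as a comment are all assumptions made for illustration. No model is called; the `completion` string stands in for what a code language model might return.

```python
import ast
from typing import List, Tuple

# Hypothetical schema written as class definitions, so that a code model
# sees entity and relation types as ordinary Python structure.
SCHEMA_PROMPT = '''\
class Entity:
    def __init__(self, name: str):
        self.name = name

class Person(Entity):
    pass

class Organization(Entity):
    pass

class Rel:
    def __init__(self, name: str):
        self.name = name

class Triple:
    def __init__(self, head: Entity, rel: Rel, tail: Entity):
        self.head, self.rel, self.tail = head, rel, tail
'''


def build_prompt(text: str, rationale: str = "") -> str:
    """Assemble a code-completion prompt: schema, input text, optional rationale."""
    parts = [SCHEMA_PROMPT, f'""" {text} """']
    if rationale:
        # Assumption: the intermediate reasoning step is passed as a comment.
        parts.append(f"# Rationale: {rationale}")
    # The prompt ends on an open list, so the model completes the triples.
    parts.append("triples = [")
    return "\n".join(parts)


def parse_triples(completion: str) -> List[Tuple[str, str, str]]:
    """Read Triple(...) calls out of a completion that continues 'triples = ['."""
    tree = ast.parse("triples = [" + completion)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "Triple":
            # Each argument is itself a call such as Person("Alice").
            head, rel, tail = (arg.args[0].value for arg in node.args)
            found.append((head, rel, tail))
    return found


prompt = build_prompt(
    "Alice works for Acme.",
    rationale="'works for' links a person to an organization.",
)
# Stand-in for the model's completion of the prompt above.
completion = 'Triple(Person("Alice"), Rel("work for"), Organization("Acme"))]'
print(parse_triples(completion))
```

Parsing the completion with `ast` rather than executing it keeps the sketch safe to run on untrusted model output; a real pipeline would also need to handle completions that are not valid Python.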

Abstract (original)

Current generative knowledge graph construction approaches usually fail to capture structural knowledge by simply flattening natural language into serialized texts or a specification language. However, large generative language model trained on structured data such as code has demonstrated impressive capability in understanding natural language for structural prediction and reasoning tasks. Intuitively, we address the task of generative knowledge graph construction with code language model: given a code-format natural language input, the target is to generate triples which can be represented as code completion tasks. Specifically, we develop schema-aware prompts that effectively utilize the semantic structure within the knowledge graph. As code inherently possesses structure, such as class and function definitions, it serves as a useful model for prior semantic structural knowledge. Furthermore, we employ a rationale-enhanced generation method to boost the performance. Rationales provide intermediate steps, thereby improving knowledge extraction abilities. Experimental results indicate that the proposed approach can obtain better performance on benchmark datasets compared with baselines. Code and datasets are available in https://github.com/zjunlp/DeepKE/tree/main/example/llm.

arXiv information

Authors	Zhen Bi, Jing Chen, Yinuo Jiang, Feiyu Xiong, Wei Guo, Huajun Chen, Ningyu Zhang
Published	2024-01-18 16:14:35+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
