CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

Summary

Title: CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

Summary:
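– Large language models (LLMs) pre-trained on massive corpora show impressive few-shot learning ability on many NLP tasks.
– A common practice is to recast a task into a text-to-text format so that generative LLMs of natural language (NL-LLMs) such as GPT-3 can be prompted to solve it.
– However, performing information extraction (IE) tasks with NL-LLMs is nontrivial, because IE output is usually structured and hard to convert into plain text. The paper therefore proposes recasting the structured output as code rather than natural language, and using generative LLMs of code (Code-LLMs) such as Codex for IE tasks, in particular named entity recognition and relation extraction.
– With code-style prompts, these IE tasks can be formulated as code generation tasks, which aligns them well with Code-LLMs.
– Experiments on seven benchmarks show that, under few-shot settings, the method consistently outperforms both fine-tuned moderate-size pre-trained models designed specifically for IE (e.g., UIE) and prompted NL-LLMs.
– The authors also conduct a series of in-depth analyses to demonstrate the merits of using Code-LLMs for IE tasks.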
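
To make the idea of a code-style prompt concrete, here is a minimal sketch of how named entity recognition could be framed as code generation. The function name `named_entity_recognition`, the `entity_list` variable, the `text`/`type` dictionary keys, and both helpers are illustrative assumptions rather than the paper's confirmed prompt format. The call to a Code-LLM is left out on purpose: the sketch only builds the prompt and parses a completion string.

```python
import ast
import re


def render_example(text, entities=None):
    """Render one example as an (optionally unfinished) Python function body."""
    lines = [
        "def named_entity_recognition(input_text):",
        '    """Extract named entities from input_text."""',
        f"    input_text = {text!r}",
        "    entity_list = []",
        "    # extracted named entities",
    ]
    # Demonstrations include the answer lines; the query leaves them out
    # so that the code model has to continue the function.
    for span, label in entities or []:
        entity = {"text": span, "type": label}
        lines.append(f"    entity_list.append({entity!r})")
    return "\n".join(lines)


def build_ner_prompt(demos, query_text):
    """Concatenate few-shot demonstrations and one unfinished query function."""
    blocks = [render_example(text, entities) for text, entities in demos]
    blocks.append(render_example(query_text))
    return "\n\n".join(blocks)


# Matches lines such as: entity_list.append({'text': 'Tim', 'type': 'person'})
APPEND_RE = re.compile(r"entity_list\.append\((\{.*?\})\)")


def parse_entities(completion):
    """Recover (text, type) pairs from the code the model generated."""
    entities = []
    for match in APPEND_RE.finditer(completion):
        try:
            item = ast.literal_eval(match.group(1))
        except (ValueError, SyntaxError):
            continue  # skip malformed lines instead of failing
        if isinstance(item, dict) and {"text", "type"} <= item.keys():
            entities.append((item["text"], item["type"]))
    return entities


if __name__ == "__main__":
    demos = [
        ("Steve became CEO of Apple in 1998.",
         [("Steve", "person"), ("Apple", "organization")]),
    ]
    print(build_ner_prompt(demos, "Tim works at Google."))
```

Because the structured output is a sequence of `append` calls, reading the result back is a matter of parsing Python literals, with no need to map free-form sentences onto entity spans. Relation extraction could presumably be sketched the same way, with each appended dictionary holding a relation type and its two arguments.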

Abstract (original)

Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it. However, it is nontrivial to perform information extraction (IE) tasks with NL-LLMs since the output of the IE task is usually structured and therefore is hard to be converted into plain text. In this paper, we propose to recast the structured output in the form of code instead of natural language and utilize generative LLMs of code (Code-LLMs) such as Codex to perform IE tasks, in particular, named entity recognition and relation extraction. In contrast to NL-LLMs, we show that Code-LLMs can be well-aligned with these IE tasks by designing code-style prompts and formulating these IE tasks as code generation tasks. Experiment results on seven benchmarks show that our method consistently outperforms fine-tuning moderate-size pre-trained models specially designed for IE tasks (e.g., UIE) and prompting NL-LLMs under few-shot settings. We further conduct a series of in-depth analyses to demonstrate the merits of leveraging Code-LLMs for IE tasks.

arXiv information

Authors	Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu
Published	2023-05-11 01:27:43+00:00

Source and services used

arxiv.jp, OpenAI
