Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge

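Summary

Multimodal Named Entity Recognition (MNER) on social media aims to improve the prediction of textual entities by incorporating image-based clues.
Existing work focuses mainly on making maximal use of relevant image information or on incorporating external knowledge from explicit knowledge bases.
However, these methods either neglect the need to provide the model with external knowledge or suffer from high redundancy in the retrieved knowledge.
This paper presents PGIM, a two-stage framework that uses ChatGPT as an implicit knowledge base and has it heuristically generate auxiliary knowledge for more efficient entity prediction.
Specifically, PGIM contains a Multimodal Similar Example Awareness module that selects suitable examples from a small number of predefined artificial samples.
These examples are integrated into a formatted prompt template tailored to MNER and guide ChatGPT to generate auxiliary refined knowledge.
Finally, the acquired knowledge is combined with the original text and fed into a downstream model for further processing.
Extensive experiments show that PGIM outperforms state-of-the-art methods on two classic MNER datasets and exhibits stronger robustness and generalization.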
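
The two stages described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the prompt wording, the `[SEP]` separator, the use of cosine similarity, and the `llm` callable (a stand-in for a ChatGPT call) are all hypothetical, and the feature vectors are assumed to come from some multimodal encoder that is not shown.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(query_vec: Sequence[float], pool: List[Dict], k: int = 2) -> List[Dict]:
    """Pick the k predefined samples most similar to the query.

    Assumption: each pool entry has 'vec', 'text', and 'knowledge' keys.
    """
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]


def build_prompt(examples: List[Dict], text: str) -> str:
    """Fill a (hypothetical) prompt template with the selected examples and the query text."""
    parts = ["Explain which named entities the text likely contains."]
    for ex in examples:
        parts.append(f"Text: {ex['text']}\nKnowledge: {ex['knowledge']}")
    parts.append(f"Text: {text}\nKnowledge:")
    return "\n\n".join(parts)


def augment_text(text: str, knowledge: str, sep: str = " [SEP] ") -> str:
    """Combine the original text with the generated knowledge for the downstream model."""
    return text + sep + knowledge


def generate_augmented_input(
    text: str,
    query_vec: Sequence[float],
    pool: List[Dict],
    llm: Callable[[str], str],
    k: int = 2,
) -> str:
    """Stage 1: select examples, prompt the LLM, and return the augmented input.

    Stage 2 (the downstream tagger that consumes this string) is not shown.
    """
    examples = select_examples(query_vec, pool, k)
    knowledge = llm(build_prompt(examples, text))
    return augment_text(text, knowledge)
```

In this sketch, `llm` can be any function that maps a prompt string to a completion, so the selection and prompt-building logic can be exercised with a stub before wiring in a real model client.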

Summary (original)

Multimodal Named Entity Recognition (MNER) on social media aims to enhance textual entity prediction by incorporating image-based clues. Existing studies mainly focus on maximizing the utilization of pertinent image information or incorporating external knowledge from explicit knowledge bases. However, these methods either neglect the necessity of providing the model with external knowledge, or encounter issues of high redundancy in the retrieved knowledge. In this paper, we present PGIM — a two-stage framework that aims to leverage ChatGPT as an implicit knowledge base and enable it to heuristically generate auxiliary knowledge for more efficient entity prediction. Specifically, PGIM contains a Multimodal Similar Example Awareness module that selects suitable examples from a small number of predefined artificial samples. These examples are then integrated into a formatted prompt template tailored to the MNER and guide ChatGPT to generate auxiliary refined knowledge. Finally, the acquired knowledge is integrated with the original text and fed into a downstream model for further processing. Extensive experiments show that PGIM outperforms state-of-the-art methods on two classic MNER datasets and exhibits a stronger robustness and generalization capability.

arXiv information

Authors	Jinyuan Li, Han Li, Zhuo Pan, Di Sun, Jiahao Wang, Wenkun Zhang, Gang Pan
Published	2023-10-18 17:05:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
