MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

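Summary

Large language models (LLMs) have demonstrated strong performance across a variety of natural language processing (NLP) tasks.
However, how well LLMs perform in specific domains, such as the intellectual property (IP) domain, is not well understood.
This paper contributes MoZIP, the first Multilingual-oriented quiZ on Intellectual Property, a new benchmark for evaluating LLMs in the IP domain.
The MoZIP benchmark comprises three challenging tasks: an IP multiple-choice quiz (IPQuiz), IP question answering (IPQA), and patent matching (PatentMatch).
The authors also develop MoZi, a new IP-oriented multilingual large language model: a BLOOMZ-based model fine-tuned with supervision on multilingual IP-related text data.
The proposed MoZi model and four well-known LLMs (BLOOMZ, BELLE, ChatGLM, and ChatGPT) are evaluated on the MoZIP benchmark.
The experimental results show that MoZi outperforms BLOOMZ, BELLE, and ChatGLM by a noticeable margin, although it scores lower than ChatGPT.
Notably, current LLMs leave much room for improvement on the MoZIP benchmark: even ChatGPT, the strongest model tested, does not reach the passing level.
The source code, data, and models are available at https://github.com/AI-for-Science/MoZi.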
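
As a rough illustration of how a multiple-choice task such as IPQuiz might be scored, the sketch below computes plain accuracy over a list of quiz items. It is not the authors' evaluation code: the summary does not state the metric, so accuracy is an assumption, and the names `QuizItem`, `accuracy`, and `predict` are hypothetical and do not come from the MoZi repository.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class QuizItem:
    """One multiple-choice question (hypothetical layout, not the MoZIP schema)."""
    question: str
    choices: Dict[str, str]  # option label -> option text, e.g. {"A": "..."}
    answer: str              # gold option label, e.g. "A"


def accuracy(items: Sequence[QuizItem],
             predict: Callable[[QuizItem], str]) -> float:
    """Return the fraction of items where the predicted label matches the gold label.

    `predict` stands in for a model call and must return an option label.
    Labels are compared after trimming whitespace and upper-casing.
    """
    if not items:
        raise ValueError("items must not be empty")
    correct = sum(
        1 for item in items
        if predict(item).strip().upper() == item.answer.strip().upper()
    )
    return correct / len(items)


if __name__ == "__main__":
    # Toy data for illustration only; these are not MoZIP questions.
    sample = [
        QuizItem("Toy question 1", {"A": "first", "B": "second"}, "A"),
        QuizItem("Toy question 2", {"A": "first", "B": "second"}, "B"),
    ]
    # A trivial baseline that always answers "A".
    print(accuracy(sample, lambda item: "A"))
```

Passing the model as a callable keeps the scoring logic independent of any particular LLM API, so the same function could in principle compare several models on the same items.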

Abstract (original)

Large language models (LLMs) have demonstrated impressive performance in various natural language processing (NLP) tasks. However, there is limited understanding of how well LLMs perform in specific domains (e.g., the intellectual property (IP) domain). In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain. The MoZIP benchmark includes three challenging tasks: IP multiple-choice quiz (IPQuiz), IP question answering (IPQA), and patent matching (PatentMatch). In addition, we also develop a new IP-oriented multilingual large language model (called MoZi), which is a BLOOMZ-based model that has been supervised fine-tuned with multilingual IP-related text data. We evaluate our proposed MoZi model and four well-known LLMs (i.e., BLOOMZ, BELLE, ChatGLM and ChatGPT) on the MoZIP benchmark. Experimental results demonstrate that MoZi outperforms BLOOMZ, BELLE and ChatGLM by a noticeable margin, while it had lower scores compared with ChatGPT. Notably, the performance of current LLMs on the MoZIP benchmark has much room for improvement, and even the most powerful ChatGPT does not reach the passing level. Our source code, data, and models are available at \url{https://github.com/AI-for-Science/MoZi}.

arXiv information

Authors	Shiwen Ni, Minghuan Tan, Yuelin Bai, Fuqiang Niu, Min Yang, Bowen Zhang, Ruifeng Xu, Xiaojun Chen, Chengming Li, Xiping Hu, Ye Li, Jianping Fan
Published	2024-02-26 08:27:50+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
