Large Language Model Informed Patent Image Retrieval

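Abstract

In patent prosecution, image-based retrieval systems that identify similarities between the images in a current application and prior art are essential for establishing the novelty and non-obviousness of the application.
Although such systems have grown in popularity in recent years, existing approaches, while effective at recognizing images from the same patent, deliver little practical value because they generalize poorly when retrieving relevant prior art.
The task is also inherently difficult because of the abstract visual features of patent images, the skewed distribution of image classifications, and the semantic information carried by image descriptions.
We therefore propose a language-informed, distribution-aware multimodal approach to patent image feature learning. It enriches the semantic understanding of patent images by integrating Large Language Models, and it improves performance on underrepresented classes through the proposed distribution-aware contrastive losses.
Extensive experiments on the DeepPatent2 dataset show that the proposed method achieves state-of-the-art or comparable performance in image-based patent retrieval, with gains of +53.3% in mAP, +41.8% in Recall@10, and +51.9% in MRR@10.
Finally, through an in-depth user analysis, we examine how the model assists patent professionals in their image retrieval work, highlighting its real-world applicability and effectiveness.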

Abstract (original)

In patent prosecution, image-based retrieval systems for identifying similarities between current patent images and prior art are pivotal to ensure the novelty and non-obviousness of patent applications. Despite their growing popularity in recent years, existing attempts, while effective at recognizing images within the same patent, fail to deliver practical value due to their limited generalizability in retrieving relevant prior art. Moreover, this task inherently involves the challenges posed by the abstract visual features of patent images, the skewed distribution of image classifications, and the semantic information of image descriptions. Therefore, we propose a language-informed, distribution-aware multimodal approach to patent image feature learning, which enriches the semantic understanding of patent images by integrating Large Language Models and improves the performance of underrepresented classes with our proposed distribution-aware contrastive losses. Extensive experiments on the DeepPatent2 dataset show that our proposed method achieves state-of-the-art or comparable performance in image-based patent retrieval with mAP +53.3%, Recall@10 +41.8%, and MRR@10 +51.9%. Furthermore, through an in-depth user analysis, we explore our model in aiding patent professionals in their image retrieval efforts, highlighting the model’s real-world applicability and effectiveness.
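
The abstract reports results as mAP, Recall@10, and MRR@10. As a rough illustration of what these retrieval metrics measure for a single query, here is a minimal Python sketch using common textbook definitions. The function names, the example patent IDs, and the exact metric variants are assumptions made for illustration; the paper may compute them differently. mAP and MRR are the means of the per-query values over all queries.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant items that appear among the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """1 / rank of the first relevant item in the top-k results, or 0 if none."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@rank taken at each rank where a relevant item occurs.

    Assumes the ranked list contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Hypothetical query: four retrieved patent images, two of which are relevant.
ranked_ids = ["p3", "p7", "p1", "p9"]
relevant_ids = {"p1", "p7"}

print(recall_at_k(ranked_ids, relevant_ids, k=10))
print(reciprocal_rank_at_k(ranked_ids, relevant_ids, k=10))
print(average_precision(ranked_ids, relevant_ids))
```

In this toy query the first relevant image sits at rank 2, so the reciprocal rank is 1/2, and both relevant images fall within the top 10, so Recall@10 is 1.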

arXiv information

Authors	Hao-Cheng Lo, Jung-Mei Chu, Jieh Hsiang, Chun-Chieh Cho
Published	2024-04-30 08:45:16+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
