PatFig: Generating Short and Long Captions for Patent Figures

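Summary

This paper introduces Qatent PatFig, a new large-scale patent figure dataset consisting of more than 30,000 patent figures from over 11,000 European patent applications.
For each figure, the dataset provides a short caption and a long caption, the reference numerals, their corresponding terms, and the minimal claim set that describes the interactions between the components of the image.
To assess the usability of the dataset, the authors fine-tune an LVLM on Qatent PatFig to generate short and long descriptions, and investigate the effect of incorporating various text-based cues at the prediction stage of the patent figure captioning process.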

Abstract (original)

This paper introduces Qatent PatFig, a novel large-scale patent figure dataset comprising 30,000+ patent figures from over 11,000 European patent applications. For each figure, this dataset provides short and long captions, reference numerals, their corresponding terms, and the minimal claim set that describes the interactions between the components of the image. To assess the usability of the dataset, we finetune an LVLM model on Qatent PatFig to generate short and long descriptions, and we investigate the effects of incorporating various text-based cues at the prediction stage of the patent figure captioning process.
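
The per-figure annotations listed in the abstract can be pictured as a simple record. The following is only a minimal sketch: the class name, field names, helper function, and sample values are hypothetical and are not taken from the dataset or the paper. It shows one way the text-based cues (reference terms and claims) might be joined into a single string at prediction time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatentFigureRecord:
    """Hypothetical record layout; field names are illustrative only."""
    image_path: str
    short_caption: str
    long_caption: str
    # Maps each reference numeral in the figure to its term, e.g. "12" -> "housing".
    reference_terms: Dict[str, str] = field(default_factory=dict)
    # Minimal set of claims describing how the components interact.
    minimal_claims: List[str] = field(default_factory=list)


def build_text_cues(record: PatentFigureRecord,
                    use_terms: bool = True,
                    use_claims: bool = True) -> str:
    """Join the selected text-based cues into one string for a captioning prompt."""
    parts: List[str] = []
    if use_terms and record.reference_terms:
        terms = "; ".join(
            f"{numeral}: {term}" for numeral, term in record.reference_terms.items()
        )
        parts.append(f"Reference numerals: {terms}")
    if use_claims and record.minimal_claims:
        parts.append("Claims: " + " ".join(record.minimal_claims))
    return "\n".join(parts)


# Invented sample values, used only to exercise the sketch.
example = PatentFigureRecord(
    image_path="fig1.png",
    short_caption="A side view of the device.",
    long_caption="A side view of the device showing a housing and a lever.",
    reference_terms={"10": "housing", "12": "lever"},
    minimal_claims=["A device comprising a housing (10) and a lever (12)."],
)

print(build_text_cues(example))
```

Switching `use_terms` or `use_claims` off gives a simple way to compare captions generated with and without each kind of cue, which is the type of comparison the abstract describes.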

arXiv information

Authors	Dana Aubakirova, Kim Gerdes, Lufei Liu
Published	2023-09-15 13:10:36+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
