Zero-shot Nuclei Detection via Visual-Language Pre-trained Models

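Summary

Large-scale visual-language pre-trained models (VLPMs) have demonstrated excellent performance on downstream object detection in natural scenes.
However, zero-shot nuclei detection on H&E images using VLPMs remains underexplored.
The task is challenging because of the large gap between medical images and the web-sourced text-image pairs used for pre-training.
This paper explores the potential of an object-level VLPM, the Grounded Language-Image Pre-training (GLIP) model, for zero-shot nuclei detection.
Specifically, an automatic prompt design pipeline is devised based on the association binding trait of VLPMs and the image-to-text VLPM BLIP, avoiding empirical manual prompt engineering.
In addition, a self-training framework is established that uses the automatically designed prompts to generate preliminary results from GLIP as pseudo labels and iteratively refines the predicted boxes.
The method achieves remarkable performance in label-free nuclei detection, surpassing the other compared methods.
Above all, the work demonstrates that VLPMs pre-trained on natural image-text pairs also show surprising potential for downstream tasks in the medical field.
Code will be released at https://github.com/wuyongjianCODE/VLPMNuD.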

Abstract (original)

Large-scale visual-language pre-trained models (VLPM) have proven their excellent performance in downstream object detection for natural scenes. However, zero-shot nuclei detection on H&E images via VLPMs remains underexplored. The large gap between medical images and the web-originated text-image pairs used for pre-training makes it a challenging task. In this paper, we attempt to explore the potential of the object-level VLPM, Grounded Language-Image Pre-training (GLIP) model, for zero-shot nuclei detection. Concretely, an automatic prompts design pipeline is devised based on the association binding trait of VLPM and the image-to-text VLPM BLIP, avoiding empirical manual prompts engineering. We further establish a self-training framework, using the automatically designed prompts to generate the preliminary results as pseudo labels from GLIP and refine the predicted boxes in an iterative manner. Our method achieves a remarkable performance for label-free nuclei detection, surpassing other comparison methods. Foremost, our work demonstrates that the VLPM pre-trained on natural image-text pairs exhibits astonishing potential for downstream tasks in the medical field as well. Code will be released at https://github.com/wuyongjianCODE/VLPMNuD.
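
The self-training idea in the abstract (take a detector's preliminary boxes as pseudo labels, retrain, repeat) can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the names `self_train`, `initial_predict`, and `fit`, the box layout, and the `min_score` confidence filter are all assumptions made for illustration, and no GLIP or BLIP API is used.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical box layout (an assumption): x1, y1, x2, y2, confidence score.
Box = Tuple[float, float, float, float, float]
Predictor = Callable[[object], List[Box]]


def self_train(
    images: Sequence[object],
    initial_predict: Predictor,
    fit: Callable[[Sequence[object], List[List[Box]]], Predictor],
    rounds: int = 3,
    min_score: float = 0.5,
) -> Predictor:
    """Generic pseudo-label self-training loop (illustrative only).

    initial_predict stands in for a zero-shot detector queried with a prompt;
    fit stands in for any routine that trains a detector on pseudo labels and
    returns the new predictor. Both are placeholders supplied by the caller.
    """
    predict = initial_predict
    for _ in range(rounds):
        # Use the current predictions as pseudo labels. The score filter is
        # an assumption of this sketch, not something stated in the abstract.
        pseudo_labels = [
            [box for box in predict(image) if box[4] >= min_score]
            for image in images
        ]
        # Retrain on the pseudo labels; the result predicts in the next round.
        predict = fit(images, pseudo_labels)
    return predict
```

Passing the detector and the training routine as callables keeps the loop independent of any particular model, so the same skeleton works with a toy stub or a real detector wrapped in these two functions.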

arXiv information

Authors	Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yubo Fan, Yan Xu
Published	2023-06-30 13:44:13+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
