MediSee: Reasoning-based Pixel-level Perception in Medical Images

Abstract

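Despite remarkable advances in pixel-level medical image perception, existing methods are either limited to specific tasks or rely heavily on accurate bounding boxes or text labels as input prompts.
However, the medical knowledge needed to supply such inputs is a major obstacle for the general public and greatly reduces the universality of these methods.
Rather than providing this domain-specific auxiliary information, general users tend to rely on oral queries that require logical reasoning.
This paper introduces a new medical vision task, Medical Reasoning Segmentation and Detection (MedSD), which aims to understand implicit queries about medical images and generate the corresponding segmentation mask and bounding box for the target object.
To accomplish this task, we first introduce the Multi-perspective, Logic-driven Medical Reasoning Segmentation and Detection (MLMR-SD) dataset, which contains a substantial collection of medical entity targets together with their corresponding reasoning.
Furthermore, we propose MediSee, an effective baseline model designed for medical reasoning segmentation and detection.
Experimental results show that the proposed method can effectively address MedSD with implicit colloquial queries and outperform traditional medical referring segmentation methods.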

Abstract (original)

Despite remarkable advancements in pixel-level medical image perception, existing methods are either limited to specific tasks or heavily rely on accurate bounding boxes or text labels as input prompts. However, the medical knowledge required for input is a huge obstacle for general public, which greatly reduces the universality of these methods. Compared with these domain-specialized auxiliary information, general users tend to rely on oral queries that require logical reasoning. In this paper, we introduce a novel medical vision task: Medical Reasoning Segmentation and Detection (MedSD), which aims to comprehend implicit queries about medical images and generate the corresponding segmentation mask and bounding box for the target object. To accomplish this task, we first introduce a Multi-perspective, Logic-driven Medical Reasoning Segmentation and Detection (MLMR-SD) dataset, which encompasses a substantial collection of medical entity targets along with their corresponding reasoning. Furthermore, we propose MediSee, an effective baseline model designed for medical reasoning segmentation and detection. The experimental results indicate that the proposed method can effectively address MedSD with implicit colloquial queries and outperform traditional medical referring segmentation methods.

arXiv information

Authors	Qinyue Tong, Ziqian Lu, Jun Liu, Yangming Zheng, Zheming Lu
Published	2025-04-23 15:29:29+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
