Explainable, Multi-modal Wound Infection Classification from Images Augmented with Generated Captions

Summary

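Infections in diabetic foot ulcers (DFUs) can lead to severe complications, including tissue death and limb amputation, which makes accurate and timely diagnosis essential.
Earlier machine learning methods focused on identifying infection from wound images alone, without using additional metadata such as medical notes.
This study aims to improve infection detection by introducing Synthetic Caption Augmented Retrieval for Wound Infection Detection (SCARWID), a new deep learning framework that uses synthetic text descriptions to augment DFU images.
SCARWID consists of two components: (1) Wound-BLIP, a vision-language model (VLM) fine-tuned on GPT-4o-generated descriptions, which synthesizes consistent captions from images;
and (2) an Image-Text Fusion module that uses cross-attention to extract cross-modal embeddings from an image and its corresponding Wound-BLIP caption.
Infection status is determined by retrieving the top-k most similar items from a labeled support set.
To increase the diversity of the training data, the authors used a latent diffusion model to generate additional wound images.
SCARWID outperformed state-of-the-art models, achieving average sensitivity, specificity, and accuracy of 0.85, 0.78, and 0.81, respectively, for wound infection classification.
Displaying the generated captions alongside the wound images and infection detection results improves interpretability and trust, letting nurses check SCARWID outputs against their medical knowledge.
This is especially valuable when wound notes are unavailable, or when supporting novice nurses who may find it difficult to identify the visual attributes of wound infection.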
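
The retrieval step in the summary (labeling a wound by its top-k most similar items in a labeled support set) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of cosine similarity, the majority vote, the function names, and the toy two-dimensional embeddings are all assumptions made for the example. In SCARWID the embeddings would come from the Image-Text Fusion module.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_retrieval(query, support_embeddings, support_labels, k=3):
    """Label a query by majority vote over its k most similar support items.

    Hypothetical helper: the voting rule is an assumption, the source only
    says that the top-k similar items are retrieved from a labeled set.
    """
    if len(support_embeddings) != len(support_labels):
        raise ValueError("each support embedding needs exactly one label")
    scored = [
        (cosine_similarity(query, emb), label)
        for emb, label in zip(support_embeddings, support_labels)
    ]
    # Most similar items first, then keep only the k nearest neighbours.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top_k = scored[:k]
    votes = Counter(label for _, label in top_k)
    predicted = votes.most_common(1)[0][0]
    return predicted, top_k


# Toy support set with made-up 2-D embeddings (illustration only).
support_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.2]]
support_labels = ["infected", "infected", "uninfected", "uninfected", "infected"]

label, neighbours = classify_by_retrieval(
    [1.0, 0.05], support_embeddings, support_labels, k=3
)
print(label)
```

Returning the retrieved neighbours together with the label keeps the decision inspectable, which fits the interpretability goal stated above. An odd k avoids ties in the two-class vote.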

Abstract (original)

Infections in Diabetic Foot Ulcers (DFUs) can cause severe complications, including tissue death and limb amputation, highlighting the need for accurate, timely diagnosis. Previous machine learning methods have focused on identifying infections by analyzing wound images alone, without utilizing additional metadata such as medical notes. In this study, we aim to improve infection detection by introducing Synthetic Caption Augmented Retrieval for Wound Infection Detection (SCARWID), a novel deep learning framework that leverages synthetic textual descriptions to augment DFU images. SCARWID consists of two components: (1) Wound-BLIP, a Vision-Language Model (VLM) fine-tuned on GPT-4o-generated descriptions to synthesize consistent captions from images; and (2) an Image-Text Fusion module that uses cross-attention to extract cross-modal embeddings from an image and its corresponding Wound-BLIP caption. Infection status is determined by retrieving the top-k similar items from a labeled support set. To enhance the diversity of training data, we utilized a latent diffusion model to generate additional wound images. As a result, SCARWID outperformed state-of-the-art models, achieving average sensitivity, specificity, and accuracy of 0.85, 0.78, and 0.81, respectively, for wound infection classification. Displaying the generated captions alongside the wound images and infection detection results enhances interpretability and trust, enabling nurses to align SCARWID outputs with their medical knowledge. This is particularly valuable when wound notes are unavailable or when assisting novice nurses who may find it difficult to identify visual attributes of wound infection.
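
For readers less familiar with the three reported metrics, the sketch below shows how sensitivity, specificity, and accuracy are computed from binary predictions using their standard definitions. The function name and the toy labels are hypothetical and exist only for illustration; they are not data or results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, and accuracy for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # infected wound correctly flagged
            else:
                fn += 1  # infected wound missed
        else:
            if pred == positive:
                fp += 1  # uninfected wound wrongly flagged
            else:
                tn += 1  # uninfected wound correctly cleared
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
    }


# Made-up labels (1 = infected, 0 = uninfected), for illustration only.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 1, 0]
toy_metrics = binary_metrics(toy_truth, toy_pred)
print(toy_metrics)
```

Sensitivity is the fraction of infected wounds that are caught and specificity is the fraction of uninfected wounds that are correctly cleared, so the reported 0.85 and 0.78 describe two different kinds of error.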

arXiv information

Authors	Palawat Busaranuvong, Emmanuel Agu, Reza Saadati Fard, Deepak Kumar, Shefalika Gautam, Bengisu Tulu, Diane Strong
Published	2025-02-27 17:04:00+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
