Multi-Class Abnormality Classification in Video Capsule Endoscopy Using Deep Learning

Abstract

This report outlines Team Seq2Cure's deep learning approach for the Capsule Vision 2024 Challenge, which uses an ensemble of convolutional neural networks (CNNs) and transformer-based architectures for multi-class abnormality classification in video capsule endoscopy frames. The dataset comprised over 50,000 frames from three public sources and one private dataset, labeled across 10 abnormality classes. To overcome the limitations of traditional CNNs in capturing global context, we integrated CNN and transformer models within a multi-model ensemble. Our approach achieved a balanced accuracy of 86.34 percent and a mean AUC-ROC score of 0.9908 on the validation set, with significant improvements in classifying complex abnormalities. Code is available at http://github.com/arnavs04/capsule-vision-2024.
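To make the ensemble and the balanced-accuracy metric concrete, here is a minimal sketch in plain Python. It assumes the ensemble averages per-model class probabilities, which the abstract does not state; the function names, the three-class toy data, and the two "models" are hypothetical and are not taken from the team's repository.

```python
from collections import defaultdict


def average_probabilities(per_model_probs):
    """Average class-probability vectors across models, frame by frame.

    per_model_probs[m][i] is model m's probability vector for frame i.
    Simple averaging is an assumption; the abstract does not say how
    the models are combined.
    """
    n_models = len(per_model_probs)
    n_frames = len(per_model_probs[0])
    averaged = []
    for i in range(n_frames):
        vectors = [per_model_probs[m][i] for m in range(n_models)]
        averaged.append([sum(col) / n_models for col in zip(*vectors)])
    return averaged


def predict(probs):
    """Return the index of the most probable class for each frame."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy data: four frames, three classes, two hypothetical models.
cnn_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
vit_probs = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]]
labels = [0, 1, 2, 1]

ensemble_probs = average_probabilities([cnn_probs, vit_probs])
predictions = predict(ensemble_probs)
score = balanced_accuracy(labels, predictions)
print(predictions, round(score, 4))
```

In this toy case the last frame is misclassified, so class 1 has a recall of 0.5 while the other two classes have a recall of 1.0. Balanced accuracy averages those per-class recalls rather than counting frames, which is why it is a common choice when abnormality classes are unevenly represented.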

arXiv information

Authors	Arnav Samal, Ranya
Published	2024-10-24 16:13:06+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
