BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications

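Summary

Endoscopic surgery relies on two-dimensional views, which makes depth perception and instrument manipulation difficult for surgeons.
Simultaneous Localization and Mapping (SLAM) has emerged as a promising way to address these limitations, but applying it to endoscopic procedures is challenging because of hardware constraints such as the use of a monocular camera and the absence of odometry sensors.
This study presents a robust deep learning-based SLAM approach that combines state-of-the-art models with newly developed ones.
It consists of three main parts: a Monocular Pose Estimation Module (MPEM) that introduces a novel unsupervised method based on the CycleGAN architecture, a Monocular Depth Estimation Module (MDEM) that leverages the Zoe architecture, and a 3D Reconstruction Module that uses the outputs of the two preceding modules to build a coherent surgical map.
The approach was evaluated on three publicly available datasets (Hamlyn, EndoSLAM, and SCARED) and benchmarked against two state-of-the-art methods, EndoSFMLearner and EndoDepth.
Integrating Zoe into the MDEM outperformed state-of-the-art depth estimation algorithms for endoscopy, while the new approach in the MPEM showed competitive performance and the lowest inference time.
The results demonstrate the robustness of the approach across three different endoscopic scenarios: laparoscopy, gastroscopy, and colonoscopy.
By giving surgeons enhanced depth perception and 3D reconstruction capabilities, the proposed SLAM approach has the potential to improve the accuracy and efficiency of endoscopic procedures.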

Summary (original)

Endoscopic surgery relies on two-dimensional views, posing challenges for surgeons in depth perception and instrument manipulation. While Simultaneous Localization and Mapping (SLAM) has emerged as a promising solution to address these limitations, its implementation in endoscopic procedures presents significant challenges due to hardware limitations, such as the use of a monocular camera and the absence of odometry sensors. This study presents a robust deep learning-based SLAM approach that combines state-of-the-art and newly developed models. It consists of three main parts: the Monocular Pose Estimation Module that introduces a novel unsupervised method based on the CycleGAN architecture, the Monocular Depth Estimation Module that leverages the novel Zoe architecture, and the 3D Reconstruction Module which uses information from the previous models to create a coherent surgical map. The performance of the procedure was rigorously evaluated using three publicly available datasets (Hamlyn, EndoSLAM, and SCARED) and benchmarked against two state-of-the-art methods, EndoSFMLearner and EndoDepth. The integration of Zoe in the MDEM demonstrated superior performance compared to state-of-the-art depth estimation algorithms in endoscopy, whereas the novel approach in the MPEM exhibited competitive performance and the lowest inference time. The results showcase the robustness of our approach in laparoscopy, gastroscopy, and colonoscopy, three different scenarios in endoscopic surgery. The proposed SLAM approach has the potential to improve the accuracy and efficiency of endoscopic procedures by providing surgeons with enhanced depth perception and 3D reconstruction capabilities.
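
As a rough illustration of how a 3D reconstruction step could combine the outputs of the depth and pose modules, the sketch below back-projects per-frame depth maps into 3D and places them in a common frame by chaining relative camera poses. This is not the authors' implementation: the pinhole intrinsics matrix K, the 4x4 relative-pose convention (each pose maps points from the next camera frame into the previous one), and the function names backproject and fuse_frames are all assumptions made for illustration, and the learned depth and pose networks are replaced by plain arrays.

```python
import numpy as np


def backproject(depth, K, T_world_cam):
    """Lift a depth map of shape (H, W) to world-frame points of shape (H*W, 3).

    Assumes a pinhole camera with 3x3 intrinsics K and a 4x4 camera-to-world
    transform T_world_cam (both hypothetical inputs for this sketch).
    """
    h, w = depth.shape
    # Pixel grid: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)  # 3 x N
    # Viewing rays with unit z, scaled by the per-pixel depth.
    points_cam = (np.linalg.inv(K) @ pixels) * depth.ravel()
    # Move to homogeneous coordinates and apply the camera-to-world pose.
    points_h = np.vstack([points_cam, np.ones((1, h * w))])
    return (T_world_cam @ points_h)[:3].T


def fuse_frames(depths, relative_poses, K):
    """Chain relative poses and concatenate the back-projected point clouds.

    relative_poses[i] is assumed to map points from camera i+1 into camera i,
    so the first camera defines the world frame.
    """
    T_world_cam = np.eye(4)
    clouds = [backproject(depths[0], K, T_world_cam)]
    for depth, T_rel in zip(depths[1:], relative_poses):
        T_world_cam = T_world_cam @ T_rel
        clouds.append(backproject(depth, K, T_world_cam))
    return np.vstack(clouds)
```

In a full pipeline the depth maps and relative poses would come from the depth and pose estimation modules, and the plain concatenation used here would likely be replaced by a proper fusion and filtering step.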

arXiv information

Authors	G. Manni, C. Lauretti, F. Prata, R. Papalia, L. Zollo, P. Soda
Published	2024-08-06 10:13:57+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
