From Decision to Action in Surgical Autonomy: Multi-Modal Large Language Models for Robot-Assisted Blood Suction

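Summary

The rise of large language models (LLMs) has influenced research in robotics and automation.
LLMs are increasingly integrated into general robotics tasks, but a noticeable gap remains in their adoption in more specialized domains such as surgery, where reasoning, explainability, and safety are paramount.
Achieving autonomy in robotic surgery, which requires the ability to reason about and adapt to changes in the environment, remains a significant challenge.
This work proposes integrating a multi-modal LLM into robot-assisted surgery for autonomous blood suction.
Reasoning and prioritization are delegated to a higher-level task-planning LLM, while motion planning and execution are handled by a lower-level deep reinforcement learning model, which distributes agency between the two components.
Because surgical operations are highly dynamic and can encounter unforeseen circumstances, blood clots and active bleeding were introduced to influence decision-making.
The results showed that using a multi-modal LLM as the higher-level reasoning unit can account for these surgical complexities and achieve a level of reasoning previously unattainable in robot-assisted surgery.
These findings demonstrate the potential of multi-modal LLMs to significantly enhance contextual understanding and decision-making in robot-assisted surgery, marking a step toward autonomous surgical systems.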
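
The two-level split described in the summary can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: `Region`, `stub_planner`, `stub_policy`, and `run_episode` are hypothetical names, the planner is a fixed rule standing in for the multi-modal LLM, the policy is a straight-line stepper standing in for the deep reinforcement learning model, and the priority order is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class Region:
    label: str       # e.g. "blood", "clot", "active_bleeding" (labels are assumptions)
    position: Point  # 2D location of the region in the scene


def stub_planner(regions: List[Region]) -> List[Region]:
    """Stand-in for the higher-level LLM: returns regions in treatment order.

    The priority table below is an illustrative assumption, not the paper's policy.
    """
    priority = {"active_bleeding": 0, "blood": 1, "clot": 2}
    return sorted(regions, key=lambda r: priority.get(r.label, len(priority)))


def stub_policy(tool: Point, target: Point, step: float = 0.5) -> Point:
    """Stand-in for the lower-level DRL policy: one bounded step toward the target."""
    dx, dy = target[0] - tool[0], target[1] - tool[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= step:
        return target  # close enough: snap onto the target
    return (tool[0] + step * dx / dist, tool[1] + step * dy / dist)


def run_episode(
    regions: List[Region],
    planner: Callable[[List[Region]], List[Region]] = stub_planner,
    policy: Callable[[Point, Point], Point] = stub_policy,
    start: Point = (0.0, 0.0),
    max_steps: int = 1000,
) -> List[str]:
    """The planner decides what to treat next; the policy decides how to get there."""
    tool = start
    visited: List[str] = []
    for region in planner(regions):
        for _ in range(max_steps):
            if tool == region.position:
                break
            tool = policy(tool, region.position)
        if tool == region.position:
            visited.append(region.label)  # record only regions actually reached
    return visited
```

Passing the planner and the policy as arguments keeps the decision layer and the motion layer independently replaceable, which is the property the summary refers to as distributed agency.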

Abstract (original)

The rise of Large Language Models (LLMs) has impacted research in robotics and automation. While progress has been made in integrating LLMs into general robotics tasks, a noticeable void persists in their adoption in more specific domains such as surgery, where critical factors such as reasoning, explainability, and safety are paramount. Achieving autonomy in robotic surgery, which entails the ability to reason and adapt to changes in the environment, remains a significant challenge. In this work, we propose a multi-modal LLM integration in robot-assisted surgery for autonomous blood suction. The reasoning and prioritization are delegated to the higher-level task-planning LLM, and the motion planning and execution are handled by the lower-level deep reinforcement learning model, creating a distributed agency between the two components. As surgical operations are highly dynamic and may encounter unforeseen circumstances, blood clots and active bleeding were introduced to influence decision-making. Results showed that using a multi-modal LLM as a higher-level reasoning unit can account for these surgical complexities to achieve a level of reasoning previously unattainable in robot-assisted surgeries. These findings demonstrate the potential of multi-modal LLMs to significantly enhance contextual understanding and decision-making in robotic-assisted surgeries, marking a step toward autonomous surgical systems.

arXiv information

Authors	Sadra Zargarzadeh, Maryam Mirzaei, Yafei Ou, Mahdi Tavakoli
Published	2024-08-14 20:30:34+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
