Multimodal Fusion of EMG and Vision for Human Grasp Intent Inference in Prosthetic Hand Control

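Summary

Objective: For lower-arm amputees, robotic prosthetic hands promise to restore the ability to perform activities of daily living.
Current control methods based on physiological signals such as electromyography (EMG) tend to yield poor inference results because of motion artifacts, muscle fatigue, and similar factors.
Vision sensors are a major source of information about the state of the environment and can play an important role in inferring feasible and intended gestures.
However, visual evidence is susceptible to artifacts of its own, such as object occlusion and lighting changes. Fusing multimodal evidence from physiological and vision sensor measurements is a natural approach because the strengths of these modalities are complementary.
Methods: This paper presents a Bayesian evidence fusion framework for grasp intent inference that uses eye-view video, eye gaze, and forearm EMG, each processed by neural network models.
The authors analyze individual and fused performance as a function of time as the hand approaches the object to grasp it.
To this end, they also developed novel data processing and augmentation techniques for training the neural network components.
Results: On average, fusion improves the instantaneous classification accuracy of the upcoming grasp type during the reaching phase by 13.66% relative to EMG alone and by 14.8% relative to visual evidence alone, for an overall fusion accuracy of 95.3%.
Conclusion: The experimental data analyses show that EMG and visual evidence have complementary strengths and that, as a consequence, fusing multimodal evidence can outperform each individual evidence modality at any given time.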

Summary (original)

Objective: For lower arm amputees, robotic prosthetic hands promise to regain the capability to perform daily living activities. Current control methods based on physiological signals such as electromyography (EMG) are prone to yielding poor inference outcomes due to motion artifacts, muscle fatigue, and many more. Vision sensors are a major source of information about the environment state and can play a vital role in inferring feasible and intended gestures. However, visual evidence is also susceptible to its own artifacts, most often due to object occlusion, lighting changes, etc. Multimodal evidence fusion using physiological and vision sensor measurements is a natural approach due to the complementary strengths of these modalities. Methods: In this paper, we present a Bayesian evidence fusion framework for grasp intent inference using eye-view video, eye-gaze, and EMG from the forearm processed by neural network models. We analyze individual and fused performance as a function of time as the hand approaches the object to grasp it. For this purpose, we have also developed novel data processing and augmentation techniques to train neural network components. Results: Our results indicate that, on average, fusion improves the instantaneous upcoming grasp type classification accuracy while in the reaching phase by 13.66% and 14.8%, relative to EMG and visual evidence individually, resulting in an overall fusion accuracy of 95.3%. Conclusion: Our experimental data analyses demonstrate that EMG and visual evidence show complementary strengths, and as a consequence, fusion of multimodal evidence can outperform each individual evidence modality at any given time.
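The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one common form of Bayesian evidence fusion, not the authors' implementation. It assumes that each modality's classifier outputs a posterior over grasp types and that the modalities are conditionally independent given the grasp type. The function name `fuse_posteriors`, the three grasp labels, and the probability values are all hypothetical and chosen for illustration.

```python
from math import isclose


def fuse_posteriors(posteriors, prior=None):
    """Fuse per-modality class posteriors into one posterior.

    Assumption (not taken from the paper): modalities are conditionally
    independent given the class, so the fused posterior is proportional to
    prior(c) * product over modalities of posterior_m(c) / prior(c).
    """
    n_classes = len(posteriors[0])
    if prior is None:
        # Uniform prior over grasp types when none is supplied.
        prior = [1.0 / n_classes] * n_classes
    fused = list(prior)
    for p in posteriors:
        if len(p) != n_classes:
            raise ValueError("all posteriors must cover the same classes")
        # Dividing by the prior turns each posterior into a scaled likelihood.
        fused = [f * (pi / pr) for f, pi, pr in zip(fused, p, prior)]
    total = sum(fused)
    if total == 0.0:
        raise ValueError("fused evidence is zero for every class")
    return [f / total for f in fused]


# Hypothetical grasp labels and made-up classifier outputs.
grasp_types = ["power", "pinch", "open palm"]
emg_posterior = [0.45, 0.40, 0.15]     # EMG alone weakly prefers "power"
vision_posterior = [0.30, 0.60, 0.10]  # vision prefers "pinch"

fused = fuse_posteriors([emg_posterior, vision_posterior])
best = max(range(len(fused)), key=fused.__getitem__)
print(grasp_types[best], [round(p, 3) for p in fused])
```

In this toy case the EMG evidence alone is nearly undecided between two grasps, and the visual evidence tips the fused decision, which is the kind of complementarity the abstract describes.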

arXiv information

Authors	Mehrshad Zandigohar, Mo Han, Mohammadreza Sharif, Sezen Yagmur Gunay, Mariusz P. Furmanek, Mathew Yarossi, Paolo Bonato, Cagdas Onal, Taskin Padir, Deniz Erdogmus, Gunar Schirner
Published	2023-10-05 21:26:48+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google