Learning Multimodal Confidence for Intention Recognition in Human-Robot Interaction

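Abstract

Rapid progress in collaborative robotics has made it possible for robots to act on specific human intentions, opening new ways to assist elderly people who have difficulties in daily life.
However, efficient human-robot cooperation requires natural, accurate, and reliable intention recognition in shared environments.
The main challenge at present is to reduce the uncertainty of the fused multimodal intention to be recognized and to adaptively infer a more reliable result regardless of the current interaction conditions.
In this work, we propose a novel learning-based multimodal fusion framework, Batch Multimodal Confidence Learning for Opinion Pool (BMCLOP).
Our approach combines a Bayesian multimodal fusion method with a batch confidence learning algorithm to improve accuracy, uncertainty reduction, and success rate under the given interaction conditions.
In particular, the framework is a generic and practical approach to multimodal intention recognition and can easily be extended further.
Our target assistive scenarios consider three modalities (gestures, speech, and gaze), each of which produces a categorical distribution over the finite set of intentions.
The proposed method is validated on a six-DoF robot through extensive experiments and shows high performance compared with the baselines.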
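
As a rough illustration of the kind of fusion the abstract describes, the sketch below combines three categorical distributions (gesture, speech, gaze) with a confidence-weighted logarithmic opinion pool. This is an assumption made for illustration only, not the authors' implementation: the abstract does not specify the exact pooling rule, the function names `fuse_opinion_pool` and `entropy` are hypothetical, the probabilities are made up, and the confidence weights are hand-set here, whereas BMCLOP learns them.

```python
import math


def fuse_opinion_pool(distributions, confidences):
    """Weighted logarithmic opinion pool (assumed rule, not from the paper).

    Each modality m supplies a categorical distribution p_m over the same
    finite set of intentions. The fused score of intention i is
    prod_m p_m(i) ** w_m, normalized to sum to 1.
    """
    n = len(distributions[0])
    eps = 1e-12  # guards against log(0) for zero-probability entries
    log_scores = [
        sum(w * math.log(max(p[i], eps)) for p, w in zip(distributions, confidences))
        for i in range(n)
    ]
    peak = max(log_scores)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def entropy(p):
    """Shannon entropy in nats, used here as a simple uncertainty measure."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Hypothetical per-modality outputs over three candidate intentions.
gesture = [0.6, 0.3, 0.1]
speech = [0.5, 0.4, 0.1]
gaze = [0.2, 0.5, 0.3]

# Hypothetical hand-set confidences; gaze is trusted less in this example.
confidences = [1.0, 1.0, 0.3]

fused = fuse_opinion_pool([gesture, speech, gaze], confidences)
best = max(range(len(fused)), key=fused.__getitem__)
print("fused distribution:", fused)
print("recognized intention index:", best)
print("entropy of fused result:", entropy(fused))
```

Lowering a modality's confidence toward zero flattens its influence on the product, which is one simple way a fusion rule can discount an unreliable sensor under the current interaction conditions.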

Abstract (original)

The rapid development of collaborative robotics has provided a new possibility of helping the elderly who has difficulties in daily life, allowing robots to operate according to specific intentions. However, efficient human-robot cooperation requires natural, accurate and reliable intention recognition in shared environments. The current paramount challenge for this is reducing the uncertainty of multimodal fused intention to be recognized and reasoning adaptively a more reliable result despite current interactive condition. In this work we propose a novel learning-based multimodal fusion framework Batch Multimodal Confidence Learning for Opinion Pool (BMCLOP). Our approach combines Bayesian multimodal fusion method and batch confidence learning algorithm to improve accuracy, uncertainty reduction and success rate given the interactive condition. In particular, the generic and practical multimodal intention recognition framework can be easily extended further. Our desired assistive scenarios consider three modalities gestures, speech and gaze, all of which produce categorical distributions over all the finite intentions. The proposed method is validated with a six-DoF robot through extensive experiments and exhibits high performance compared to baselines.

arXiv information

Authors	Xiyuan Zhao, Huijun Li, Tianyuan Miao, Xianyi Zhu, Zhikai Wei, Aiguo Song
Published	2024-05-23 02:43:31+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
