Robi Butler: Remote Multimodal Interactions with Household Robot Assistant

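Summary

This paper introduces Robi Butler, a new household robot system that enables multimodal interaction with remote users.
Built on advanced communication interfaces, Robi Butler lets users monitor the robot's status, send instructions by text or voice, and select target objects by pointing with their hand.
At the core of the system is a high-level behavior module, powered by large language models (LLMs), that interprets multimodal instructions and generates action plans.
These plans are composed of a set of open-vocabulary primitives, supported by vision-language models (VLMs), that handle both text queries and pointing queries.
Integrating these components allows Robi Butler to ground remote multimodal instructions in real-world home environments in a zero-shot manner.
We demonstrate the effectiveness and efficiency of the system on a variety of everyday household tasks in which remote users give multimodal instructions.
In addition, we conducted a user study to analyze how multimodal interaction affects efficiency and user experience during remote human-robot interaction, and we discuss potential improvements.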
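
The summary describes plans built from primitives whose targets are given either as text or by pointing. The sketch below is only an illustration of that data flow, not the authors' implementation: every name in it (`Instruction`, `Query`, `Step`, `plan_stub`, and the primitive names `navigate_to` and `pick`) is hypothetical, and the planner is a hard-coded stand-in where the real system would call an LLM.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instruction:
    """A remote user's instruction: text (or transcribed speech) plus an optional pointing selection."""
    text: str
    # Hypothetical representation: the pixel the user pointed at in the robot's camera view.
    pointed_pixel: Optional[Tuple[int, int]] = None


@dataclass
class Query:
    """An object reference to be resolved later by a vision model: a text phrase or a pointed pixel."""
    text: Optional[str] = None
    pixel: Optional[Tuple[int, int]] = None


@dataclass
class Step:
    primitive: str  # hypothetical primitive name, e.g. "pick"
    target: Query


def plan_stub(instruction: Instruction) -> List[Step]:
    """Stand-in for the LLM planner: a fixed rule, not a model call.

    If the user pointed at something, the pointed pixel is the target;
    otherwise the instruction text itself is used as the open-vocabulary query.
    """
    if instruction.pointed_pixel is not None:
        target = Query(pixel=instruction.pointed_pixel)
    else:
        target = Query(text=instruction.text)
    return [Step("navigate_to", target), Step("pick", target)]


if __name__ == "__main__":
    for step in plan_stub(Instruction("bring me this", pointed_pixel=(320, 240))):
        print(step)
```

Keeping the text and pointing cases in one `Query` type means each primitive can accept either kind of reference without the planner needing separate code paths.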

Summary (original)

In this paper, we introduce Robi Butler, a novel household robotic system that enables multimodal interactions with remote users. Building on the advanced communication interfaces, Robi Butler allows users to monitor the robot’s status, send text or voice instructions, and select target objects by hand pointing. At the core of our system is a high-level behavior module, powered by Large Language Models (LLMs), that interprets multimodal instructions to generate action plans. These plans are composed of a set of open vocabulary primitives supported by Vision Language Models (VLMs) that handle both text and pointing queries. The integration of the above components allows Robi Butler to ground remote multimodal instructions in the real-world home environment in a zero-shot manner. We demonstrate the effectiveness and efficiency of this system using a variety of daily household tasks that involve remote users giving multimodal instructions. Additionally, we conducted a user study to analyze how multimodal interactions affect efficiency and user experience during remote human-robot interaction and discuss the potential improvements.

arXiv information

Authors	Anxing Xiao, Nuwan Janaka, Tianrun Hu, Anshul Gupta, Kaixin Li, Cunjun Yu, David Hsu
Published	2024-09-30 17:49:09+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
