A Model-Agnostic Approach for Semantically Driven Disambiguation in Human-Robot Interaction

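Summary

Ambiguity is unavoidable in human-robot interaction, especially when a robot follows user instructions in a large, shared space.
For example, if a user gives underspecified instructions when asking a robot to find an object in a home environment, the object could be in several locations depending on the missing details.
A bowl, for instance, might be in the kitchen cabinet or on the dining room table, depending on whether it is clean or dirty, full or empty, and what other objects are around it.
Previous work on object search either assumes that the queried object is immediately visible to the robot or predicts the object's location with one-shot inference, which is likely to fail for ambiguous or partially understood instructions.
This paper addresses these gaps and presents a novel model-agnostic approach that uses semantically driven clarifications to help the robot locate queried objects in fewer attempts.
Specifically, the approach leverages different knowledge embedding models and, when ambiguities arise, applies an informative clarification method that follows an iterative prediction process.
A user-experiment evaluation shows that the approach is applicable to different custom semantic encoders as well as LLMs, and that informative clarifications improve performance, enabling the robot to locate objects on its first attempt.
The user experiment data is publicly available at https://github.com/IrmakDogan/ExpressionDataset.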

Abstract (original)

Ambiguities are inevitable in human-robot interaction, especially when a robot follows user instructions in a large, shared space. For example, if a user asks the robot to find an object in a home environment with underspecified instructions, the object could be in multiple locations depending on missing factors. For instance, a bowl might be in the kitchen cabinet or on the dining room table, depending on whether it is clean or dirty, full or empty, and the presence of other objects around it. Previous works on object search have assumed that the queried object is immediately visible to the robot or have predicted object locations using one-shot inferences, which are likely to fail for ambiguous or partially understood instructions. This paper focuses on these gaps and presents a novel model-agnostic approach leveraging semantically driven clarifications to enhance the robot’s ability to locate queried objects in fewer attempts. Specifically, we leverage different knowledge embedding models, and when ambiguities arise, we propose an informative clarification method, which follows an iterative prediction process. The user experiment evaluation of our method shows that our approach is applicable to different custom semantic encoders as well as LLMs, and informative clarifications improve performances, enabling the robot to locate objects on its first attempts. The user experiment data is publicly available at https://github.com/IrmakDogan/ExpressionDataset.
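
The abstract does not spell out the algorithm, so the following is only a toy sketch of what an iterative "predict, clarify, predict again" loop could look like. Everything in it is an assumption made for illustration: the `CANDIDATES` table, the attribute names, the scoring rule, the confidence threshold, and the function names are hypothetical and are not taken from the paper. A real system would replace `score_locations` with a knowledge embedding model or an LLM.

```python
# Hypothetical toy knowledge: attribute values under which each location is plausible.
# A real system would obtain these scores from a semantic encoder or an LLM.
CANDIDATES = {
    "kitchen cabinet": {"clean": True, "full": False},
    "dining table": {"clean": True, "full": True},
    "kitchen sink": {"clean": False, "full": False},
}


def score_locations(known):
    """Score each location given the attributes clarified so far (normalized to sum to 1)."""
    scores = {}
    for loc, attrs in CANDIDATES.items():
        # A location is ruled out as soon as one known attribute contradicts it.
        mismatches = sum(1 for k, v in known.items() if attrs[k] != v)
        scores[loc] = 0.0 if mismatches else 1.0
    total = sum(scores.values())
    return {loc: s / total for loc, s in scores.items()} if total else scores


def most_informative_question(known, scores):
    """Pick the unasked attribute that splits the remaining candidates most evenly."""
    live = [loc for loc, s in scores.items() if s > 0]
    unasked = sorted({a for loc in live for a in CANDIDATES[loc]} - set(known))
    best, best_balance = None, -1
    for attr in unasked:
        yes = sum(1 for loc in live if CANDIDATES[loc][attr])
        balance = min(yes, len(live) - yes)
        if balance > best_balance:
            best, best_balance = attr, balance
    return best


def locate(answer_fn, threshold=0.6, max_questions=3):
    """Alternate between predicting a location and asking a clarifying question."""
    known = {}
    for _ in range(max_questions):
        scores = score_locations(known)
        top = max(scores, key=scores.get)
        attr = most_informative_question(known, scores)
        # Stop when the prediction is confident or nothing is left to ask.
        if scores[top] >= threshold or attr is None:
            return top, known
        known[attr] = answer_fn(attr)  # the user's answer to "is the bowl <attr>?"
    scores = score_locations(known)
    return max(scores, key=scores.get), known


if __name__ == "__main__":
    # Simulated user who is looking for a dirty, empty bowl.
    user = {"clean": False, "full": False}
    print(locate(lambda attr: user[attr]))
```

In this sketch the robot commits to a location only once one candidate holds at least `threshold` of the score mass; otherwise it asks about the attribute that best separates the remaining candidates, which is one simple way to make a clarification "informative".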

arXiv information

Authors	Fethiye Irmak Dogan, Maithili Patel, Weiyu Liu, Iolanda Leite, Sonia Chernova
Published	2025-04-02 13:51:04+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
