ReStory: VLM-augmentation of Social Human-Robot Interaction Datasets

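Summary

Internet-scale datasets are a luxury for human-robot interaction (HRI) researchers, because collecting natural interaction data in natural settings is time-consuming and logistically difficult.
The problem is made worse by the variety of robot form factors and interaction modalities.
Inspired by recent research on ethnomethodology and conversation analysis (EMCA) in the HRI field, we propose ReStory, a method with the potential to augment existing in-the-wild human-robot interaction datasets by leveraging vision-language models.
ReStory still requires human supervision, but it can synthesize human-interpretable interaction scenarios in the form of storyboards.
We hope that our proposed approach offers HRI researchers and interaction designers a new angle for making use of valuable and scarce data.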

Abstract (original)

Internet-scaled datasets are a luxury for human-robot interaction (HRI) researchers, as collecting natural interaction data in the wild is time-consuming and logistically challenging. The problem is exacerbated by robots’ different form factors and interaction modalities. Inspired by recent work on ethnomethodological and conversation analysis (EMCA) in the domain of HRI, we propose ReStory, a method that has the potential to augment existing in-the-wild human-robot interaction datasets leveraging Vision Language Models. While still requiring human supervision, ReStory is capable of synthesizing human-interpretable interaction scenarios in the form of storyboards. We hope our proposed approach provides HRI researchers and interaction designers with a new angle to utilizing their valuable and scarce data.

arXiv information

Authors	Fanjun Bu, Wendy Ju
Published	2024-12-30 09:38:00+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
