Are Large Language Models Aligned with People’s Social Intuitions for Human-Robot Interactions?

Summary

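Large language models (LLMs) are increasingly used in robotics, particularly for high-level action planning.
At the same time, many robotics applications involve human supervisors or collaborators.
It is therefore important that LLMs produce socially acceptable actions that match people's preferences and values.
This study tests whether LLMs capture people's intuitions about behavior judgments and communication preferences in human-robot interaction (HRI) scenarios.
For the evaluation, three HRI user studies are reproduced and the LLM outputs are compared with those of the actual participants.
GPT-4 clearly outperforms the other models and produces answers that correlate strongly with users' answers in two of the studies.
The first study concerns choosing the most appropriate communicative act for a robot in various situations ($r_s$ = 0.82); the second concerns judging the desirability, intentionality, and surprisingness of behavior ($r_s$ = 0.83).
In the last study, however, which tests whether people judge the behavior of robots and humans differently, no model achieves a strong correlation.
The study also shows that vision models fail to capture the essence of the video stimuli, and that LLMs tend to rate different communicative acts and the desirability of behavior higher than people do.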

Abstract (original)

Large language models (LLMs) are increasingly used in robotics, especially for high-level action planning. Meanwhile, many robotics applications involve human supervisors or collaborators. Hence, it is crucial for LLMs to generate socially acceptable actions that align with people’s preferences and values. In this work, we test whether LLMs capture people’s intuitions about behavior judgments and communication preferences in human-robot interaction (HRI) scenarios. For evaluation, we reproduce three HRI user studies, comparing the output of LLMs with that of real participants. We find that GPT-4 strongly outperforms other models, generating answers that correlate strongly with users’ answers in two studies: the first study dealing with selecting the most appropriate communicative act for a robot in various situations ($r_s$ = 0.82), and the second with judging the desirability, intentionality, and surprisingness of behavior ($r_s$ = 0.83). However, for the last study, testing whether people judge the behavior of robots and humans differently, no model achieves strong correlations. Moreover, we show that vision models fail to capture the essence of video stimuli and that LLMs tend to rate different communicative acts and behavior desirability higher than people.
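
The $r_s$ values quoted in the abstract use the conventional notation for Spearman's rank correlation, which measures how well the ordering of one set of scores matches the ordering of another. The following is a minimal sketch of that statistic in plain Python, not the authors' evaluation code. The function names `average_ranks` and `spearman` and the two rating lists are illustrative assumptions; the numbers are made up and are not data from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's r_s, computed as the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mean ratings for five scenarios (illustrative only).
human_ratings = [1.0, 2.5, 3.0, 4.5, 5.0]
model_ratings = [2.0, 3.0, 2.5, 5.0, 6.0]
print(round(spearman(human_ratings, model_ratings), 2))  # prints 0.9
```

Because the statistic depends only on ranks, a model that rates every item consistently higher than people do can still reach a high $r_s$, which is why a strong correlation and a tendency to over-rate can both be reported for the same model.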

arXiv information

Authors	Lennart Wachowiak, Andrew Coles, Oya Celiktutan, Gerard Canal
Published	2024-07-09 11:27:40+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
