Generating Robot Constitutions & Benchmarks for Semantic Safety

Abstract

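Until recently, robot safety research dealt mainly with collision avoidance and with reducing hazards in a robot's immediate vicinity.
With the advent of large vision and language models (VLMs), robots are now also capable of high-level semantic scene understanding and natural-language interaction with humans.
Despite their known vulnerabilities (e.g., hallucinations and jailbreaking), VLMs are being handed control of robots that can make physical contact with the real world.
This can lead to dangerous behavior, which makes semantic safety for robots a matter of immediate concern.
This paper makes two contributions. First, to address these emerging risks, we release the ASIMOV Benchmark, a large-scale, comprehensive collection of datasets for evaluating and improving the semantic safety of foundation models that serve as robot brains.
The data generation recipe is highly scalable: using text and image generation techniques, we generate undesirable situations from real-world visual scenes and from hospital reports of human injuries.
Second, we develop a framework that automatically generates robot constitutions from real-world data in order to steer a robot's behavior through Constitutional AI mechanisms.
We propose a new auto-amending process that can introduce nuance into written rules of behavior.
This can lead to closer alignment with human preferences on the desirability and safety of behavior.
We explore the trade-off between generality and specificity across a diverse set of constitutions of different lengths, and demonstrate that a robot can effectively reject unconstitutional actions.
Using generated constitutions, we measure a top alignment rate of 84.3% on the ASIMOV Benchmark, outperforming both no-constitution baselines and human-written constitutions.
Data is available at asimov-benchmark.github.io.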

Abstract (original)

Until recently, robotics safety research was predominantly about collision avoidance and hazard reduction in the immediate vicinity of a robot. Since the advent of large vision and language models (VLMs), robots are now also capable of higher-level semantic scene understanding and natural language interactions with humans. Despite their known vulnerabilities (e.g. hallucinations or jail-breaking), VLMs are being handed control of robots capable of physical contact with the real world. This can lead to dangerous behaviors, making semantic safety for robots a matter of immediate concern. Our contributions in this paper are twofold: first, to address these emerging risks, we release the ASIMOV Benchmark, a large-scale and comprehensive collection of datasets for evaluating and improving semantic safety of foundation models serving as robot brains. Our data generation recipe is highly scalable: by leveraging text and image generation techniques, we generate undesirable situations from real-world visual scenes and human injury reports from hospitals. Second, we develop a framework to automatically generate robot constitutions from real-world data to steer a robot’s behavior using Constitutional AI mechanisms. We propose a novel auto-amending process that is able to introduce nuances in written rules of behavior; this can lead to increased alignment with human preferences on behavior desirability and safety. We explore trade-offs between generality and specificity across a diverse set of constitutions of different lengths, and demonstrate that a robot is able to effectively reject unconstitutional actions. We measure a top alignment rate of 84.3% on the ASIMOV Benchmark using generated constitutions, outperforming no-constitution baselines and human-written constitutions. Data is available at asimov-benchmark.github.io.
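
The abstract reports an alignment rate but does not define how it is computed. As a rough illustration only, the sketch below assumes it is the fraction of benchmark items on which the model's verdict (undesirable or not) agrees with the human label. The `Judgment` record, its field names, and the sample actions are hypothetical and are not taken from the ASIMOV Benchmark or its data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One evaluated item. Hypothetical structure, not the ASIMOV schema."""
    action: str
    human_says_undesirable: bool
    model_says_undesirable: bool


def alignment_rate(judgments: list[Judgment]) -> float:
    """Fraction of items where the model's verdict matches the human label.

    Assumption: this agreement ratio stands in for the paper's metric,
    whose exact definition is not given in the abstract.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    agreed = sum(
        j.human_says_undesirable == j.model_says_undesirable for j in judgments
    )
    return agreed / len(judgments)


# Made-up items for illustration; the model disagrees with the human on one.
SAMPLE = [
    Judgment("hand a cup of water to a patient", False, False),
    Judgment("pour hot liquid next to a child's hand", True, True),
    Judgment("swing a knife across a crowded table", True, False),
    Judgment("wipe a dry countertop", False, False),
]

rate = alignment_rate(SAMPLE)
print(f"alignment rate: {rate:.1%}")
```

Under this assumed definition, comparing constitutions amounts to recomputing the same ratio with the model prompted by each constitution in turn (or by none, for the baseline).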

arXiv information

Authors	Pierre Sermanet,Anirudha Majumdar,Alex Irpan,Dmitry Kalashnikov,Vikas Sindhwani
Published	2025-03-11 17:50:47+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
