SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning

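Abstract

Vision-language-action models (VLAs) show potential as generalist robot policies.
However, these models pose extreme safety challenges during real-world deployment, including the risk of harm to the environment, the robot itself, and humans.
How can safety constraints be explicitly integrated into VLAs?
We address this question by exploring an integrated safety approach (ISA) that systematically models safety requirements, actively elicits diverse unsafe behaviors, effectively constrains VLA policies via safe reinforcement learning, and rigorously assures their safety through targeted evaluations.
Leveraging the constrained Markov decision process (CMDP) paradigm, ISA optimizes VLAs from a min-max perspective against the elicited safety risks.
Policies aligned through this comprehensive approach achieve the following key features: (i) effective safety-performance trade-offs, with an 83.58% safety improvement over the current state-of-the-art method while maintaining task performance (+3.85%);
(ii) strong safety assurance, with the ability to mitigate long-tail risks and handle extreme failure scenarios;
(iii) robust generalization of learned safety behaviors to various out-of-distribution perturbations.
Our data, models, and newly proposed benchmark environment are available at https://pku-safevla.github.io.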
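
The abstract does not spell out the optimization objective, so the following is only the textbook CMDP formulation and its Lagrangian relaxation, given as one common way to read "min-max". The symbols (reward return J_r, cost return J_c, cost limit d, multiplier λ, discount γ) are assumptions for illustration and may differ from the paper's notation.

```latex
\begin{aligned}
&\max_{\pi}\; J_r(\pi) \quad \text{s.t.} \quad J_c(\pi) \le d, \\
&J_r(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big], \qquad
 J_c(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\Big], \\
&\min_{\lambda \ge 0}\; \max_{\pi}\; \Big[\, J_r(\pi) - \lambda \big( J_c(\pi) - d \big) \Big].
\end{aligned}
```

In this reading, the policy maximizes task return while the multiplier λ grows whenever the expected cost exceeds the limit d, which raises the penalty on unsafe behavior.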

Abstract (original)

Vision-language-action models (VLAs) show potential as generalist robot policies. However, these models pose extreme safety challenges during real-world deployment, including the risk of harm to the environment, the robot itself, and humans. How can safety constraints be explicitly integrated into VLAs? We address this by exploring an integrated safety approach (ISA), systematically modeling safety requirements, then actively eliciting diverse unsafe behaviors, effectively constraining VLA policies via safe reinforcement learning, and rigorously assuring their safety through targeted evaluations. Leveraging the constrained Markov decision process (CMDP) paradigm, ISA optimizes VLAs from a min-max perspective against elicited safety risks. Thus, policies aligned through this comprehensive approach achieve the following key features: (I) effective safety-performance trade-offs, this exploration yields an 83.58% safety improvement compared to the current state-of-the-art method, while also maintaining task performance (+3.85%). (II) strong safety assurance, with the ability to mitigate long-tail risks and handle extreme failure scenarios. (III) robust generalization of learned safety behaviors to various out-of-distribution perturbations. Our data, models and newly proposed benchmark environment are available at https://pku-safevla.github.io.
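
As a rough illustration of how constrained learning can trade task performance against a safety cost, here is a toy primal-dual (Lagrangian) update on a one-parameter problem. It is a minimal sketch, not the paper's method: the function `primal_dual`, the quadratic "return", the linear "cost", and all step sizes are hypothetical choices made only so the example runs on its own.

```python
def primal_dual(steps=2000, lr_theta=0.05, lr_lambda=0.05, cost_limit=1.0):
    """Toy primal-dual ascent/descent on a constrained scalar problem.

    Hypothetical stand-ins (not from the paper):
      return  J_r(theta) = -(theta - 2)^2   (maximized at theta = 2)
      cost    J_c(theta) = theta            (must satisfy J_c <= cost_limit)
    """
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian J_r - lam * (J_c - cost_limit) w.r.t. theta.
        grad_theta = -2.0 * (theta - 2.0) - lam
        # Constraint violation drives the multiplier update.
        violation = theta - cost_limit
        # Primal step: ascend the Lagrangian in theta.
        theta = theta + lr_theta * grad_theta
        # Dual step: raise lam when the constraint is violated, keep lam >= 0.
        lam = max(0.0, lam + lr_lambda * violation)
    return theta, lam


if __name__ == "__main__":
    theta, lam = primal_dual()
    print(f"theta = {theta:.4f}, lambda = {lam:.4f}")
```

Without the constraint the parameter would settle at 2. With it, the iterates approach the constrained optimum at theta = 1, where the multiplier balances the reward gradient (lambda = 2).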

arXiv information

Authors	Borong Zhang, Yuhao Zhang, Jiaming Ji, Yingshan Lei, Josef Dai, Yuanpei Chen, Yaodong Yang
Published	2025-05-31 14:22:50+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
