InDRiVE: Intrinsic Disagreement based Reinforcement for Vehicle Exploration through Curiosity Driven Generalized World Model

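Summary

Model-based reinforcement learning (MBRL) has emerged as a promising paradigm for autonomous driving, where data efficiency and robustness are critical.
However, existing solutions often depend on carefully crafted, task-specific extrinsic rewards, which limits generalization to new tasks and environments.
This paper proposes InDRiVE (Intrinsic Disagreement based Reinforcement for Vehicle Exploration), a method that uses purely intrinsic, disagreement-based rewards within a Dreamer-based MBRL framework.
By training an ensemble of world models, the agent actively explores high-uncertainty regions of the environment without any task-specific feedback.
This approach yields a task-agnostic latent representation, enabling rapid zero-shot or few-shot fine-tuning on downstream driving tasks such as lane following and collision avoidance.
Experimental results in both seen and unseen environments show that InDRiVE achieves higher success rates and fewer infractions than the DreamerV2 and DreamerV3 baselines, despite using fewer training steps.
The findings highlight the effectiveness of purely intrinsic exploration for learning robust vehicle control behaviors, paving the way for more scalable and adaptable autonomous driving systems.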

Summary (original)

Model-based Reinforcement Learning (MBRL) has emerged as a promising paradigm for autonomous driving, where data efficiency and robustness are critical. Yet, existing solutions often rely on carefully crafted, task-specific extrinsic rewards, limiting generalization to new tasks or environments. In this paper, we propose InDRiVE (Intrinsic Disagreement based Reinforcement for Vehicle Exploration), a method that leverages purely intrinsic, disagreement-based rewards within a Dreamer-based MBRL framework. By training an ensemble of world models, the agent actively explores high-uncertainty regions of environments without any task-specific feedback. This approach yields a task-agnostic latent representation, allowing for rapid zero-shot or few-shot fine-tuning on downstream driving tasks such as lane following and collision avoidance. Experimental results in both seen and unseen environments demonstrate that InDRiVE achieves higher success rates and fewer infractions compared to DreamerV2 and DreamerV3 baselines, despite using significantly fewer training steps. Our findings highlight the effectiveness of purely intrinsic exploration for learning robust vehicle control behaviors, paving the way for more scalable and adaptable autonomous driving systems.
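
The abstract does not state how the disagreement-based reward is computed. As a rough illustration only, one common way to turn ensemble disagreement into a scalar reward is to take the variance of the ensemble members' predictions for the next latent state and average it over latent dimensions. The sketch below assumes that formulation; the function name `disagreement_reward` and the list-of-lists input layout are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def disagreement_reward(predictions):
    """Mean per-dimension variance across ensemble predictions.

    predictions: list of K equal-length lists, one predicted next-latent
    vector per ensemble member (hypothetical layout, assumed here).
    """
    if len(predictions) < 2:
        raise ValueError("need at least two ensemble members")
    dims = len(predictions[0])
    if any(len(p) != dims for p in predictions):
        raise ValueError("all predictions must have the same length")
    # zip(*predictions) groups the K predicted values for each latent dimension.
    per_dim_variance = [pvariance(values) for values in zip(*predictions)]
    # Average over dimensions to obtain one scalar intrinsic reward.
    return fmean(per_dim_variance)


# Members that agree give zero reward; members that differ give a positive one.
agree = [[0.5, 1.0], [0.5, 1.0], [0.5, 1.0]]
disagree = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
print(disagreement_reward(agree))
print(disagreement_reward(disagree))
```

Under this assumption, states where the ensemble members predict different outcomes receive a larger reward, which is what would push an agent toward high-uncertainty regions without any task-specific signal.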

arXiv information

Authors	Feeza Khan Khanzada, Jaerock Kwon
Published	2025-03-07 16:56:00+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
