Simulator Ensembles for Trustworthy Autonomous Driving Testing

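Summary

Scenario-based testing with driving simulators is widely used to identify failure conditions of automated driving assistance systems (ADAS) and to reduce the amount of in-field road testing.
However, existing studies show that repeated test executions, in the same simulator as well as in different simulators, can yield different outcomes. This can be attributed to sources of flakiness or to differing implementations of the physics, among other factors.
This paper presents MultiSim, a novel approach to multi-simulation ADAS testing. It builds on search-based testing and uses an ensemble of simulators to identify failure-inducing, simulator-agnostic test scenarios.
During the search, each scenario is evaluated jointly on multiple simulators.
Scenarios that produce consistent results across simulators are prioritized for further exploration, whereas scenarios that fail on only a subset of simulators receive lower priority, since they may reflect simulator-specific issues rather than generalizable failures.
In our case study, which tests a deep neural network-based ADAS on different pairs of three widely used simulators, MultiSim outperforms single-simulator testing, achieving on average a 51% higher rate of simulator-agnostic failures.
Compared to a state-of-the-art multi-simulator approach that combines the outcomes of independent test generation campaigns run in different simulators, MultiSim identifies 54% more simulator-agnostic failing tests while showing a comparable validity rate.
An enhancement of MultiSim that uses surrogate models to predict simulator disagreements and bypass executions not only increases the average number of valid failures but also improves efficiency in finding the first valid failure.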
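
The joint evaluation and prioritization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fitness function or the search algorithm, so everything here is an assumption made for illustration. In particular, the convention that a score below zero means the ADAS failed, the two toy simulators, and the ranking rule (consistent outcomes first, then the most optimistic score across simulators) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical conventions (not from the paper): a scenario is a vector of
# parameters, and a simulator returns a score where score < 0 means failure.
Scenario = Sequence[float]
Simulator = Callable[[Scenario], float]


@dataclass(frozen=True)
class JointResult:
    scenario: Tuple[float, ...]
    scores: Tuple[float, ...]  # one score per simulator, at least one expected

    @property
    def failures(self) -> Tuple[bool, ...]:
        # Per-simulator verdict under the assumed "score < 0" convention.
        return tuple(score < 0.0 for score in self.scores)

    @property
    def consistent(self) -> bool:
        # True when every simulator reports the same verdict.
        return len(set(self.failures)) == 1

    @property
    def simulator_agnostic_failure(self) -> bool:
        # True when the scenario fails in every simulator of the ensemble.
        return all(self.failures)


def evaluate_jointly(scenario: Scenario, simulators: Sequence[Simulator]) -> JointResult:
    """Run one scenario on every simulator in the ensemble."""
    return JointResult(tuple(scenario), tuple(sim(scenario) for sim in simulators))


def prioritize(results: Sequence[JointResult]) -> List[JointResult]:
    """Rank consistent scenarios first; within each group, lower scores first.

    Using max(scores) means a scenario only counts as close to failure when
    every simulator agrees that it is. This rule is an illustrative choice.
    """
    return sorted(results, key=lambda r: (not r.consistent, max(r.scores)))


# Toy stand-ins for real simulators, purely hypothetical.
def toy_simulator_a(scenario: Scenario) -> float:
    return scenario[0] - 1.0


def toy_simulator_b(scenario: Scenario) -> float:
    return scenario[0] - 0.5
```

With these toy simulators, a scenario such as `[0.2]` fails in both and is ranked first, `[2.0]` passes in both and comes next, and `[0.7]` fails only in `toy_simulator_a` and is ranked last as a likely simulator-specific issue.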

Abstract (original)

Scenario-based testing with driving simulators is extensively used to identify failing conditions of automated driving assistance systems (ADAS) and reduce the amount of in-field road testing. However, existing studies have shown that repeated test execution in the same as well as in distinct simulators can yield different outcomes, which can be attributed to sources of flakiness or different implementations of the physics, among other factors. In this paper, we present MultiSim, a novel approach to multi-simulation ADAS testing based on a search-based testing approach that leverages an ensemble of simulators to identify failure-inducing, simulator-agnostic test scenarios. During the search, each scenario is evaluated jointly on multiple simulators. Scenarios that produce consistent results across simulators are prioritized for further exploration, while those that fail on only a subset of simulators are given less priority, as they may reflect simulator-specific issues rather than generalizable failures. Our case study, which involves testing a deep neural network-based ADAS on different pairs of three widely used simulators, demonstrates that MultiSim outperforms single-simulator testing by achieving on average a higher rate of simulator-agnostic failures by 51%. Compared to a state-of-the-art multi-simulator approach that combines the outcome of independent test generation campaigns obtained in different simulators, MultiSim identifies 54% more simulator-agnostic failing tests while showing a comparable validity rate. An enhancement of MultiSim that leverages surrogate models to predict simulator disagreements and bypass executions does not only increase the average number of valid failures but also improves efficiency in finding the first valid failure.
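
The surrogate-based enhancement mentioned at the end of the abstract can be sketched in the same spirit. The abstract does not say which surrogate models are used or how a bypass decision is made, so the following is only a hypothetical illustration: a one-nearest-neighbour predictor over previously executed scenarios, a fixed distance threshold `max_distance`, and toy simulators with the assumed convention that a score below zero means failure.

```python
import math
from typing import Callable, List, Sequence, Tuple

Scenario = Sequence[float]
Simulator = Callable[[Scenario], float]  # assumed: score < 0 means failure


class NearestNeighbourSurrogate:
    """Hypothetical surrogate that predicts simulator disagreement from history."""

    def __init__(self, max_distance: float = 0.1) -> None:
        self.max_distance = max_distance
        self._history: List[Tuple[Tuple[float, ...], bool]] = []

    def record(self, scenario: Scenario, disagreed: bool) -> None:
        self._history.append((tuple(scenario), disagreed))

    def predicts_disagreement(self, scenario: Scenario) -> bool:
        if not self._history:
            return False  # nothing learned yet, so never bypass
        nearest, disagreed = min(
            self._history, key=lambda item: math.dist(item[0], scenario)
        )
        # Only trust the prediction when the known scenario is close enough.
        return disagreed and math.dist(nearest, scenario) <= self.max_distance


def run_with_bypass(candidates, simulators, surrogate):
    """Execute candidates unless the surrogate predicts a disagreement."""
    executed, skipped = [], []
    for scenario in candidates:
        if surrogate.predicts_disagreement(scenario):
            skipped.append(tuple(scenario))  # bypass the costly simulations
            continue
        verdicts = {sim(scenario) < 0.0 for sim in simulators}
        surrogate.record(scenario, disagreed=len(verdicts) > 1)
        executed.append(tuple(scenario))
    return executed, skipped


# Toy stand-ins for real simulators, purely hypothetical.
def toy_simulator_a(scenario: Scenario) -> float:
    return scenario[0] - 1.0


def toy_simulator_b(scenario: Scenario) -> float:
    return scenario[0] - 0.5
```

In this sketch, once `[0.7]` has been executed and found to split the two toy simulators, a nearby candidate such as `[0.75]` is skipped, while a distant one such as `[0.2]` is still executed. A real surrogate would presumably be a trained model rather than a nearest-neighbour lookup.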

arXiv information

Authors	Lev Sorokin, Matteo Biagiola, Andrea Stocco
Published	2025-03-11 22:34:14+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
