Understanding and Modeling the Effects of Task and Context on Drivers’ Gaze Allocation

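Summary

Understanding what drivers look at matters for many applications beyond self-driving, including driver training, monitoring, and assistance. Traditionally, the factors that affect human visual attention have been divided into bottom-up (involuntary attraction to salient regions) and top-down (task- and context-driven). Both play a role in drivers' gaze allocation, yet most existing modeling approaches apply techniques developed for bottom-up saliency and do not explicitly account for the influence of task and context. Likewise, common driving attention benchmarks lack relevant task and context annotations. To enable analysis and modeling of these factors for driver gaze prediction, we propose the following: 1) a revision of the popular DR(eye)VE dataset that addresses some of its shortcomings and extends it with per-frame annotations for driving task and context; 2) a benchmark of a number of baseline and SOTA models for saliency and driver gaze prediction, analyzed with respect to the new annotations; and 3) a novel model that modulates driver gaze prediction with explicit action and context information, which significantly improves SOTA performance on DR(eye)VE overall (by 24% KLD and 89% NSS) and on a subset of action and safety-critical intersection scenarios (by 10–30% KLD). The extended annotations and the code for the model and evaluation will be made publicly available.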

Summary (original)

Understanding what drivers look at is important for many applications, including driver training, monitoring, and assistance, as well as self-driving. Traditionally, factors affecting human visual attention have been divided into bottom-up (involuntary attraction to salient regions) and top-down (task- and context-driven). Although both play a role in drivers’ gaze allocation, most of the existing modeling approaches apply techniques developed for bottom-up saliency and do not consider task and context influences explicitly. Likewise, common driving attention benchmarks lack relevant task and context annotations. Therefore, to enable analysis and modeling of these factors for drivers’ gaze prediction, we propose the following: 1) address some shortcomings of the popular DR(eye)VE dataset and extend it with per-frame annotations for driving task and context; 2) benchmark a number of baseline and SOTA models for saliency and driver gaze prediction and analyze them w.r.t. the new annotations; and finally, 3) a novel model that modulates drivers’ gaze prediction with explicit action and context information, and as a result significantly improves SOTA performance on DR(eye)VE overall (by 24% KLD and 89% NSS) and on a subset of action and safety-critical intersection scenarios (by 10–30% KLD). Extended annotations, code for model and evaluation will be made publicly available.
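The abstract reports its gains in KLD and NSS without defining them. As a rough illustration only, the sketch below follows the definitions commonly used in saliency benchmarking: KLD is the Kullback-Leibler divergence between the ground-truth and predicted saliency distributions (lower is better), and NSS is the mean of the standardized predicted saliency at fixated pixels (higher is better). This is not the authors' evaluation code; the function names, the flattened-list map representation, and the `EPS` value are assumptions made for the example.

```python
import math

EPS = 1e-7  # assumed small constant that keeps log and division well-defined


def kld(pred, gt):
    """KL divergence between ground-truth map `gt` and predicted map `pred`.

    Both maps are flattened lists of non-negative values; each is
    normalized to sum to 1 before comparison. Lower is better.
    """
    p_sum, q_sum = sum(pred), sum(gt)
    total = 0.0
    for p, q in zip(pred, gt):
        p, q = p / p_sum, q / q_sum
        total += q * math.log(EPS + q / (p + EPS))
    return total


def nss(pred, fixations):
    """Normalized Scanpath Saliency.

    `pred` is a flattened saliency map and `fixations` is a list of indices
    of fixated pixels. The map is standardized to zero mean and unit
    standard deviation, then averaged over the fixated pixels.
    Higher is better. A constant map (zero deviation) is not handled.
    """
    n = len(pred)
    mean = sum(pred) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in pred) / n)
    return sum((pred[i] - mean) / std for i in fixations) / len(fixations)


# Toy 4-pixel example with made-up values (not data from the paper).
pred = [0.1, 0.2, 0.6, 0.1]
gt = [0.0, 0.1, 0.8, 0.1]
print("KLD:", kld(pred, gt))
print("NSS:", nss(pred, [2]))
```

Under these definitions, a percentage improvement in KLD corresponds to a lower divergence and a percentage improvement in NSS to a higher score.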

arXiv information

Authors	Iuliia Kotseruba, John K. Tsotsos
Published	2023-10-13 17:38:41+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
