Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection

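Summary

Driver distraction detection is an important computer vision problem that can play a key role in improving traffic safety and reducing traffic accidents.
This paper proposes a Vision Transformer (ViT) based approach for detecting driver distraction.
Specifically, it develops a multi-modal Vision Transformer (ViT-DD) that exploits the inductive information contained in the signals for distraction detection as well as driver emotion recognition.
In addition, a semi-supervised learning algorithm is designed so that driver data without emotion labels can be included in the supervised multi-task training of ViT-DD.
Extensive experiments on the SFDDD and AUCDD datasets show that the proposed ViT-DD outperforms state-of-the-art approaches to driver distraction detection by 6.5% and 0.9%, respectively.
The source code is available at https://github.com/PurdueDigitalTwin/ViT-DD.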

Abstract (original)

Driver distraction detection is an important computer vision problem that can play a crucial role in enhancing traffic safety and reducing traffic accidents. In this paper, a Vision Transformer (ViT) based approach for driver distraction detection is proposed. Specifically, a multi-modal Vision Transformer (ViT-DD) is developed, which exploits inductive information contained in signals of distraction detection as well as driver emotion recognition. Further, a semi-supervised learning algorithm is designed to include driver data without emotion labels into the supervised multi-task training of ViT-DD. Extensive experiments conducted on the SFDDD and AUCDD datasets demonstrate that the proposed ViT-DD outperforms the state-of-the-art approaches for driver distraction detection by 6.5% and 0.9%, respectively. Our source code is released at https://github.com/PurdueDigitalTwin/ViT-DD.
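
The abstract does not spell out how samples without emotion labels enter the multi-task training. As a rough illustration only, the sketch below shows one generic way to train two classification heads on partially labeled data: the emotion loss is simply skipped for samples whose emotion label is missing. This is an assumption for illustration, not the ViT-DD algorithm; the function names, the `None` convention for missing labels, and the `emotion_weight` parameter are all hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from raw logits (stable log-softmax)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multitask_loss(distraction_logits, distraction_labels,
                   emotion_logits, emotion_labels, emotion_weight=1.0):
    """Mean distraction loss plus a weighted mean emotion loss.

    Hypothetical convention: emotion_labels[i] is None when sample i has no
    emotion label. Such samples contribute to the distraction term only.
    """
    n = len(distraction_labels)
    distraction = sum(
        cross_entropy(logits, y)
        for logits, y in zip(distraction_logits, distraction_labels)
    ) / n

    # Keep only the samples that actually carry an emotion label.
    labeled = [
        (logits, y)
        for logits, y in zip(emotion_logits, emotion_labels)
        if y is not None
    ]
    emotion = (
        sum(cross_entropy(logits, y) for logits, y in labeled) / len(labeled)
        if labeled else 0.0
    )
    return distraction + emotion_weight * emotion
```

With this convention, a batch in which no sample has an emotion label reduces to the plain distraction loss, so labeled and unlabeled driver images can share the same training loop.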

arXiv information

Authors	Yunsheng Ma, Ziran Wang
Published	2022-09-19 16:56:51+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
