ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection

Summary

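Driver distraction detection is an important computer vision problem that can play a key role in improving traffic safety and reducing traffic accidents.
This paper proposes a novel semi-supervised method for detecting driver distraction, based on the Vision Transformer (ViT).
Specifically, the authors develop a multi-modal Vision Transformer (ViT-DD) that exploits the inductive information contained in the training signals of both distraction detection and driver emotion recognition.
In addition, a self-learning algorithm is designed so that driver data without emotion labels can be included in the multi-task training of ViT-DD.
Extensive experiments on the SFDDD and AUCDD datasets show that the proposed ViT-DD outperforms the best state-of-the-art approaches to driver distraction detection by 6.5% and 0.9%, respectively.
The source code is available at https://github.com/PurdueDigitalTwin/ViT-DD.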
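
The summary does not spell out how unlabeled emotion data enters the multi-task training, so the following is only a minimal sketch of one common way such a scheme could look, not the authors' implementation. It assumes a hypothetical teacher model that outputs emotion probabilities, a hypothetical confidence `threshold` for accepting a pseudo-label, and a hypothetical `emotion_weight` that balances the two task losses; none of these names or values come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss for a single sample with an integer class label."""
    return -math.log(softmax(logits)[label])


def pseudo_label(teacher_probs, threshold=0.8):
    """Return the teacher's top class if it is confident enough, else None.

    `threshold` is an illustrative assumption, not a value from the paper.
    """
    best = max(range(len(teacher_probs)), key=teacher_probs.__getitem__)
    return best if teacher_probs[best] >= threshold else None


def multitask_loss(distraction_logits, distraction_label,
                   emotion_logits, emotion_label, emotion_weight=1.0):
    """Distraction loss plus a weighted emotion loss.

    The emotion term is skipped when no (pseudo-)label is available,
    so every sample still contributes to the distraction task.
    """
    loss = cross_entropy(distraction_logits, distraction_label)
    if emotion_label is not None:
        loss += emotion_weight * cross_entropy(emotion_logits, emotion_label)
    return loss


# Usage: a sample with no human emotion label gets one from the teacher.
teacher_probs = [0.05, 0.90, 0.05]          # hypothetical teacher output
emotion_label = pseudo_label(teacher_probs)  # accepted only if confident
loss = multitask_loss([2.0, 0.0], 0, [0.0, 1.0, 0.0], emotion_label)
```

In this sketch, a sample whose pseudo-label is rejected still trains the distraction head, which is one simple way to keep all driver images in the multi-task training.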

Summary (original)

Driver distraction detection is an important computer vision problem that can play a crucial role in enhancing traffic safety and reducing traffic accidents. This paper proposes a novel semi-supervised method for detecting driver distractions based on Vision Transformer (ViT). Specifically, a multi-modal Vision Transformer (ViT-DD) is developed that makes use of inductive information contained in training signals of distraction detection as well as driver emotion recognition. Further, a self-learning algorithm is designed to include driver data without emotion labels into the multi-task training of ViT-DD. Extensive experiments conducted on the SFDDD and AUCDD datasets demonstrate that the proposed ViT-DD outperforms the best state-of-the-art approaches for driver distraction detection by 6.5% and 0.9%, respectively. Our source code is released at https://github.com/PurdueDigitalTwin/ViT-DD.

arXiv information

Authors	Yunsheng Ma, Ziran Wang
Published	2022-09-28 16:16:13+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
