Video Anomaly Detection in 10 Years: A Survey and Outlook

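Summary

Video anomaly detection (VAD) is critically important across diverse domains such as surveillance, healthcare, and environmental monitoring.
Many surveys focus on conventional VAD methods, but they often lack depth when exploring specific approaches and emerging trends.
This survey examines deep learning-based VAD, going beyond traditional supervised training paradigms to cover emerging weakly supervised, self-supervised, and unsupervised approaches.
A distinguishing feature of this review is its investigation of the core challenges within the VAD paradigms, including large-scale datasets, feature extraction, learning methods, loss functions, regularization, and anomaly score prediction.
The review also investigates vision language models (VLMs) as powerful feature extractors for VAD.
VLMs integrate visual data with textual descriptions or spoken language from videos, enabling the nuanced scene understanding that anomaly detection depends on.
By addressing these challenges and proposing future research directions, the review aims to foster robust and efficient VAD systems that leverage the capabilities of VLMs to improve anomaly detection in complex real-world scenarios.
This comprehensive analysis seeks to bridge existing knowledge gaps, give researchers valuable insights, and help shape the future of VAD research.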

Abstract (original)

Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring. While numerous surveys focus on conventional VAD methods, they often lack depth in exploring specific approaches and emerging trends. This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches. A prominent feature of this review is the investigation of core challenges within the VAD paradigms including large-scale datasets, features extraction, learning methods, loss functions, regularization, and anomaly score prediction. Moreover, this review also investigates the vision language models (VLMs) as potent feature extractors for VAD. VLMs integrate visual data with textual descriptions or spoken language from videos, enabling a nuanced understanding of scenes crucial for anomaly detection. By addressing these challenges and proposing future research directions, this review aims to foster the development of robust and efficient VAD systems leveraging the capabilities of VLMs for enhanced anomaly detection in complex real-world scenarios. This comprehensive analysis seeks to bridge existing knowledge gaps, provide researchers with valuable insights, and contribute to shaping the future of VAD research.

arXiv information

Authors	Moshira Abdalla, Sajid Javed, Muaz Al Radi, Anwaar Ulhaq, Naoufel Werghi
Published	2024-07-01 02:31:53+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
