Learning Multi-view Anomaly Detection

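Summary

This study explores the recently proposed and challenging task of multi-view anomaly detection (AD).
Single-view approaches have blind spots that are only visible from other perspectives, which makes sample-level predictions inaccurate.
We therefore introduce the \textbf{M}ulti-\textbf{V}iew \textbf{A}nomaly \textbf{D}etection (\textbf{MVAD}) framework, which learns and integrates features from multiple views.
Specifically, we propose the \textbf{M}ulti-\textbf{V}iew \textbf{A}daptive \textbf{S}election (\textbf{MVAS}) algorithm for feature learning and fusion across views.
Feature maps are divided into neighbourhood attention windows, and a semantic correlation matrix is computed between each single-view window and the windows of all other views. Attention is then conducted between each single-view window and its top-K most correlated multi-view windows.
Adjusting the window size and top-K can reduce the computational complexity to linear.
Extensive experiments on the Real-IAD dataset in the cross-setting (multi-class and single-class) validate the effectiveness of the approach, which achieves state-of-the-art performance at the sample (\textbf{4.1\%}$\uparrow$), image (\textbf{5.6\%}$\uparrow$), and pixel (\textbf{6.7\%}$\uparrow$) levels
across a total of ten metrics, with only \textbf{18M} parameters and less GPU memory and training time.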
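
The window selection described in the abstract can be pictured with a small sketch. This is not the authors' implementation: it assumes each window is summarised by the mean of its token vectors, that correlation is a plain dot product, and that keys double as values with no learned projections. All function names (`top_k_windows`, `window_attention`, `fuse_view`) are hypothetical, and pure Python is used only to keep the example dependency-free.

```python
import math

def mean_vector(tokens):
    """Average the token vectors of one window into a single descriptor."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_windows(query_window, other_windows, k):
    """Indices of the k windows from other views most correlated with the query."""
    q = mean_vector(query_window)
    scores = [dot(q, mean_vector(w)) for w in other_windows]
    ranked = sorted(range(len(other_windows)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def window_attention(query_window, selected_windows):
    """Scaled dot-product attention from query tokens to the selected windows' tokens.

    Assumption: keys are reused as values (no learned projections).
    """
    keys = [tok for w in selected_windows for tok in w]
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in query_window:
        weights = softmax([dot(q, key) / scale for key in keys])
        fused.append([sum(wt * key[d] for wt, key in zip(weights, keys))
                      for d in range(len(q))])
    return fused

def fuse_view(view_windows, other_windows, k):
    """For each window of one view, attend only to its top-k windows from other views."""
    result = []
    for win in view_windows:
        chosen = top_k_windows(win, other_windows, k)
        result.append(window_attention(win, [other_windows[i] for i in chosen]))
    return result

# Toy data: 2 windows in the query view, 3 windows pooled from the other views,
# 2 tokens per window, feature dimension 2.
view_a = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
others = [[[1.0, 0.0], [1.0, 0.0]],
          [[0.0, 1.0], [0.0, 1.0]],
          [[-1.0, -1.0], [-1.0, -1.0]]]

print(top_k_windows(view_a[0], others, 1))  # [0]
print(fuse_view(view_a, others, 1)[0])      # [[1.0, 0.0], [1.0, 0.0]]
```

With `k=1`, each query window attends to a single window from the other views, so the tokens of the dissimilar third window never enter the attention computation. That skipped work is the source of the savings the abstract attributes to tuning the window size and top-K.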

Abstract (original)

This study explores the recently proposed challenging multi-view Anomaly Detection (AD) task. Single-view tasks would encounter blind spots from other perspectives, resulting in inaccuracies in sample-level prediction. Therefore, we introduce the \textbf{M}ulti-\textbf{V}iew \textbf{A}nomaly \textbf{D}etection (\textbf{MVAD}) framework, which learns and integrates features from multi-views. Specifically, we proposed a \textbf{M}ulti-\textbf{V}iew \textbf{A}daptive \textbf{S}election (\textbf{MVAS}) algorithm for feature learning and fusion across multiple views. The feature maps are divided into neighbourhood attention windows to calculate a semantic correlation matrix between single-view windows and all other views, which is a conducted attention mechanism for each single-view window and the top-K most correlated multi-view windows. Adjusting the window sizes and top-K can minimise the computational complexity to linear. Extensive experiments on the Real-IAD dataset for cross-setting (multi/single-class) validate the effectiveness of our approach, achieving state-of-the-art performance among sample \textbf{4.1\%}$\uparrow$/ image \textbf{5.6\%}$\uparrow$/pixel \textbf{6.7\%}$\uparrow$ levels with a total of ten metrics with only \textbf{18M} parameters and fewer GPU memory and training time.
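
To give a feel for why restricting attention to the top-K windows reduces cost, the following back-of-the-envelope count compares token pairs under full cross-view attention with those under top-K window attention. It is a simplified illustration, not the paper's complexity analysis: the function names are hypothetical and the sizes (1024 tokens per view, 5 views, 64-token windows, K = 4) are chosen purely for illustration.

```python
def full_cross_view_pairs(tokens_per_view, num_views):
    """Token pairs when every token attends to every token of all other views."""
    return tokens_per_view * (num_views - 1) * tokens_per_view

def top_k_window_pairs(tokens_per_view, num_views, window_tokens, k):
    """Pairs for window-level correlation plus attention over k selected windows."""
    windows_per_view = tokens_per_view // window_tokens
    # Window-to-window correlation against the windows of all other views.
    correlation = windows_per_view * (num_views - 1) * windows_per_view
    # Each token attends to the tokens of its window's k selected windows.
    attention = tokens_per_view * k * window_tokens
    return correlation + attention

print(full_cross_view_pairs(1024, 5))      # 4194304
print(top_k_window_pairs(1024, 5, 64, 4))  # 263168
```

In this toy setting the top-K variant touches roughly 16 times fewer pairs. For a fixed K and window size, the attention term grows linearly with the number of tokens per view, while the window-level correlation term depends on how many windows there are. The abstract does not spell out how the paper balances these two terms to reach linear complexity.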

arXiv information

Authors	Haoyang He, Jiangning Zhang, Guanzhong Tian, Chengjie Wang, Lei Xie
Published	2024-07-16 17:26:34+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
