2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

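Summary

This technical report presents the winning solution of the Segment Any Anomaly team for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge.
Going beyond uni-modal prompts such as language prompts, we present Segment Any Anomaly + (SAA$+$), a novel framework for zero-shot anomaly segmentation that uses multimodal prompts to regularize cascaded modern foundation models.
Inspired by the strong zero-shot generalization ability of foundation models such as Segment Anything, we first explore their assembly (SAA) to leverage diverse multimodal prior knowledge for anomaly localization.
We then introduce multimodal prompts (SAA$+$), derived from domain expert knowledge and target image context, which enable non-parametric adaptation of the foundation models to anomaly segmentation.
The proposed SAA$+$ model achieves state-of-the-art performance in the zero-shot setting on several anomaly segmentation benchmarks, including VisA and MVTec-AD.
We will release the code of our winning solution for the CVPR2023 VAND challenge.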

Abstract (original)

This technical report introduces the winning solution of the team Segment Any Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompt, e.g., language prompt, we present a novel framework, i.e., Segment Any Anomaly + (SAA$+$), for zero-shot anomaly segmentation with multi-modal prompts for the regularization of cascaded modern foundation models. Inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly (SAA) to leverage diverse multi-modal prior knowledge for anomaly localization. Subsequently, we further introduce multimodal prompts (SAA$+$) derived from domain expert knowledge and target image context to enable the non-parameter adaptation of foundation models to anomaly segmentation. The proposed SAA$+$ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will release the code of our winning solution for the CVPR2023 VAN.

arXiv information

Authors	Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Liang Gao, Weiming Shen
Published	2023-09-05 14:44:04+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
