Evaluating Vision Transformer Models for Visual Quality Control in Industrial Manufacturing

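Summary

One of the most promising use cases for machine learning in industrial manufacturing is the early detection of defective products with a quality control system.
Such a system can save costs and reduce the human errors caused by the monotonous nature of visual inspection.
A rich body of research now uses machine learning methods to identify rare defective products in imbalanced visual quality control datasets.
These methods typically rely on two components: a visual backbone that captures the features of the input image, and an anomaly detection algorithm that decides whether those features lie within an expected distribution.
With the rise of transformer architectures as the visual backbone of choice, a wide variety of combinations of these two components now exists, spanning the full trade-off between detection quality and inference time.
Faced with this variety, practitioners often have to spend considerable time researching the right combination for the use case at hand.
Our contribution is to help practitioners with this choice by reviewing and evaluating current vision transformer models together with anomaly detection methods.
To this end, we selected state-of-the-art (SotA) models from both fields, combined them, and evaluated them with the goal of obtaining small, fast, and efficient anomaly detection models suitable for industrial manufacturing.
We evaluated the results of our experiments on the well-known MVTecAD and BTAD datasets.
In addition, we give guidelines for choosing a suitable model architecture for a quality control system in practice, taking the given use case and hardware constraints into account.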
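
The two-component structure described in the summary (a visual backbone followed by an anomaly detection algorithm) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_backbone` is a hypothetical stand-in for a pretrained vision transformer, and nearest-neighbor distance scoring is only one simple way to decide whether features fall within the expected distribution. It is not claimed to be one of the methods evaluated in the paper.

```python
import math
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]


def toy_backbone(image: Image) -> List[float]:
    """Hypothetical stand-in for a visual backbone.

    A real system would call a pretrained vision transformer here. This toy
    version only returns the mean and the maximum pixel intensity.
    """
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


class NearestNeighborDetector:
    """Scores an image by the distance of its features to the closest
    feature vector seen on defect-free training images."""

    def __init__(self, backbone: Callable[[Image], List[float]]) -> None:
        self.backbone = backbone
        self.memory: List[List[float]] = []

    def fit(self, normal_images: Sequence[Image]) -> None:
        # Only defect-free images are needed, which suits imbalanced data.
        self.memory = [self.backbone(img) for img in normal_images]

    def score(self, image: Image) -> float:
        # Euclidean distance to the nearest stored "normal" feature vector.
        feats = self.backbone(image)
        return min(math.dist(feats, m) for m in self.memory)

    def is_anomalous(self, image: Image, threshold: float) -> bool:
        return self.score(image) > threshold


# Tiny made-up 2x2 "images": two defect-free samples and one with a bright spot.
normal = [
    [[0.50, 0.52], [0.49, 0.51]],
    [[0.48, 0.50], [0.51, 0.50]],
]
defective = [[0.50, 0.52], [0.49, 0.95]]

detector = NearestNeighborDetector(toy_backbone)
detector.fit(normal)
print("normal score:", detector.score(normal[0]))
print("defective score:", detector.score(defective))
```

Because the backbone is passed in as a plain callable, a heavier or lighter feature extractor could be swapped in without touching the detector, which is the kind of combination the summary refers to when it mentions the trade-off between detection quality and inference time. The threshold passed to `is_anomalous` is also an assumption and would need to be tuned per use case.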

Summary (original)

One of the most promising use-cases for machine learning in industrial manufacturing is the early detection of defective products using a quality control system. Such a system can save costs and reduces human errors due to the monotonous nature of visual inspections. Today, a rich body of research exists which employs machine learning methods to identify rare defective products in unbalanced visual quality control datasets. These methods typically rely on two components: A visual backbone to capture the features of the input image and an anomaly detection algorithm that decides if these features are within an expected distribution. With the rise of transformer architecture as visual backbones of choice, there exists now a great variety of different combinations of these two components, ranging all along the trade-off between detection quality and inference time. Facing this variety, practitioners in the field often have to spend a considerable amount of time on researching the right combination for their use-case at hand. Our contribution is to help practitioners with this choice by reviewing and evaluating current vision transformer models together with anomaly detection methods. For this, we chose SotA models of both disciplines, combined them and evaluated them towards the goal of having small, fast and efficient anomaly detection models suitable for industrial manufacturing. We evaluated the results of our experiments on the well-known MVTecAD and BTAD datasets. Moreover, we give guidelines for choosing a suitable model architecture for a quality control system in practice, considering given use-case and hardware constraints.

arXiv information

Authors	Miriam Alber, Christoph Hönes, Patrick Baier
Published	2024-11-22 14:12:35+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
