A Tri-Layer Plugin to Improve Occluded Detection

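Abstract

Detecting occluded objects remains a challenge for state-of-the-art object detectors.
The objective of this work is to improve the detection of such objects, and thereby improve the overall performance of a modern object detector.
To this end, the authors make four contributions. (1) They propose a simple "plugin" module for the detection head of two-stage object detectors to improve the recall of partially occluded objects.
The module predicts three layers of segmentation masks, one each for the target object, the occluder, and the occludee, and by doing so predicts the mask of the target object more accurately.
(2) They propose a scalable pipeline that generates training data for the module by applying amodal completion to existing object detection and instance segmentation training datasets in order to establish occlusion relationships.
(3) They also establish a COCO evaluation dataset for measuring recall on partially occluded and separated objects.
(4) They show that the plugin module, inserted into a two-stage detector, boosts performance significantly when only the detection head is fine-tuned, with further gains when the entire architecture is fine-tuned.
COCO results are reported for Mask R-CNN with Swin-T or Swin-S backbones, and for Cascade Mask R-CNN with a Swin-B backbone.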
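
The three mask layers can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the authors' pipeline: it assumes that amodal (full-extent) masks and a front-to-back ordering are already available, and the function name `tri_layer_targets` and the `order` argument are hypothetical.

```python
import numpy as np

def tri_layer_targets(amodal_masks, order, target):
    """Build (visible target, occluder, occludee) masks for one object.

    amodal_masks: list of HxW boolean arrays, the full extent of each object.
    order: list of ints, smaller means closer to the camera (assumed input).
    target: index of the object of interest.
    """
    t = amodal_masks[target]
    occluder = np.zeros_like(t, dtype=bool)
    occludee = np.zeros_like(t, dtype=bool)
    for i, m in enumerate(amodal_masks):
        # Objects that do not overlap the target play no role in its layers.
        if i == target or not np.any(m & t):
            continue
        if order[i] < order[target]:
            occluder |= m   # in front of the target
        else:
            occludee |= m   # behind (or at the same depth as) the target
    visible = t & ~occluder  # part of the target that is not hidden
    return visible, occluder, occludee
```

Each returned array corresponds to one of the three layers named in the abstract. How the paper actually derives occlusion order from amodal completion is not specified here, so the ordering is taken as given.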

Abstract (original)

Detecting occluded objects still remains a challenge for state-of-the-art object detectors. The objective of this work is to improve the detection for such objects, and thereby improve the overall performance of a modern object detector. To this end we make the following four contributions: (1) We propose a simple ‘plugin’ module for the detection head of two-stage object detectors to improve the recall of partially occluded objects. The module predicts a tri-layer of segmentation masks for the target object, the occluder and the occludee, and by doing so is able to better predict the mask of the target object. (2) We propose a scalable pipeline for generating training data for the module by using amodal completion of existing object detection and instance segmentation training datasets to establish occlusion relationships. (3) We also establish a COCO evaluation dataset to measure the recall performance of partially occluded and separated objects. (4) We show that the plugin module inserted into a two-stage detector can boost the performance significantly, by only fine-tuning the detection head, and with additional improvements if the entire architecture is fine-tuned. COCO results are reported for Mask R-CNN with Swin-T or Swin-S backbones, and Cascade Mask R-CNN with a Swin-B backbone.
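
Contribution (3) concerns recall on a chosen subset of objects. As a rough illustration only, the Python sketch below computes box-level recall with greedy one-to-one matching. The IoU threshold of 0.5, the greedy matching rule, and the function names `box_iou` and `recall` are assumptions made for this example, not the evaluation protocol of the paper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def recall(gt_boxes, pred_boxes, iou_thr=0.5):
    """Fraction of ground-truth boxes matched by some unused prediction."""
    if not gt_boxes:
        return 0.0
    used = set()
    hits = 0
    for g in gt_boxes:
        best, best_j = 0.0, None
        for j, p in enumerate(pred_boxes):
            if j in used:
                continue  # each prediction may match one ground truth only
            v = box_iou(g, p)
            if v > best:
                best, best_j = v, j
        if best_j is not None and best >= iou_thr:
            used.add(best_j)
            hits += 1
    return hits / len(gt_boxes)
```

To measure recall on occluded objects specifically, `gt_boxes` would be restricted to the ground-truth objects labelled as partially occluded or separated.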

arXiv information

Authors	Guanqi Zhan, Weidi Xie, Andrew Zisserman
Published	2022-10-18 17:59:51+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
