Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning

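Summary

We propose ADCLR (Accurate and Dense Contrastive Representation Learning), a novel self-supervised learning framework for learning accurate and dense visual representations.
To extract spatially sensitive information, ADCLR introduces query patches for contrasting in addition to global contrasting.
Compared with previous dense contrasting methods, ADCLR has three main merits: i) it achieves both globally discriminative and spatially sensitive representations, ii) it is model-efficient (no extra parameters beyond the global contrasting baseline), and iii) it is correspondence-free and therefore simpler to implement.
Our approach achieves new state-of-the-art performance among contrastive methods.
On classification with ViT-S, ADCLR achieves 77.5% top-1 accuracy on ImageNet with linear probing, outperforming the baseline (DINO), which does not use the proposed techniques as a plug-in, by 0.5%.
With ViT-B, ADCLR achieves 79.8% and 84.0% accuracy on ImageNet with linear probing and fine-tuning, outperforming iBOT by 0.3% and 0.2%, respectively.
On dense tasks, ADCLR achieves 44.3% AP on object detection and 39.7% AP on instance segmentation on MS-COCO, outperforming the previous SOTA method SelfPatch by 2.2% and 1.2%, respectively.
On ADE20K, ADCLR outperforms SelfPatch by 1.0% mIoU and 1.2% mAcc on segmentation.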
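
The reported scores and margins imply the baseline scores, which can be recovered with simple arithmetic. The sketch below assumes the margins are absolute percentage-point differences (the abstract does not say so explicitly). The names `reported` and `implied_baseline` are illustrative only. ADE20K is left out because only margins, not absolute scores, are given for it.

```python
# Scores and margins as reported in the abstract (percent / percentage points).
# Assumption: each margin is an absolute difference, ADCLR score minus baseline score.
reported = [
    # (setting, ADCLR score, margin over baseline, baseline name)
    ("ViT-S ImageNet linear probe", 77.5, 0.5, "DINO"),
    ("ViT-B ImageNet linear probe", 79.8, 0.3, "iBOT"),
    ("ViT-B ImageNet fine-tuning", 84.0, 0.2, "iBOT"),
    ("MS-COCO detection AP", 44.3, 2.2, "SelfPatch"),
    ("MS-COCO instance segmentation AP", 39.7, 1.2, "SelfPatch"),
]


def implied_baseline(score, margin):
    """Return the baseline score implied by an ADCLR score and its margin."""
    # Round to one decimal to match the precision used in the abstract.
    return round(score - margin, 1)


# Map each setting to (baseline name, implied baseline score).
implied = {
    setting: (name, implied_baseline(score, margin))
    for setting, score, margin, name in reported
}

for setting, (name, value) in implied.items():
    print(f"{setting}: {name} = {value}")
```

These values are derived from the abstract alone and should be checked against the baseline papers before being cited.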

Summary (original)

We propose ADCLR: Accurate and Dense Contrastive Representation Learning, a novel self-supervised learning framework for learning accurate and dense vision representation. To extract spatial-sensitive information, ADCLR introduces query patches for contrasting in addition with global contrasting. Compared with previous dense contrasting methods, ADCLR mainly enjoys three merits: i) achieving both global-discriminative and spatial-sensitive representation, ii) model-efficient (no extra parameters in addition to the global contrasting baseline), and iii) correspondence-free and thus simpler to implement. Our approach achieves new state-of-the-art performance for contrastive methods. On classification tasks, for ViT-S, ADCLR achieves 77.5% top-1 accuracy on ImageNet with linear probing, outperforming our baseline (DINO) without our devised techniques as plug-in, by 0.5%. For ViT-B, ADCLR achieves 79.8%, 84.0% accuracy on ImageNet by linear probing and finetune, outperforming iBOT by 0.3%, 0.2% accuracy. For dense tasks, on MS-COCO, ADCLR achieves significant improvements of 44.3% AP on object detection, 39.7% AP on instance segmentation, outperforming previous SOTA method SelfPatch by 2.2% and 1.2%, respectively. On ADE20K, ADCLR outperforms SelfPatch by 1.0% mIoU, 1.2% mAcc on the segme

arXiv information

Authors	Shaofeng Zhang, Feng Zhu, Rui Zhao, Junchi Yan
Published	2023-06-23 07:38:09+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
