Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction

Summary

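Visual prompts offer an efficient way to address visual cross-domain problems.
In prior work, Visual Domain Prompt (VDP) first introduced domain prompts to tackle the classification Test-Time Adaptation (TTA) problem by warping image-level prompts onto the input and fine-tuning the prompts for each target domain.
However, because image-level prompts mask out continuous spatial details in the prompt-allocated region, they suffer from inaccurate contextual information and limited domain knowledge extraction, particularly on dense prediction TTA problems.
To overcome these challenges, we propose a novel Sparse Visual Domain Prompts (SVDP) approach, which keeps only minimal trainable parameters (e.g., 0.1%) in the image-level prompt and preserves more of the input's spatial information.
To better apply SVDP to extracting domain-specific knowledge, we introduce the Domain Prompt Placement (DPP) method, which adaptively allocates SVDP's trainable parameters to pixels with large distribution shifts.
Furthermore, recognizing that each target domain sample exhibits a unique domain shift, we design a Domain Prompt Updating (DPU) strategy that optimizes prompt parameters differently for each sample, facilitating efficient adaptation to the target domain.
Extensive experiments on widely used TTA and continual TTA benchmarks show that the proposed method achieves state-of-the-art performance in both semantic segmentation and depth estimation.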
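
The idea of a sparse prompt placed on high-shift pixels can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how distribution shift is measured or how the prompt is combined with the input, so the per-pixel `score_map`, the additive update, and the function names `place_sparse_prompt` and `apply_sparse_prompt` are all assumptions made for illustration. The toy example uses a single-channel 4x4 image and a much larger ratio than 0.1% so that the selection is visible.

```python
import heapq


def place_sparse_prompt(score_map, ratio=0.001):
    """Return (row, col) positions of the highest-scoring pixels.

    score_map is a hypothetical per-pixel estimate of distribution shift;
    how to compute it is an assumption, not taken from the source.
    """
    height, width = len(score_map), len(score_map[0])
    k = max(1, round(ratio * height * width))  # number of prompted pixels
    flat = [(score_map[r][c], r, c) for r in range(height) for c in range(width)]
    top = heapq.nlargest(k, flat)
    return sorted((r, c) for _, r, c in top)


def apply_sparse_prompt(image, positions, prompt):
    """Add one prompt value per selected pixel (additive update is an
    assumption); every other pixel is left unchanged."""
    out = [row[:] for row in image]  # copy so the input image is not modified
    for (r, c), delta in zip(positions, prompt):
        out[r][c] += delta
    return out


score_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.9, 0.3, 0.1],
    [0.2, 0.1, 0.8, 0.0],
    [0.0, 0.1, 0.2, 0.1],
]
image = [[1.0] * 4 for _ in range(4)]

positions = place_sparse_prompt(score_map, ratio=0.125)  # 2 of 16 pixels
prompt = [0.5] * len(positions)  # stand-in for trainable prompt parameters
prompted = apply_sparse_prompt(image, positions, prompt)
```

In this sketch only the two pixels with the largest scores receive a prompt value, so the rest of the image keeps its original spatial detail, which is the property the summary attributes to SVDP compared with a dense image-level prompt.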

Abstract (original)

The visual prompts have provided an efficient manner in addressing visual cross-domain problems. In previous works, Visual Domain Prompt (VDP) first introduces domain prompts to tackle the classification Test-Time Adaptation (TTA) problem by warping image-level prompts on the input and fine-tuning prompts for each target domain. However, since the image-level prompts mask out continuous spatial details in the prompt-allocated region, it will suffer from inaccurate contextual information and limited domain knowledge extraction, particularly when dealing with dense prediction TTA problems. To overcome these challenges, we propose a novel Sparse Visual Domain Prompts (SVDP) approach, which holds minimal trainable parameters (e.g., 0.1\%) in the image-level prompt and reserves more spatial information of the input. To better apply SVDP in extracting domain-specific knowledge, we introduce the Domain Prompt Placement (DPP) method to adaptively allocates trainable parameters of SVDP on the pixels with large distribution shifts. Furthermore, recognizing that each target domain sample exhibits a unique domain shift, we design Domain Prompt Updating (DPU) strategy to optimize prompt parameters differently for each sample, facilitating efficient adaptation to the target domain. Extensive experiments were conducted on widely-used TTA and continual TTA benchmarks, and our proposed method achieves state-of-the-art performance in both semantic segmentation and depth estimation tasks.

arXiv information

Authors	Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Yulu Gan, Zehui Chen, Shanghang Zhang
Published	2023-10-02 03:41:35+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
