Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models

Summary

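Prompt tuning and adapter tuning have shown great promise for transferring pre-trained vision-language models (VLMs) to a variety of downstream tasks.
This work proposes a new tuning method, regularized mask tuning, which masks network parameters through a learnable selection.
Inspired by neural pathways, the authors argue that the knowledge a downstream task needs already exists in the pre-trained weights and is merely concealed during the upstream pre-training stage.
To bring that knowledge back to light, the method first identifies a set of parameters that are important to the given downstream task, then attaches a binary mask to each of them, and finally optimizes the masks on the downstream data while the parameters stay frozen.
When updating the masks, a novel gradient dropout strategy regularizes the parameter selection so that the model neither forgets old knowledge nor overfits the downstream data.
Experiments on 11 datasets show that the method consistently outperforms previous alternatives.
Notably, masking only 2.56% of the parameters on average yields an 18.73% performance improvement over zero-shot CLIP.
The method is also synergistic with most existing parameter-efficient tuning methods and can further improve performance on top of them.
The project page is at https://wuw2019.github.io/RMT/.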
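
The idea of optimizing binary masks over frozen weights, with some mask gradients dropped, can be illustrated with a toy sketch. This is not the authors' implementation. The names `binarize`, `tune_masks`, and `drop_prob` are hypothetical, the model is a one-layer linear predictor rather than CLIP, the mask is trained through real-valued scores with a straight-through update (an assumption, since the summary does not say how the binary masks are optimized), and plain random dropout of gradient entries stands in for the paper's gradient dropout strategy, whose details the summary does not give. The step that selects important parameters is skipped: every weight gets a mask.

```python
import random


def binarize(scores):
    """Turn real-valued mask scores into a hard 0/1 mask."""
    return [1 if s > 0.0 else 0 for s in scores]


def tune_masks(weights, data, epochs=5, lr=0.01, drop_prob=0.5, seed=0):
    """Learn a binary mask over frozen weights on (x, y) pairs.

    Illustrative assumptions: squared-error loss, a straight-through
    update of the scores, and uniform random gradient dropout.
    """
    rng = random.Random(seed)
    # Positive initial scores: the mask starts as all ones, so the
    # model initially behaves exactly like the unmasked one.
    scores = [0.1] * len(weights)
    for _ in range(epochs):
        for x, y in data:
            mask = binarize(scores)
            pred = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
            err = pred - y
            for i, (w, xi) in enumerate(zip(weights, x)):
                # Gradient of the squared error w.r.t. mask entry i,
                # passed straight through to the underlying score.
                grad = 2.0 * err * w * xi
                # Gradient dropout: skip this entry's update with
                # probability drop_prob.
                if rng.random() < drop_prob:
                    continue
                scores[i] -= lr * grad
    # The weights list is only read, never modified.
    return binarize(scores)


if __name__ == "__main__":
    frozen = [1.0, 1.0, 2.0]
    samples = [([1.0, 1.0, 1.0], 2.0)]  # target needs only the first two weights
    print(tune_masks(frozen, samples, drop_prob=0.0))  # [1, 1, 0]
```

In this toy case the third weight overshoots the target, so its score is pushed below zero and it is masked out, while the frozen weights themselves never change. Setting `drop_prob` to 1.0 drops every gradient and leaves the initial all-ones mask untouched.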

Summary (original)

Prompt tuning and adapter tuning have shown great potential in transferring pre-trained vision-language models (VLMs) to various downstream tasks. In this work, we design a new type of tuning method, termed as regularized mask tuning, which masks the network parameters through a learnable selection. Inspired by neural pathways, we argue that the knowledge required by a downstream task already exists in the pre-trained weights but just gets concealed in the upstream pre-training stage. To bring the useful knowledge back into light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each parameter, and finally optimize these masks on the downstream data with the parameters frozen. When updating the mask, we introduce a novel gradient dropout strategy to regularize the parameter selection, in order to prevent the model from forgetting old knowledge and overfitting the downstream data. Experimental results on 11 datasets demonstrate the consistent superiority of our method over previous alternatives. It is noteworthy that we manage to deliver 18.73% performance improvement compared to the zero-shot CLIP via masking an average of only 2.56% parameters. Furthermore, our method is synergistic with most existing parameter-efficient tuning methods and can boost the performance on top of them. Project page can be found here (https://wuw2019.github.io/RMT/).

arXiv information

Authors	Kecheng Zheng, Wei Wu, Ruili Feng, Kai Zhu, Jiawei Liu, Deli Zhao, Zheng-Jun Zha, Wei Chen, Yujun Shen
Published	2023-07-27 17:56:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
