Gradient constrained sharpness-aware prompt learning for vision-language models

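Summary

This paper targets a new trade-off problem in generalizable prompt learning for vision-language models (VLMs): improving performance on unseen classes while maintaining performance on seen classes.
Compared with existing generalizable methods, which neglect the degradation on seen classes, this problem setting is stricter and fits practical applications more closely.
To solve this problem, we start from the optimization perspective and leverage the relationship between loss-landscape geometry and a model's generalization ability.
By analyzing the loss landscapes of the state-of-the-art method and of the widely used Sharpness-aware Minimization (SAM), we conclude that trade-off performance correlates with both loss value and loss sharpness, and that each of the two is indispensable.
However, we find that the optimizing gradient of existing methods cannot always maintain high consistency with both loss value and loss sharpness throughout the whole optimization procedure.
To this end, we propose a novel SAM-based method for prompt learning, Gradient Constrained Sharpness-aware Context Optimization (GCSCoOp), which dynamically constrains the optimizing gradient and thereby achieves the two-fold optimization objective above simultaneously.
Extensive experiments verify the effectiveness of GCSCoOp on the trade-off problem.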
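
For readers unfamiliar with SAM, the following is a minimal sketch of one generic SAM update on a toy problem. It is not GCSCoOp: the abstract does not specify how the gradient constraint is computed, so that part is not shown. The quadratic loss, the function names (`loss`, `grad`, `sam_step`), and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import math

def loss(w):
    # Toy quadratic loss, chosen only for illustration (not from the paper).
    return 0.5 * (3.0 * w[0] ** 2 + 0.5 * w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy loss above.
    return [3.0 * w[0], 0.5 * w[1]]

def sam_step(w, lr=0.1, rho=0.05):
    """One generic SAM update: perturb toward higher loss, then descend."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    # Perturbation of radius rho along the normalized gradient direction.
    eps = [rho * gi / norm for gi in g]
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    # The gradient at the perturbed point drives the actual parameter update,
    # which is what makes the step sensitive to local sharpness.
    g_sam = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_sam)]

# Hypothetical starting point and step count.
w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w)
```

The update uses the gradient taken at the perturbed point rather than at the current parameters, so it depends on loss sharpness as well as loss value, the two quantities the summary identifies as jointly necessary.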

Abstract (original)

This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLM), i.e., improving the performance on unseen classes while maintaining the performance on seen classes. Comparing with existing generalizable methods that neglect the seen classes degradation, the setting of this problem is more strict and fits more closely with practical applications. To solve this problem, we start from the optimization perspective, and leverage the relationship between loss landscape geometry and model generalization ability. By analyzing the loss landscape of the state-of-the-art method and the widely-used Sharpness-aware Minimization (SAM), we conclude that the trade-off performance correlates to both loss value and loss sharpness, while each of them are indispensable. However, we find the optimizing gradient of existing methods cannot always maintain high consistency with both loss value and loss sharpness during the whole optimization procedure. To this end, we propose an novel SAM-based method for prompt learning, denoted as Gradient Constrained Sharpness-aware Context Optimization (GCSCoOp), to dynamically constrains the optimizing gradient, thus achieving above two-fold optimization objective simultaneously. Extensive experiments verify the effectiveness of GCSCoOp in the trade-off problem.

arXiv information

Authors	Liangchen Liu, Nannan Wang, Dawei Zhou, Xinbo Gao, Decheng Liu, Xi Yang, Tongliang Liu
Published	2023-09-14 17:13:54+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
