Towards Calibrated Robust Fine-Tuning of Vision-Language Models

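Summary

Robust fine-tuning aims to preserve performance on out-of-distribution (OOD) samples, which can be compromised when a model is adapted to in-distribution (ID) samples.
However, confidence calibration, another criterion for reliable machine learning, has been overlooked despite growing demand for it in high-stakes real-world applications such as autonomous driving.
We raise concerns about the calibration of fine-tuned vision-language models (VLMs) under distribution shift by showing that naive fine-tuning, and even state-of-the-art robust fine-tuning, degrades the calibration of pre-trained VLMs, especially on OOD datasets.
We first show that the OOD calibration error is bounded from above by the ID calibration error and the domain discrepancy between ID and OOD.
Based on this analysis, we propose CaRot, a calibrated robust fine-tuning method that encourages ID calibration and robust prediction across domains in order to reduce the upper bound on the OOD calibration error.
Extensive experiments on three types of distribution shift (natural, synthetic, and adversarial) in ImageNet-1K classification demonstrate the effectiveness of CaRot across diverse environments.
We justify the empirical success of CaRot through theoretical analysis.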

Summary (original)

Robust fine-tuning aims to ensure performance on out-of-distribution (OOD) samples, which is sometimes compromised by pursuing adaptation on in-distribution (ID) samples. However, another criterion for reliable machine learning, confidence calibration, has been overlooked despite its increasing demand for real-world high-stakes applications, e.g., autonomous driving. We raise concerns about the calibration of fine-tuned vision-language models (VLMs) under distribution shift by showing that naive fine-tuning and even state-of-the-art robust fine-tuning hurt the calibration of pre-trained VLMs, especially on OOD datasets. We first show the OOD calibration error is bounded from above with ID calibration errors and domain discrepancy between ID and OOD. From this analysis, we propose CaRot, a calibrated robust fine-tuning method that incentivizes ID calibration and robust prediction across domains to reduce the upper bound of OOD calibration error. Extensive experiments on three types of distribution shifts (natural, synthetic, and adversarial) on ImageNet-1K classification demonstrate the effectiveness of CaRot across diverse environments. We justify the empirical success of CaRot through our theoretical analysis.
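
The abstract speaks of "calibration error" on ID and OOD data without defining it. One widely used way to measure it is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between accuracy and confidence in each bin. The sketch below is a minimal illustration under that assumption; the abstract does not say which metric the paper uses. The function name, the equal-width binning, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (an assumed metric, not confirmed by the abstract).

    confidences: predicted confidence per sample, in (0, 1]
                 (for example, the maximum softmax probability).
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)

    ece = 0.0
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        # Samples whose confidence falls in the half-open bin (lower, upper].
        in_bin = (confidences > lower) & (confidences <= upper)
        if not in_bin.any():
            continue
        accuracy = correct[in_bin].mean()
        avg_confidence = confidences[in_bin].mean()
        # Weight the accuracy/confidence gap by the fraction of samples in the bin.
        ece += in_bin.mean() * abs(accuracy - avg_confidence)
    return ece


# Toy data (made up for illustration): the model is 90% confident everywhere.
toy_confidences = [0.9] * 10
id_correct = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]   # 90% accurate: well calibrated
ood_correct = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 50% accurate: overconfident

ece_id = expected_calibration_error(toy_confidences, id_correct)
ece_ood = expected_calibration_error(toy_confidences, ood_correct)
print(f"ID ECE:  {ece_id:.3f}")
print(f"OOD ECE: {ece_ood:.3f}")
```

In this toy setup the confidences are identical on both sets, so the larger OOD value comes entirely from the drop in accuracy. This is the kind of overconfidence under distribution shift that the abstract raises as a concern.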

arXiv information

Authors	Changdae Oh, Hyesu Lim, Mijoo Kim, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, Kyungwoo Song
Published	2024-02-12 02:57:26+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
