Long-tailed Visual Recognition via Gaussian Clouded Logit Adjustment

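Summary

Deep neural networks have achieved great success on balanced data, but long-tailed data remains a major challenge.
In vanilla training on long-tailed data with cross-entropy loss, the instance-rich head classes are observed to severely squeeze the spatial distribution of the tail classes, which makes tail-class samples hard to classify.
Furthermore, because the gradient in softmax form rapidly approaches zero as the logit difference increases, the original cross-entropy loss can propagate gradients only briefly.
This phenomenon is called softmax saturation.
It is unfavorable for training on balanced data, but it can be used to adjust the validity of samples in long-tailed data, thereby correcting the distorted embedding space of long-tailed problems.
To this end, the paper proposes Gaussian clouded logit adjustment, which applies Gaussian perturbations of varying amplitude to the logits of different classes.
The amplitude of the perturbation is defined as the cloud size, and relatively large cloud sizes are assigned to tail classes.
A large cloud size reduces softmax saturation, which makes tail-class samples more active and enlarges the embedding space.
To alleviate the bias in the classifier, the paper also proposes a class-based effective number sampling strategy with classifier re-training.
Extensive experiments on benchmark datasets validate the superior performance of the proposed method.
The source code is available at https://github.com/Keke921/GCLLoss.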
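
The softmax saturation effect and the role of the cloud size can be illustrated with a small sketch. It is not taken from the paper or its repository: the helper names (`target_logit_grad`, `clouded_logits`), the choice to subtract `cloud_size * |eps|` from the target logit, and the numeric values are all assumptions made for illustration. The abstract states only that class logits receive Gaussian perturbations whose amplitude is larger for tail classes.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def target_logit_grad(logits, target):
    """Gradient of cross-entropy w.r.t. the target logit: p_target - 1."""
    return softmax(logits)[target] - 1.0


def clouded_logits(logits, target, cloud_size, rng, sigma=1.0):
    """Assumed perturbation (hypothetical form): lower the target logit
    by cloud_size * |eps|, with eps drawn from N(0, sigma^2)."""
    eps = abs(rng.gauss(0.0, sigma))
    out = list(logits)
    out[target] -= cloud_size * eps
    return out


if __name__ == "__main__":
    # Saturation: the gradient magnitude collapses as the logit gap grows.
    for gap in (1.0, 5.0, 10.0):
        print(gap, abs(target_logit_grad([gap, 0.0], 0)))

    # A larger (assumed) cloud size keeps a saturated sample more active.
    rng = random.Random(0)
    for cloud_size in (0.0, 1.0, 3.0):
        z = clouded_logits([10.0, 0.0], 0, cloud_size, rng)
        print(cloud_size, abs(target_logit_grad(z, 0)))
```

Because the assumed perturbation never raises the target logit, the gradient magnitude on a perturbed sample is at least as large as on the unperturbed one, which is the sense in which a larger cloud reduces saturation in this sketch.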

Abstract (original)

Long-tailed data is still a big challenge for deep neural networks, even though they have achieved great success on balanced data. We observe that vanilla training on long-tailed data with cross-entropy loss makes the instance-rich head classes severely squeeze the spatial distribution of the tail classes, which leads to difficulty in classifying tail class samples. Furthermore, the original cross-entropy loss can only propagate gradient short-lively because the gradient in softmax form rapidly approaches zero as the logit difference increases. This phenomenon is called softmax saturation. It is unfavorable for training on balanced data, but can be utilized to adjust the validity of the samples in long-tailed data, thereby solving the distorted embedding space of long-tailed problems. To this end, this paper proposes the Gaussian clouded logit adjustment by Gaussian perturbation of different class logits with varied amplitude. We define the amplitude of perturbation as cloud size and set relatively large cloud sizes to tail classes. The large cloud size can reduce the softmax saturation and thereby making tail class samples more active as well as enlarging the embedding space. To alleviate the bias in a classifier, we therefore propose the class-based effective number sampling strategy with classifier re-training. Extensive experiments on benchmark datasets validate the superior performance of the proposed method. Source code is available at https://github.com/Keke921/GCLLoss.
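
The class-based effective number sampling strategy is only named in the abstract, so the following is a hedged sketch of one plausible reading rather than the authors' implementation. It assumes the commonly used effective-number formula E_n = (1 - beta^n) / (1 - beta) and assumes that a class is drawn with probability proportional to 1 / E_n before an instance is picked uniformly within that class. The function names, the `beta` default, and the toy class sizes are hypothetical.

```python
import random


def effective_number(n, beta=0.999):
    """Effective number of samples for a class with n instances (assumed formula)."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_sampling_probs(class_counts, beta=0.999):
    """Per-class sampling probabilities, proportional to 1 / effective number."""
    weights = [1.0 / effective_number(n, beta) for n in class_counts]
    total = sum(weights)
    return [w / total for w in weights]


def sample_batch(indices_by_class, batch_size, rng, beta=0.999):
    """Draw a class by the probabilities above, then an instance uniformly within it."""
    counts = [len(ix) for ix in indices_by_class]
    probs = class_sampling_probs(counts, beta)
    classes = rng.choices(range(len(counts)), weights=probs, k=batch_size)
    return [rng.choice(indices_by_class[c]) for c in classes]


if __name__ == "__main__":
    # Toy long-tailed split: 1000 head, 100 medium, 10 tail instances.
    pools = [list(range(0, 1000)), list(range(1000, 1100)), list(range(1100, 1110))]
    print(class_sampling_probs([len(p) for p in pools]))
    print(sample_batch(pools, 8, random.Random(0)))
```

Under these assumptions tail classes are drawn more often than head classes, which is the kind of rebalancing a classifier re-training stage would rely on.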

arXiv information

Authors	Mengke Li, Yiu-ming Cheung, Yang Lu
Published	2023-05-19 15:11:06+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
