Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters

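Summary

Supervised contrastive loss (SCL) is a competitive, and often superior, alternative to cross-entropy (CE) loss for classification.
In this paper we ask how the learning process differs when each of these two loss functions is optimized.
Our main finding is that the embeddings learned by SCL form an orthogonal frame (OF), regardless of the number of training examples per class.
This contrasts with CE loss, which earlier work has shown learns embedding geometries that depend strongly on class sizes.
We reach this finding theoretically, by proving that the global minimizers of an unconstrained features model with SCL loss and entry-wise non-negativity constraints form an OF.
We then validate the model's predictions through experiments with standard deep-learning models on benchmark vision datasets.
Finally, our analysis and experiments show that the batching scheme chosen during SCL training plays a critical role in determining the quality of convergence to the OF geometry.
This finding motivates a simple algorithm in which adding a few binding examples to each batch significantly speeds up the emergence of the OF geometry.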
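The OF claim can be made concrete with a small check. The sketch below is not taken from the paper; it assumes that "orthogonal frame" means the class-mean embeddings point in mutually orthogonal directions, and it measures how far the Gram matrix of the unit-normalized class means is from the identity. The function names (`class_means`, `of_deviation`) are hypothetical, and the check deliberately ignores whether the means have equal norms.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def class_means(embeddings, labels):
    """Average the embeddings of each class; returns {label: mean vector}."""
    sums, counts = {}, {}
    for z, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(z))
        for j, x in enumerate(z):
            acc[j] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [x / counts[y] for x in s] for y, s in sums.items()}

def of_deviation(embeddings, labels):
    """Largest |entry| of G - I, where G is the Gram matrix of the
    unit-normalized class means (assumed reading of the OF geometry).
    A value near zero means the class-mean directions are orthogonal."""
    means = [normalize(m) for _, m in sorted(class_means(embeddings, labels).items())]
    dev = 0.0
    for a, ma in enumerate(means):
        for b, mb in enumerate(means):
            g = sum(x * y for x, y in zip(ma, mb))
            dev = max(dev, abs(g - (1.0 if a == b else 0.0)))
    return dev

# Imbalanced toy data: three examples of class 0, one example of class 1.
imbalanced = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
orthogonal_dev = of_deviation(imbalanced, [0, 0, 0, 1])

# Two class means at 120 degrees, as in a simplex-like arrangement.
simplex = [[1.0, 0.0], [-0.5, math.sqrt(3) / 2]]
simplex_dev = of_deviation(simplex, [0, 1])
```

In the first toy case the deviation is essentially zero even though the classes are imbalanced, which is the kind of geometry the summary attributes to SCL. In the second case the class means have cosine similarity -0.5, so the deviation is about 0.5.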

Abstract (original)

Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy (CE) loss for classification. In this paper we ask: what differences in the learning process occur when the two different loss functions are being optimized? To answer this question, our main finding is that the geometry of embeddings learned by SCL forms an orthogonal frame (OF) regardless of the number of training examples per class. This is in contrast to the CE loss, for which previous work has shown that it learns embeddings geometries that are highly dependent on the class sizes. We arrive at our finding theoretically, by proving that the global minimizers of an unconstrained features model with SCL loss and entry-wise non-negativity constraints form an OF. We then validate the model’s prediction by conducting experiments with standard deep-learning models on benchmark vision datasets. Finally, our analysis and experiments reveal that the batching scheme chosen during SCL training plays a critical role in determining the quality of convergence to the OF geometry. This finding motivates a simple algorithm wherein the addition of a few binding examples in each batch significantly speeds up the occurrence of the OF geometry.
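The abstract does not define binding examples, so the following is only an illustrative sketch under a stated assumption: that binding examples are a small fixed set containing one example from every class, appended to each mini-batch so that every batch contains all classes. The helper names (`pick_binding_examples`, `batches_with_binding`) and the rule of taking the first example seen per class are hypothetical choices, not the authors' algorithm.

```python
import random

def pick_binding_examples(labels):
    """Return one dataset index per class (here: the first index seen).
    This selection rule is an assumption made for illustration."""
    first = {}
    for i, y in enumerate(labels):
        first.setdefault(y, i)
    return [first[y] for y in sorted(first)]

def batches_with_binding(labels, batch_size, seed=0):
    """Yield index batches: a shuffled chunk of the remaining data
    followed by the same binding indices in every batch."""
    binding = pick_binding_examples(labels)
    reserved = set(binding)
    # Keep binding examples out of the shuffled pool to avoid duplicates.
    pool = [i for i in range(len(labels)) if i not in reserved]
    random.Random(seed).shuffle(pool)
    for start in range(0, len(pool), batch_size):
        yield pool[start:start + batch_size] + binding

# Heavily imbalanced toy labels: six examples of class 0, one each of 1 and 2.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 2]
toy_batches = list(batches_with_binding(toy_labels, batch_size=2))
```

With plain random batching on such imbalanced data, many batches would contain only class 0. Under the assumption above, every batch sees every class, which is one plausible way a batching scheme could affect convergence to the OF geometry.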

arXiv information

Authors	Ganesh Ramachandra Kini, Vala Vakilian, Tina Behnia, Jaidev Gill, Christos Thrampoulidis
Published	2023-06-13 17:55:39+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
