CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features

Summary

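Multimodal encoders such as CLIP excel at tasks like zero-shot image classification and cross-modal retrieval.
However, they require very large amounts of training data.
We propose canonical similarity analysis (CSA), which uses two unimodal encoders to replicate a multimodal encoder from limited data.
CSA maps unimodal features into a multimodal space and uses a new similarity score to retain only the multimodal information.
Because CSA involves only inference with the unimodal encoders and a cubic-complexity matrix decomposition, it eliminates the need for extensive GPU-based model training.
Experiments show that CSA outperforms CLIP while requiring $300,000\times$ fewer multimodal data pairs and $6\times$ fewer unimodal data for ImageNet classification and for detecting misinformative news captions.
CSA also surpasses the state-of-the-art method for mapping unimodal features to multimodal features.
Finally, we demonstrate CSA on modalities beyond image and text, paving the way for future modality pairs that have limited paired multimodal data but abundant unpaired unimodal data, such as lidar and text.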
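
The summary describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the matrix decomposition behaves like classical canonical correlation analysis (CCA) computed with an SVD, and it assumes a cosine similarity weighted by the canonical correlations; the summary does not spell out either choice. The names `fit_cca` and `similarity`, the `reg` parameter, and the input arrays (precomputed unimodal features, one row per paired sample) are hypothetical.

```python
import numpy as np


def _inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T


def fit_cca(X, Y, dim, reg=1e-3):
    """Fit linear maps from two unimodal feature sets into a shared space.

    X: (n, dx) features from encoder 1, Y: (n, dy) features from encoder 2,
    where row i of X and row i of Y describe the same paired sample.
    `reg` is a small ridge term (an assumption, added for numerical stability).
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    # The SVD of the whitened cross-covariance gives the canonical directions;
    # this decomposition is the cubic-cost step.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :dim]        # projection for modality 1
    B = Wy @ Vt.T[:, :dim]     # projection for modality 2
    return {"mx": mx, "my": my, "A": A, "B": B, "rho": s[:dim]}


def similarity(x, y, model):
    """Correlation-weighted cosine similarity in the shared space.

    The weighting by canonical correlations is an illustrative assumption,
    not necessarily the similarity score defined in the paper.
    """
    zx = (x - model["mx"]) @ model["A"]
    zy = (y - model["my"]) @ model["B"]
    num = np.sum(model["rho"] * zx * zy, axis=-1)
    den = np.linalg.norm(zx, axis=-1) * np.linalg.norm(zy, axis=-1)
    return num / den
```

In this sketch, no gradient-based training takes place: the only learned quantities are the two projection matrices and the canonical correlations, all obtained from a single decomposition of the paired features.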

Abstract (original)

Multimodal encoders like CLIP excel in tasks such as zero-shot image classification and cross-modal retrieval. However, they require excessive training data. We propose canonical similarity analysis (CSA), which uses two unimodal encoders to replicate multimodal encoders using limited data. CSA maps unimodal features into a multimodal space, using a new similarity score to retain only the multimodal information. CSA only involves the inference of unimodal encoders and a cubic-complexity matrix decomposition, eliminating the need for extensive GPU-based model training. Experiments show that CSA outperforms CLIP while requiring $300,000\times$ fewer multimodal data pairs and $6\times$ fewer unimodal data for ImageNet classification and misinformative news captions detection. CSA surpasses the state-of-the-art method to map unimodal features to multimodal features. We also demonstrate the ability of CSA with modalities beyond image and text, paving the way for future modality pairs with limited paired multimodal data but abundant unpaired unimodal data, such as lidar and text.

arXiv information

Authors	Po-han Li, Sandeep P. Chinchali, Ufuk Topcu
Published	2024-11-25 17:01:53+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
