Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing

Summary

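Self-supervised frameworks for representation learning have recently attracted interest in the remote sensing community because they could reduce the high labeling costs of curating large satellite image datasets.
In multimodal data fusion, the commonly used contrastive learning methods help bridge the domain gap between different sensor types, but they rely on data augmentation techniques that require expertise and careful design, especially for multispectral remote sensing data.
A possible but little-studied way around these limitations is a pretraining strategy based on masked image modeling.
This paper introduces Fus-MAE, a self-supervised learning framework based on masked autoencoders. Fus-MAE uses cross-attention to perform early and feature-level data fusion between synthetic aperture radar (SAR) and multispectral optical data, two modalities separated by a significant domain gap.
The empirical findings show that Fus-MAE competes effectively with contrastive learning strategies tailored for SAR-optical data fusion and outperforms other masked autoencoder frameworks trained on a larger corpus.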

Abstract (original)

Self-supervised frameworks for representation learning have recently stirred up interest among the remote sensing community, given their potential to mitigate the high labeling costs associated with curating large satellite image datasets. In the realm of multimodal data fusion, while the often used contrastive learning methods can help bridging the domain gap between different sensor types, they rely on data augmentations techniques that require expertise and careful design, especially for multispectral remote sensing data. A possible but rather scarcely studied way to circumvent these limitations is to use a masked image modelling based pretraining strategy. In this paper, we introduce Fus-MAE, a self-supervised learning framework based on masked autoencoders that uses cross-attention to perform early and feature-level data fusion between synthetic aperture radar and multispectral optical data – two modalities with a significant domain gap. Our empirical findings demonstrate that Fus-MAE can effectively compete with contrastive learning strategies tailored for SAR-optical data fusion and outperforms other masked-autoencoders frameworks trained on a larger corpus.
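To make the two ingredients named in the abstract concrete, here is a minimal sketch of random token masking followed by single-head cross-attention, in which visible SAR tokens query visible optical tokens. It is an illustration only and not the authors' architecture: the function names, the token count, the embedding size, the 75% mask ratio, and the random projection matrices are all assumptions chosen for the example.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) list-of-lists by a (k x m) list-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_mask(num_tokens, mask_ratio, rng):
    """Return the sorted indices of tokens that stay visible after masking."""
    num_keep = max(1, int(round(num_tokens * (1 - mask_ratio))))
    return sorted(rng.sample(range(num_tokens), num_keep))


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: each query token attends over the context tokens."""
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
dim = 4          # hypothetical embedding size
num_tokens = 8   # hypothetical number of patches per modality


def rand_matrix(rows, cols):
    """Random Gaussian matrix standing in for learned weights or embeddings."""
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


# Stand-ins for patch embeddings of co-registered SAR and optical images.
sar_tokens = rand_matrix(num_tokens, dim)
optical_tokens = rand_matrix(num_tokens, dim)

# Mask each modality independently (75% ratio is an assumption).
keep_sar = random_mask(num_tokens, 0.75, rng)
keep_opt = random_mask(num_tokens, 0.75, rng)
visible_sar = [sar_tokens[i] for i in keep_sar]
visible_opt = [optical_tokens[i] for i in keep_opt]

# Random projections stand in for learned query/key/value weights.
w_q, w_k, w_v = (rand_matrix(dim, dim) for _ in range(3))

# SAR tokens act as queries; optical tokens provide keys and values.
fused, weights = cross_attention(visible_sar, visible_opt, w_q, w_k, w_v)
print(len(fused), len(fused[0]))
```

Each row of `weights` sums to 1 and describes how strongly one visible SAR token draws on each visible optical token. Swapping the two arguments gives the opposite direction, in which optical tokens query SAR tokens.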

arXiv information

Authors	Hugo Chan-To-Hing,Bharadwaj Veeravalli
Published	2024-01-05 11:36:21+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
