Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

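Summary

As the deep learning revolution continues, self-supervised learning has drawn growing attention in recent years thanks to its strong representation learning ability and its low dependence on labeled data.
Among the many self-supervised methods, masked modeling has emerged as a distinctive approach in which the model predicts the parts of the original data that are masked, at a set proportion, during training.
This paradigm lets deep models learn robust representations and has shown excellent performance in computer vision, natural language processing, and other modalities.
This survey presents a comprehensive review of the masked modeling framework and its methodology.
It details the techniques used within masked modeling, including the various masking strategies, recovery targets, and network architectures.
It then systematically examines the wide range of applications across domains.
It also explores the commonalities and differences between masked modeling methods in different fields.
The paper closes by discussing the limitations of current techniques and pointing out several potential directions for advancing masked modeling research.
A paper list project accompanying this survey is available at https://github.com/Lupin1998/Awesome-MIM.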
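
As a rough illustration of the masking idea described above (hide a fixed proportion of the input, then score the model only on the hidden parts), the following minimal sketch uses plain Python. It is not taken from the survey: the helper names `random_mask`, `apply_mask`, and `masked_mse`, the 0.75 mask ratio, the zero fill value, and the toy token values are all assumptions chosen for illustration, and uniform random masking is only one of the masking strategies the survey covers.

```python
import random


def random_mask(num_tokens, mask_ratio, seed=None):
    """Pick which token indices to hide (hypothetical helper, uniform random masking)."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(round(num_tokens * mask_ratio))
    # Sample without replacement so exactly num_masked distinct positions are hidden.
    return sorted(rng.sample(range(num_tokens), num_masked))


def apply_mask(tokens, masked_idx, mask_value=0.0):
    """Replace the hidden positions with a placeholder value (assumed to be 0.0 here)."""
    hidden = set(masked_idx)
    return [mask_value if i in hidden else t for i, t in enumerate(tokens)]


def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed over the masked positions only."""
    if not masked_idx:
        return 0.0
    total = sum((prediction[i] - target[i]) ** 2 for i in masked_idx)
    return total / len(masked_idx)


# Toy "patch" values standing in for real image patches or word embeddings.
tokens = [0.1 * i for i in range(16)]
masked_idx = random_mask(len(tokens), mask_ratio=0.75, seed=0)
corrupted = apply_mask(tokens, masked_idx)

# A stand-in "model" that simply echoes its corrupted input; a real model
# would be trained to drive this loss down.
loss = masked_mse(corrupted, tokens, masked_idx)
```

Restricting the loss to masked positions is a common design choice because the visible positions are already given to the model; whether a particular method also reconstructs visible tokens, or predicts features rather than raw values, varies from method to method.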

Abstract (original)

As the deep learning revolution marches on, self-supervised learning has garnered increasing attention in recent years thanks to its remarkable representation learning ability and the low dependence on labeled data. Among these varied self-supervised techniques, masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training. This paradigm enables deep models to learn robust representations and has demonstrated exceptional performance in the context of computer vision, natural language processing, and other modalities. In this survey, we present a comprehensive review of the masked modeling framework and its methodology. We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more. Then, we systematically investigate its wide-ranging applications across domains. Furthermore, we also explore the commonalities and differences between masked modeling methods in different fields. Toward the end of this paper, we conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research. A paper list project with this survey is available at https://github.com/Lupin1998/Awesome-MIM.

arXiv information

Authors	Siyuan Li,Luyuan Zhang,Zedong Wang,Di Wu,Lirong Wu,Zicheng Liu,Jun Xia,Cheng Tan,Yang Liu,Baigui Sun,Stan Z. Li
Published	2024-01-09 16:09:47+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
