Sparse Fusion Mixture-of-Experts are Domain Generalizable Learners

Abstract

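Domain generalization (DG) aims to learn models that generalize under distribution shift, so as to avoid redundantly overfitting massive training data. Previous work relying on complex loss designs and gradient constraints has not yet achieved empirical success on large-scale benchmarks. In this work, we reveal the generalizability of mixture-of-experts (MoE) models on DG by exploiting their ability to handle multiple aspects of the predictive features across domains in a distributed manner. To this end, we propose Sparse Fusion Mixture-of-Experts (SF-MoE), which incorporates sparsity and fusion mechanisms into the MoE framework to keep the model both sparse and predictive. SF-MoE has two dedicated modules: 1) a sparse block and 2) a fusion block, which respectively disentangle and aggregate the diverse learned signals of an object. Extensive experiments demonstrate that SF-MoE is a domain-generalizable learner on large-scale benchmarks. It outperforms state-of-the-art counterparts by more than 2% across five large-scale DG datasets (e.g., DomainNet), at the same or even lower computational cost. We further reveal the internal mechanism of SF-MoE from the perspective of distributed representations (e.g., visual attributes). We hope this framework will facilitate future research that brings generalizable object recognition to the real world. Code and models are available at https://github.com/Luodian/SF-MoE-DG.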

Abstract (original)

Domain generalization (DG) aims at learning generalizable models under distribution shifts to avoid redundantly overfitting massive training data. Previous works with complex loss design and gradient constraint have not yet led to empirical success on large-scale benchmarks. In this work, we reveal the mixture-of-experts (MoE) model’s generalizability on DG by leveraging to distributively handle multiple aspects of the predictive features across domains. To this end, we propose Sparse Fusion Mixture-of-Experts (SF-MoE), which incorporates sparsity and fusion mechanisms into the MoE framework to keep the model both sparse and predictive. SF-MoE has two dedicated modules: 1) sparse block and 2) fusion block, which disentangle and aggregate the diverse learned signals of an object, respectively. Extensive experiments demonstrate that SF-MoE is a domain-generalizable learner on large-scale benchmarks. It outperforms state-of-the-art counterparts by more than 2% across 5 large-scale DG datasets (e.g., DomainNet), with the same or even lower computational costs. We further reveal the internal mechanism of SF-MoE from distributed representation perspective (e.g., visual attributes). We hope this framework could facilitate future research to push generalizable object recognition to the real world. Code and models are released at https://github.com/Luodian/SF-MoE-DG.
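The abstract does not spell out how the sparse and fusion blocks work, so the following is only a minimal sketch of generic top-k sparse expert routing, the common building block of sparse MoE layers. It is not the SF-MoE implementation: the function names (`top_k_route`, `sparse_moe`), the toy experts, and the choice of softmax over the selected logits are all assumptions made for illustration. Gate logits are passed in directly here; in a real layer they would come from a learned router.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_logits, k):
    """Pick the k experts with the largest gate logits.

    Returns (expert_index, weight) pairs; weights are a softmax over
    the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))

def sparse_moe(x, experts, gate_logits, k=2):
    """Evaluate only the k routed experts and sum their weighted outputs."""
    out = [0.0] * len(x)
    for idx, w in top_k_route(gate_logits, k):
        y = experts[idx](x)  # experts that are not selected are never run
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

# Hypothetical toy experts: each maps a feature vector to a same-sized vector.
experts = [
    lambda v: [2.0 * t for t in v],
    lambda v: [t + 1.0 for t in v],
    lambda v: [-t for t in v],
    lambda v: [0.0 for _ in v],
]

routes = top_k_route([0.0, 2.0, 2.0, -3.0], k=2)
output = sparse_moe([1.0, 2.0], experts, [0.0, 2.0, 2.0, -3.0], k=2)
```

Because only `k` of the experts run per input, compute grows with `k` rather than with the total number of experts, which is the usual sense in which such a layer is called sparse.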

arXiv information

Authors	Bo Li, Jingkang Yang, Jiawei Ren, Yezhen Wang, Ziwei Liu
Published	2022-06-08 17:59:57+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
