MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing

Summary

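Current multi-subject customization approaches face two critical challenges: diverse multi-subject training data is difficult to obtain, and attributes become entangled across different subjects. To close these gaps, we propose MUSAR, a simple yet effective framework that achieves robust multi-subject customization while requiring only single-subject training data. First, to overcome the data limitation, we introduce debiased diptych learning, which constructs diptych training pairs from single-subject images to facilitate multi-subject learning, while actively correcting the distribution bias introduced by diptych construction through static attention routing and dual-branch LoRA. Second, to eliminate cross-subject entanglement, we introduce a dynamic attention routing mechanism that adaptively establishes bijective mappings between generated images and conditional subjects. This design not only decouples multi-subject representations but also maintains scalable generalization as the number of reference subjects increases. Comprehensive experiments demonstrate that MUSAR, despite requiring only a single-subject dataset, outperforms existing methods (including those trained on multi-subject datasets) in image quality, subject consistency, and interaction naturalness.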

Abstract (original)

Current multi-subject customization approaches encounter two critical challenges: the difficulty in acquiring diverse multi-subject training data, and attribute entanglement across different subjects. To bridge these gaps, we propose MUSAR – a simple yet effective framework to achieve robust multi-subject customization while requiring only single-subject training data. Firstly, to break the data limitation, we introduce debiased diptych learning. It constructs diptych training pairs from single-subject images to facilitate multi-subject learning, while actively correcting the distribution bias introduced by diptych construction via static attention routing and dual-branch LoRA. Secondly, to eliminate cross-subject entanglement, we introduce dynamic attention routing mechanism, which adaptively establishes bijective mappings between generated images and conditional subjects. This design not only achieves decoupling of multi-subject representations but also maintains scalable generalization performance with increasing reference subjects. Comprehensive experiments demonstrate that our MUSAR outperforms existing methods – even those trained on multi-subject dataset – in image quality, subject consistency, and interaction naturalness, despite requiring only single-subject dataset.
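
The abstract names static attention routing but does not spell out how it works. As a rough illustration only, the sketch below shows one plausible reading: a fixed block mask over a token sequence in which each half of a diptych may attend to itself and to its own reference subject, but not to the other half or the other reference. The token layout, the routing rule, and the names `static_routing_mask` and `masked_attention` are all assumptions made for this example; they are not taken from the paper or its code.

```python
import numpy as np

def static_routing_mask(n_left, n_right, n_ref_a, n_ref_b):
    """Boolean attention mask (True = query row may attend to key column).

    Assumed token layout: [left half | right half | reference A | reference B].
    Assumed routing rule (hypothetical, for illustration only):
    every group attends to itself, the left half also attends to
    reference A, and the right half also attends to reference B.
    """
    sizes = [n_left, n_right, n_ref_a, n_ref_b]
    starts = np.cumsum([0] + sizes[:-1])
    total = sum(sizes)
    mask = np.zeros((total, total), dtype=bool)

    def block(i):
        return slice(int(starts[i]), int(starts[i]) + sizes[i])

    LEFT, RIGHT, REF_A, REF_B = range(4)
    for g in (LEFT, RIGHT, REF_A, REF_B):
        mask[block(g), block(g)] = True       # self-attention within a group
    mask[block(LEFT), block(REF_A)] = True    # left half reads reference A
    mask[block(RIGHT), block(REF_B)] = True   # right half reads reference B
    return mask

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention with a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)  # blocked pairs get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = static_routing_mask(2, 2, 3, 3)
    x = rng.normal(size=(mask.shape[0], 8))
    out, weights = masked_attention(x, x, x, mask)
    print(out.shape, weights[0].round(2))
```

A fixed mask like this is cheap because it depends only on token positions, not on image content. The dynamic routing described in the abstract would instead have to be computed from the attention itself, which this sketch does not attempt.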

arXiv information

Authors	Zinan Guo, Pengze Zhang, Yanze Wu, Chong Mou, Songtao Zhao, Qian He
Published	2025-05-05 17:50:24+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
