StyleHumanCLIP: Text-guided Garment Manipulation for StyleGAN-Human

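Summary

This paper addresses text-guided control of StyleGAN for editing garments in full-body human images.
Existing StyleGAN-based methods struggle to handle the wide variety of garments, body shapes, and poses.
We propose a framework for text-guided full-body human image synthesis via an attention-based latent code mapper, which enables more disentangled control of StyleGAN than existing mappers.
Our latent code mapper uses an attention mechanism that adaptively manipulates the individual latent codes on different StyleGAN layers under text guidance.
In addition, we introduce feature-space masking at inference time to avoid unwanted changes caused by the text input.
Quantitative and qualitative evaluations show that our method controls generated images more faithfully to the given text than existing methods.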
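
To make the idea of an attention-based latent code mapper more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture, which the summary does not specify. It assumes that each StyleGAN layer's latent code acts as a query, that the text is given as a list of token embeddings serving as keys and values, and that there are no learned projections. The names `attention_mapper`, `latents`, `text_tokens`, and `strength` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_mapper(latents, text_tokens, strength=1.0):
    """Illustrative only: shift each per-layer latent code by a text-dependent offset.

    latents:     list of latent codes, one per generator layer (assumption)
    text_tokens: list of text token embeddings with the same dimension (assumption)
    Each layer attends to the text tokens separately, so different layers
    can receive different offsets.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    edited = []
    for w in latents:
        # Scaled dot-product attention weights of this layer over the text tokens.
        weights = softmax([dot(w, t) * scale for t in text_tokens])
        # The offset is the attention-weighted sum of the token embeddings.
        offset = [
            sum(a * t[k] for a, t in zip(weights, text_tokens))
            for k in range(dim)
        ]
        edited.append([wk + strength * ok for wk, ok in zip(w, offset)])
    return edited
```

Setting `strength=0.0` returns the latent codes unchanged, which is a convenient sanity check. A trained mapper would add learned query, key, and value projections on top of this skeleton.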

Abstract (original)

This paper tackles text-guided control of StyleGAN for editing garments in full-body human images. Existing StyleGAN-based methods suffer from handling the rich diversity of garments and body shapes and poses. We propose a framework for text-guided full-body human image synthesis via an attention-based latent code mapper, which enables more disentangled control of StyleGAN than existing mappers. Our latent code mapper adopts an attention mechanism that adaptively manipulates individual latent codes on different StyleGAN layers under text guidance. In addition, we introduce feature-space masking at inference time to avoid unwanted changes caused by text inputs. Our quantitative and qualitative evaluations reveal that our method can control generated images more faithfully to given texts than existing methods.
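
The feature-space masking mentioned in the abstract can be pictured as blending two intermediate feature maps with a spatial mask. The sketch below is a guess at the general idea only. The abstract does not say how the mask is obtained or at which generator layer the blend happens, so the mask here is simply assumed to be given, with 1 marking the region to edit. The function name `blend_features` is hypothetical.

```python
def blend_features(original, edited, mask):
    """Illustrative only: element-wise blend of two H x W feature maps.

    original: feature map generated from the unedited latent codes (assumption)
    edited:   feature map generated from the text-edited latent codes (assumption)
    mask:     values in [0, 1]; 1 keeps the edited feature, 0 keeps the original
    """
    return [
        [m * e + (1.0 - m) * o for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, edited, mask)
    ]
```

With a binary mask, positions outside the masked region keep the original features exactly, which is how a blend of this kind would suppress unwanted changes elsewhere in the image.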

arXiv information

Authors	Takato Yoshikawa,Yuki Endo,Yoshihiro Kanamori
Published	2023-05-26 09:21:56+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
