Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Summary

Title: Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Summary:

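– Designers use fashion illustration to communicate their vision and to carry a design idea from conceptualization to realization, showing how clothes interact with the human body.
– In this context, computer vision can be used to improve the fashion design process. Unlike previous work, which focused mainly on the virtual try-on of garments, the authors propose the task of multimodal-conditioned fashion image editing: guiding the generation of human-centric fashion images by following multimodal prompts such as text, human body poses, and garment sketches.
– They tackle this problem by proposing a new architecture based on latent diffusion models, an approach that had not previously been used in the fashion domain.
– Because no existing dataset is suitable for the task, they extend two existing fashion datasets, Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner.
– Experimental results on these new datasets demonstrate the effectiveness of the proposal, in terms of both realism and coherence with the given multimodal inputs.
– Source code and the collected multimodal annotations will be publicly released at https://github.com/aimagelab/multimodal-garment-designer.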

Abstract (original)

Fashion illustration is used by designers to communicate their vision and to bring the design idea from conceptualization to realization, showing how clothes interact with the human body. In this context, computer vision can thus be used to improve the fashion design process. Differently from previous works that mainly focused on the virtual try-on of garments, we propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images by following multimodal prompts, such as text, human body poses, and garment sketches. We tackle this problem by proposing a new architecture based on latent diffusion models, an approach that has not been used before in the fashion domain. Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets, namely Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner. Experimental results on these new datasets demonstrate the effectiveness of our proposal, both in terms of realism and coherence with the given multimodal inputs. Source code and collected multimodal annotations will be publicly released at: https://github.com/aimagelab/multimodal-garment-designer.

arXiv information

Authors	Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
Published	2023-04-04 18:03:04+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
