PALP: Prompt Aligned Personalization of Text-to-Image Models

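Abstract

Content creators often want to create personalized images of their own subjects, which goes beyond what conventional text-to-image models can do.
They may also want the resulting image to include a specific location, style, ambiance, and more.
Existing personalization methods may compromise either personalization ability or alignment with complex text prompts.
This trade-off can impede both the fulfillment of the user's prompt and subject fidelity.
To address this issue, we propose a new approach that focuses on personalization for a single prompt.
We term this approach prompt-aligned personalization.
While this may seem restrictive, our method excels at improving text alignment, enabling the creation of images from complex and intricate prompts that may pose a challenge for current techniques.
In particular, our method keeps the personalized model aligned with the target prompt using an additional score distillation sampling term.
We demonstrate the versatility of our method in multi-shot and single-shot settings, and further show that it can compose multiple subjects or draw inspiration from reference images such as artworks.
We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.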

Abstract (original)

Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise personalization ability or the alignment to complex textual prompts. This trade-off can impede the fulfillment of user prompts and subject fidelity. We propose a new approach focusing on personalization methods for a \emph{single} prompt to address this issue. We term our approach prompt-aligned personalization. While this may seem restrictive, our method excels in improving text alignment, enabling the creation of images with complex and intricate prompts, which may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or use inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.

arXiv information

Authors	Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir
Published	2024-01-11 18:35:33+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
