FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers

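Summary

Rectified flow models have become a leading approach to image generation and produce high-quality synthesized images.
Despite this strength in visual generation, they often struggle with disentangled image editing.
As a result, precise attribute-specific changes cannot be made without affecting unrelated aspects of the image.
This paper introduces FluxSpace, a domain-agnostic image editing method that uses a representation space capable of controlling the semantics of images generated by rectified flow transformers such as Flux.
Building on the representations learned by the transformer blocks inside rectified flow models, the authors propose a set of semantically interpretable representations that support a wide range of editing tasks, from fine-grained image editing to artistic creation.
The work offers a scalable and effective approach to image editing with disentanglement capabilities.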

Abstract (original)

Rectified flow models have emerged as a dominant approach in image generation, showcasing impressive capabilities in high-quality image synthesis. However, despite their effectiveness in visual generation, rectified flow models often struggle with disentangled editing of images. This limitation prevents the ability to perform precise, attribute-specific modifications without affecting unrelated aspects of the image. In this paper, we introduce FluxSpace, a domain-agnostic image editing method leveraging a representation space with the ability to control the semantics of images generated by rectified flow transformers, such as Flux. By leveraging the representations learned by the transformer blocks within the rectified flow models, we propose a set of semantically interpretable representations that enable a wide range of image editing tasks, from fine-grained image editing to artistic creation. This work offers a scalable and effective image editing approach, along with its disentanglement capabilities.

arXiv information

Authors	Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Published	2024-12-12 18:59:40+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
