KV-Edit: Training-Free Image Editing for Precise Background Preservation

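Abstract

Background consistency remains a significant challenge in image editing tasks.
Despite extensive development, existing methods still face a trade-off between maintaining similarity to the original image and generating content that aligns with the target.
Here, we propose KV-Edit, a training-free approach that uses the KV cache in DiTs to maintain background consistency. Background tokens are preserved rather than regenerated, which eliminates the need for complex mechanisms or expensive training, and the new content is generated so that it integrates seamlessly with the background within user-provided regions.
We further explore the memory consumption of the KV cache during editing and optimize the space complexity to $O(1)$ using an inversion-free method.
Our approach is compatible with any DiT-based generative model without additional training.
Experiments demonstrate that KV-Edit significantly outperforms existing approaches in terms of both background and image quality.
The project webpage is available at https://xilluill.github.io/projectpages/KV-Edit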

Abstract (original)

Background consistency remains a significant challenge in image editing tasks. Despite extensive developments, existing works still face a trade-off between maintaining similarity to the original image and generating content that aligns with the target. Here, we propose KV-Edit, a training-free approach that uses KV cache in DiTs to maintain background consistency, where background tokens are preserved rather than regenerated, eliminating the need for complex mechanisms or expensive training, ultimately generating new content that seamlessly integrates with the background within user-provided regions. We further explore the memory consumption of the KV cache during editing and optimize the space complexity to $O(1)$ using an inversion-free method. Our approach is compatible with any DiT-based generative model without additional training. Experiments demonstrate that KV-Edit significantly outperforms existing approaches in terms of both background and image quality, even surpassing training-based methods. Project webpage is available at https://xilluill.github.io/projectpages/KV-Edit
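The core idea in the abstract, preserving background tokens through a KV cache rather than regenerating them, can be illustrated with a toy single-head attention step. This is only a rough sketch under stated assumptions, not the authors' implementation: the names `attend`, `edit_step`, and `project`, the use of plain Python lists, and the identity projection are all hypothetical, and a real DiT would cache keys and values per layer and per timestep.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Single-head scaled dot-product attention over plain lists."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out

def edit_step(tokens, mask, cached_k, cached_v, project):
    """Hypothetical attention step that updates only masked (foreground) tokens.

    tokens:   list of token vectors for the whole image
    mask:     True where the user wants to edit (foreground)
    cached_k: background keys saved earlier, indexed by token position
    cached_v: background values saved earlier, indexed by token position
    project:  assumed function mapping a token to its (query, key, value)
    """
    fg = [i for i, m in enumerate(mask) if m]
    bg = [i for i, m in enumerate(mask) if not m]
    if not fg:
        return list(tokens)  # nothing to edit

    fg_q, fg_k, fg_v = zip(*(project(tokens[i]) for i in fg))

    # Queries come from the foreground only; keys and values combine the
    # cached background entries with freshly computed foreground entries.
    keys = [cached_k[i] for i in bg] + list(fg_k)
    values = [cached_v[i] for i in bg] + list(fg_v)
    updated = attend(fg_q, keys, values)

    # Background tokens are copied through unchanged.
    result = list(tokens)
    for i, new in zip(fg, updated):
        result[i] = new
    return result

# Toy usage: tokens 1 and 2 are the editable region, 0 and 3 are background.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
mask = [False, True, True, False]
identity = lambda t: (t, t, t)  # stand-in for learned Q/K/V projections
cached_k = {i: tokens[i] for i in (0, 3)}
cached_v = {i: tokens[i] for i in (0, 3)}
edited = edit_step(tokens, mask, cached_k, cached_v, identity)
```

In this sketch the foreground tokens still attend to the background through the cached keys and values, which is what lets new content blend with its surroundings, while the background positions are never rewritten.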

arXiv information

Authors	Tianrui Zhu, Shiyi Zhang, Jiawei Shao, Yansong Tang
Published	2025-02-24 17:40:09+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
