RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model

Abstract

The increasing sophistication of text-to-image generative models has led to complex challenges in defining and enforcing copyright infringement criteria and protection. Existing methods, such as watermarking and dataset deduplication, fail to provide comprehensive solutions because standardized metrics are lacking and copyright infringement in diffusion models is inherently complex to address. To deal with these challenges, we propose a Reinforcement Learning-based Copyright Protection (RLCP) method for text-to-image diffusion models, which minimizes the generation of copyright-infringing content while maintaining the quality of the model-generated dataset. Our approach begins by introducing a novel copyright metric grounded in copyright law and court precedents on infringement. We then use the Denoising Diffusion Policy Optimization (DDPO) framework to guide the model through a multi-step decision-making process, optimizing it with a reward function that incorporates the proposed copyright metric. Additionally, we employ KL divergence as a regularization term to mitigate some failure modes and stabilize RL fine-tuning. Experiments conducted on three mixed datasets of copyrighted and non-copyrighted images demonstrate that our approach significantly reduces copyright infringement risk while maintaining image quality.
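
The abstract does not state the exact form of the reward, so the following is only a minimal sketch of how a copyright score and a KL penalty could be combined into a per-sample reward for policy-gradient fine-tuning. The function names, the assumption that the copyright score lies in [0, 1] with higher meaning closer to a copyrighted image, the weight `beta`, and the omission of any image-quality term are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def kl_regularized_rewards(copyright_scores, logp_policy, logp_reference, beta=0.1):
    """Return one scalar reward per sampled image (hypothetical formulation).

    copyright_scores: assumed scores in [0, 1]; higher = more similar to
        a copyrighted image (stand-in for the paper's copyright metric).
    logp_policy / logp_reference: for each sample, the per-step log-probabilities
        of the sampled denoising steps under the fine-tuned and the frozen
        reference model.
    beta: assumed weight of the KL regularizer.
    """
    rewards = []
    for score, lp_new, lp_ref in zip(copyright_scores, logp_policy, logp_reference):
        # Single-sample Monte Carlo estimate of KL(policy || reference)
        # accumulated over the denoising trajectory.
        kl_estimate = sum(a - b for a, b in zip(lp_new, lp_ref))
        # Penalize copyright similarity and drift from the reference model.
        rewards.append(-score - beta * kl_estimate)
    return rewards


def normalized_advantages(rewards, eps=1e-8):
    """Standardize rewards within a batch, a common variance-reduction step."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    scores = [0.9, 0.1]                       # toy copyright scores
    lp_new = [[-1.0, -1.0], [-1.0, -1.0]]     # toy log-probs, fine-tuned model
    lp_ref = [[-1.5, -1.5], [-1.0, -1.0]]     # toy log-probs, reference model
    r = kl_regularized_rewards(scores, lp_new, lp_ref, beta=0.1)
    print(r, normalized_advantages(r))
```

In a DDPO-style update, each advantage would multiply the log-probability of the corresponding denoising trajectory, so samples with a low copyright score and little drift from the reference model are reinforced.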

arXiv information

Authors	Zhuan Shi, Jing Yan, Xiaoli Tang, Lingjuan Lyu, Boi Faltings
Published	2024-08-29 15:39:33+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
