Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models

Abstract

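Integrating watermarking into the generation process of latent diffusion models (LDMs) simplifies the detection and attribution of generated content.
Semantic watermarks, such as Tree-Rings and Gaussian Shading, represent a new class of watermarking techniques that are easy to implement and highly robust to a variety of perturbations.
However, our work reveals a fundamental security vulnerability in semantic watermarks.
We show that attackers can leverage unrelated models, even ones with different latent spaces and architectures (UNet vs. DiT), to carry out powerful and realistic forgery attacks.
Specifically, we design two watermark forgery attacks.
The first imprints a targeted watermark into real images by manipulating the latent representation of an arbitrary image in an unrelated LDM so that it moves closer to the latent representation of a watermarked image.
We also show that this technique can be used for watermark removal.
The second attack generates new images carrying the target watermark by inverting a watermarked image and regenerating it with an arbitrary prompt.
Both attacks need only a single reference image with the target watermark.
Overall, our findings call the applicability of semantic watermarks into question by showing that attackers can easily forge or remove these watermarks under realistic conditions.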

Abstract (original)

Integrating watermarking into the generation process of latent diffusion models (LDMs) simplifies detection and attribution of generated content. Semantic watermarks, such as Tree-Rings and Gaussian Shading, represent a novel class of watermarking techniques that are easy to implement and highly robust against various perturbations. However, our work demonstrates a fundamental security vulnerability of semantic watermarks. We show that attackers can leverage unrelated models, even with different latent spaces and architectures (UNet vs DiT), to perform powerful and realistic forgery attacks. Specifically, we design two watermark forgery attacks. The first imprints a targeted watermark into real images by manipulating the latent representation of an arbitrary image in an unrelated LDM to get closer to the latent representation of a watermarked image. We also show that this technique can be used for watermark removal. The second attack generates new images with the target watermark by inverting a watermarked image and re-generating it with an arbitrary prompt. Both attacks just need a single reference image with the target watermark. Overall, our findings question the applicability of semantic watermarks by revealing that attackers can easily forge or remove these watermarks under realistic conditions.
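
The first attack described above can be pictured as a small optimization problem: adjust an image until its latent, as seen by a proxy model, matches the latent of the watermarked reference. The following is a toy sketch of that idea only, not the authors' implementation. It assumes a fixed linear map (`encode_matrix`, hypothetical) as a stand-in for the proxy LDM's encoder and inversion, and the function name `imprint` is likewise made up for illustration.

```python
import numpy as np

def imprint(cover, target, encode_matrix, steps=500):
    """Toy sketch: nudge `cover` so its proxy latent approaches that of `target`.

    ASSUMPTION: `encode_matrix` is a linear stand-in for a proxy model's
    image-to-latent mapping; a real attack would use an actual LDM.
    """
    E = np.asarray(encode_matrix, dtype=float)
    x = np.asarray(cover, dtype=float).copy()
    z_target = E @ np.asarray(target, dtype=float)  # latent of the reference image

    # Step size 1/L, where L is the largest eigenvalue of E^T E.
    lr = 1.0 / np.linalg.norm(E, 2) ** 2

    for _ in range(steps):
        residual = E @ x - z_target   # mismatch in latent space
        x -= lr * (E.T @ residual)    # gradient of 0.5 * ||E x - z_target||^2
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    E = rng.normal(size=(8, 64))      # hypothetical 64-pixel image, 8-dim latent
    cover = rng.normal(size=64)
    target = rng.normal(size=64)
    forged = imprint(cover, target, E)
    print("latent gap before:", np.linalg.norm(E @ cover - E @ target))
    print("latent gap after: ", np.linalg.norm(E @ forged - E @ target))
```

In this linear toy setting, gradient descent started from the cover image moves it only as far as needed to match the target latent, so the result stays at least as close to the cover as the reference image is. Whether anything comparable holds for a real, nonlinear LDM is not something this sketch establishes.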

arXiv information

Authors	Andreas Müller, Denis Lukovnikov, Jonas Thietke, Asja Fischer, Erwin Quiring
Published	2025-03-26 15:10:26+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
