Towards Understanding Unsafe Video Generation

Summary

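Video generation models (VGMs) can now synthesize high-quality output.
That capability makes it important to understand their potential to produce unsafe content, such as violent or terrifying videos.
This work provides a comprehensive study of unsafe video generation.
First, to confirm that these models can in fact generate unsafe videos, we take unsafe-content prompts collected from 4chan and Lexica and use three open-source state-of-the-art (SOTA) VGMs to generate videos from them.
After filtering out duplicates and poorly generated content, we obtain an initial set of 2,112 unsafe videos from an original pool of 5,607 videos.
Through clustering and thematic coding analysis of these generated videos, we identify five unsafe video categories: Distorted/Weird, Terrifying, Pornographic, Violent/Bloody, and Political.
With IRB approval, we then recruit online participants to help label the generated videos.
Based on the annotations submitted by 403 participants, we identify 937 unsafe videos in the initial video set.
Using the labels and the corresponding prompts, we build the first dataset of unsafe videos generated by VGMs.
We then study possible defense mechanisms for preventing unsafe video generation.
Existing defenses for image generation focus on filtering either the input prompt or the output.
We propose a new approach, Latent Variable Defense (LVD), which works inside the model's internal sampling process.
LVD achieves a defense accuracy of 0.90 while reducing time and computing resources by 10x when sampling a large number of unsafe prompts.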
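
To make the idea of a defense that acts inside the sampling process more concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not say how LVD inspects latent variables, so `denoise_step`, `unsafe_score`, `check_steps`, and `threshold` are hypothetical names, and the toy functions only stand in for a real model and detector. The sketch shows one general pattern: score an intermediate latent and stop sampling early when it looks unsafe, so the remaining steps are never computed.

```python
from typing import Callable, List, Optional, Sequence

Latent = List[float]  # stand-in for a real latent tensor


def sample_with_latent_check(
    denoise_step: Callable[[Latent, int], Latent],
    unsafe_score: Callable[[Latent, int], float],
    init_latent: Latent,
    num_steps: int,
    check_steps: Sequence[int],
    threshold: float = 0.5,
) -> Optional[Latent]:
    """Run the sampling loop; return None if a checked latent looks unsafe."""
    latent = init_latent
    for step in range(num_steps):
        latent = denoise_step(latent, step)
        # Hypothetical check: score the intermediate latent at chosen steps.
        if step in check_steps and unsafe_score(latent, step) >= threshold:
            return None  # stop early; later steps are skipped
    return latent


# Toy stand-ins (assumptions for illustration only).
steps_run: List[int] = []


def toy_denoise(latent: Latent, step: int) -> Latent:
    steps_run.append(step)  # record how many steps were actually computed
    return [0.5 * x for x in latent]


def toy_unsafe_score(latent: Latent, step: int) -> float:
    # Pretend a positive first coordinate marks unsafe content.
    return 1.0 if latent[0] > 0 else 0.0


blocked = sample_with_latent_check(
    toy_denoise, toy_unsafe_score, [1.0, -2.0], num_steps=50, check_steps=[9]
)
steps_for_blocked = len(steps_run)
```

In this toy run the flagged sample stops after the checked step instead of running the full loop. The step counts are arbitrary and are not meant to reproduce the reported 10x saving.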

Summary (original)

Video generation models (VGMs) have demonstrated the capability to synthesize high-quality output. It is important to understand their potential to produce unsafe content, such as violent or terrifying videos. In this work, we provide a comprehensive understanding of unsafe video generation. First, to confirm the possibility that these models could indeed generate unsafe videos, we choose unsafe content generation prompts collected from 4chan and Lexica, and three open-source SOTA VGMs to generate unsafe videos. After filtering out duplicates and poorly generated content, we created an initial set of 2112 unsafe videos from an original pool of 5607 videos. Through clustering and thematic coding analysis of these generated videos, we identify 5 unsafe video categories: Distorted/Weird, Terrifying, Pornographic, Violent/Bloody, and Political. With IRB approval, we then recruit online participants to help label the generated videos. Based on the annotations submitted by 403 participants, we identified 937 unsafe videos from the initial video set. With the labeled information and the corresponding prompts, we created the first dataset of unsafe videos generated by VGMs. We then study possible defense mechanisms to prevent the generation of unsafe videos. Existing defense methods in image generation focus on filtering either input prompt or output results. We propose a new approach called Latent Variable Defense (LVD), which works within the model’s internal sampling process. LVD can achieve 0.90 defense accuracy while reducing time and computing resources by 10x when sampling a large number of unsafe prompts.
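
The dataset counts in the abstract can be read as a two-stage funnel. The short calculation below uses only the three counts stated above; the variable names and the `share` helper are illustrative.

```python
# Counts as stated in the abstract.
generated = 5607       # original pool of generated videos
initial = 2112         # kept after removing duplicates and poorly generated content
labeled_unsafe = 937   # identified as unsafe from participant annotations


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


kept_after_filtering = share(initial, generated)
unsafe_within_initial = share(labeled_unsafe, initial)
unsafe_within_pool = share(labeled_unsafe, generated)
```

So a little over a third of the generated pool survives filtering, and somewhat under half of that initial set is labeled unsafe by participants.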

arXiv information

Authors	Yan Pang, Aiping Xiong, Yang Zhang, Tianhao Wang
Published	2024-07-17 14:07:22+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
