Generation and Detection of Sign Language Deepfakes – A Linguistic and Visual Analysis

Summary

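This research explores a positive application of deepfake technology to upper-body generation, specifically sign language for the Deaf and Hard of Hearing (DHoH) community.
Given the complexity of sign language and the scarcity of experts, the generated videos are vetted for accuracy by a sign language expert.
The authors build a reliable deepfake dataset and evaluate its technical and visual credibility with computer vision and natural language processing models.
The dataset, which contains over 1200 videos featuring both seen and unseen individuals, is also used to detect deepfake videos targeting vulnerable individuals.
Expert annotations confirm that the generated videos are comparable to real sign language content.
Linguistic analysis based on textual similarity scores and interpreter evaluations shows that interpretations of the generated videos are at least 90% similar to authentic sign language.
Visual analysis shows that convincingly realistic deepfakes can be produced even for new subjects.
A pose/style transfer model is used with close attention to detail so that hand movements are accurate and consistent with the driving video.
Machine learning algorithms are also applied to establish a deepfake detection baseline on this dataset, contributing to the detection of fraudulent sign language videos.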

Summary (original)

This research explores the positive application of deepfake technology for upper body generation, specifically sign language for the Deaf and Hard of Hearing (DHoH) community. Given the complexity of sign language and the scarcity of experts, the generated videos are vetted by a sign language expert for accuracy. We construct a reliable deepfake dataset, evaluating its technical and visual credibility using computer vision and natural language processing models. The dataset, consisting of over 1200 videos featuring both seen and unseen individuals, is also used to detect deepfake videos targeting vulnerable individuals. Expert annotations confirm that the generated videos are comparable to real sign language content. Linguistic analysis, using textual similarity scores and interpreter evaluations, shows that the interpretation of generated videos is at least 90% similar to authentic sign language. Visual analysis demonstrates that convincingly realistic deepfakes can be produced, even for new subjects. Using a pose/style transfer model, we pay close attention to detail, ensuring hand movements are accurate and align with the driving video. We also apply machine learning algorithms to establish a baseline for deepfake detection on this dataset, contributing to the detection of fraudulent sign language videos.
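The abstract reports textual similarity scores between interpretations of generated and authentic videos but does not name the metric. As an illustration only, the sketch below assumes a simple bag-of-words cosine similarity; the function names and the two example transcriptions are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    # Dot product over the tokens the two texts share.
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    # An empty text has no direction, so treat the similarity as 0.
    return dot / norm if norm else 0.0


# Hypothetical interpreter transcriptions of a real and a generated video.
real_interpretation = "my name is sara and i live in dubai"
fake_interpretation = "my name is sara and i live in dubai city"

score = cosine_similarity(real_interpretation, fake_interpretation)
print(f"similarity: {score:.2f}")
```

A score near 1.0 means the two interpretations use nearly the same words. A word-overlap measure like this ignores word order and synonyms, so an embedding-based measure could be substituted if semantic closeness matters more than exact wording.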

arXiv information

Authors	Shahzeb Naeem, Muhammad Riyyan Khan, Usman Tariq, Abhinav Dhall, Carlos Ivan Colon, Hasan Al-Nashash
Published	2025-02-17 18:22:03+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
