Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection

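Summary

We investigate visual prompt injection (VPI), which maliciously exploits the ability of large vision-language models (LVLMs) to follow instructions drawn onto the input image.
We propose a new VPI method, "goal hijacking via visual prompt injection" (GHVPI), which switches the task an LVLM executes from the original task to an alternative task specified by the attacker.
Quantitative analysis shows that GPT-4V is vulnerable to GHVPI, with a notable attack success rate of 15.8%, which is a security risk that cannot be ignored.
Our analysis also shows that a successful GHVPI requires the LVLM to have both strong character recognition and strong instruction-following ability.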

Abstract (original)

We explore visual prompt injection (VPI) that maliciously exploits the ability of large vision-language models (LVLMs) to follow instructions drawn onto the input image. We propose a new VPI method, ‘goal hijacking via visual prompt injection’ (GHVPI), that swaps the execution task of LVLMs from an original task to an alternative task designated by an attacker. The quantitative analysis indicates that GPT-4V is vulnerable to the GHVPI and demonstrates a notable attack success rate of 15.8%, which is an unignorable security risk. Our analysis also shows that successful GHVPI requires high character recognition capability and instruction-following ability in LVLMs.
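
The attack success rate quoted above can be read as the fraction of trials in which the model carries out the injected task instead of the original one. The following is a minimal sketch of that calculation, not the authors' evaluation code: the `Trial` record, its two flags, and the rule that a trial counts as hijacked only when the injected task is followed and the original task is abandoned are all assumptions made for illustration, and the sample data is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record of one model response (not from the paper)."""
    followed_original_task: bool   # response performs the user's task
    followed_injected_task: bool   # response performs the attacker's task


def attack_success_rate(trials):
    """Return the fraction of trials judged as hijacked.

    Assumption: a trial is hijacked when the response performs the
    injected task and does not perform the original task.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    hijacked = sum(
        1
        for t in trials
        if t.followed_injected_task and not t.followed_original_task
    )
    return hijacked / len(trials)


# Invented data: 8 responses stay on task, 2 are hijacked.
trials = [Trial(True, False)] * 8 + [Trial(False, True)] * 2
rate = attack_success_rate(trials)
print(f"attack success rate: {rate:.1%}")
```

Under a stricter or looser success criterion (for example, counting responses that perform both tasks), the same loop applies with a different condition.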

arXiv information

Authors	Subaru Kimura, Ryota Tanaka, Shumpei Miyawaki, Jun Suzuki, Keisuke Sakaguchi
Published	2024-08-07 05:30:10+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
