Prompt Injection Attacks and Defenses in LLM-Integrated Applications

要約

大規模言語モデル (LLM) は、LLM 統合アプリケーションと呼ばれるさまざまな現実世界のアプリケーションのバックエンドとして導入されることが増えています。
最近の複数の研究では、LLM 統合アプリケーションがプロンプトインジェクション攻撃に対して脆弱であることが示されています。プロンプトインジェクション攻撃とは、攻撃者がアプリケーションの入力に悪意のある命令/データを挿入して、攻撃者の望む結果を生成する攻撃です。
ただし、既存の作品はケーススタディに限られます。
その結果、文献には即時注入攻撃とその防御についての体系的な理解が欠けています。
私たちはこの取り組みでギャップを埋めることを目指しています。
特に、プロンプトインジェクション攻撃を形式化するための一般的なフレームワークを提案します。
研究論文やブログ投稿で議論されている既存の攻撃は、私たちのフレームワークにおける特殊なケースです。
私たちのフレームワークを使用すると、既存の攻撃を組み合わせて新しい攻撃を設計できます。
さらに、プロンプトインジェクション攻撃に対する防御を体系化するフレームワークも提案します。
当社のフレームワークを使用して、10 個の LLM と 7 つのタスクによるプロンプトインジェクション攻撃とその防御に関する体系的な評価を実施します。
私たちのフレームワークがこの分野の将来の研究に刺激を与えることができることを願っています。
私たちのコードは https://github.com/liu00222/Open-Prompt-Injection で入手できます。

要約(オリジナル)

Large Language Models (LLMs) are increasingly deployed as the backend for a variety of real-world applications called LLM-Integrated Applications. Multiple recent works showed that LLM-Integrated Applications are vulnerable to prompt injection attacks, in which an attacker injects malicious instruction/data into the input of those applications such that they produce results as the attacker desires. However, existing works are limited to case studies. As a result, the literature lacks a systematic understanding of prompt injection attacks and their defenses. We aim to bridge the gap in this work. In particular, we propose a general framework to formalize prompt injection attacks. Existing attacks, which are discussed in research papers and blog posts, are special cases in our framework. Our framework enables us to design a new attack by combining existing attacks. Moreover, we also propose a framework to systematize defenses against prompt injection attacks. Using our frameworks, we conduct a systematic evaluation on prompt injection attacks and their defenses with 10 LLMs and 7 tasks. We hope our frameworks can inspire future research in this field. Our code is available at https://github.com/liu00222/Open-Prompt-Injection.

arxiv情報

著者	Yupei Liu,Yuqi Jia,Runpeng Geng,Jinyuan Jia,Neil Zhenqiang Gong
発行日	2023-10-19 15:12:09+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

Prompt Injection Attacks and Defenses in LLM-Integrated Applications

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

最近の投稿

最近のコメント

アーカイブ

カテゴリー