LLMmap: Fingerprinting For Large Language Models

Summary

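We introduce LLMmap, a first-generation fingerprinting attack targeting LLM-integrated applications.
LLMmap takes an active fingerprinting approach: it sends carefully crafted queries to the application and analyzes the responses to identify the specific LLM in use.
With as few as 8 interactions, LLMmap identifies the underlying LLM with over 95% accuracy.
More importantly, LLMmap is designed to be robust across application layers: it can identify LLMs running under different system prompts, stochastic sampling hyperparameters, and even complex generation frameworks such as RAG or Chain-of-Thought.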

Summary (original)

We introduce LLMmap, a first-generation fingerprinting attack targeted at LLM-integrated applications. LLMmap employs an active fingerprinting approach, sending carefully crafted queries to the application and analyzing the responses to identify the specific LLM model in use. With as few as 8 interactions, LLMmap can accurately identify LLMs with over 95% accuracy. More importantly, LLMmap is designed to be robust across different application layers, allowing it to identify LLMs operating under various system prompts, stochastic sampling hyperparameters, and even complex generation frameworks such as RAG or Chain-of-Thought.
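The active fingerprinting loop described in the abstract (send probe queries, collect the responses, compare them against known models) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the probe prompts, the `ask` callable, and the bag-of-words cosine matching are stand-ins, not the query set or inference model used by LLMmap.

```python
import math
from collections import Counter

# Hypothetical probe prompts; the paper's actual queries are not reproduced here.
PROBES = ["Who created you?", "What is your knowledge cutoff?"]


def bag_of_words(text):
    """Turn a response into a lowercase word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fingerprint(ask, probes=PROBES):
    """Send every probe through `ask` and merge the responses into one vector.

    `ask` is a hypothetical callable (prompt -> response text) that wraps
    whatever interface the target application exposes.
    """
    counts = Counter()
    for prompt in probes:
        counts += bag_of_words(ask(prompt))
    return counts


def identify(ask, references):
    """Return the name of the reference fingerprint closest to the target's."""
    observed = fingerprint(ask)
    return max(references, key=lambda name: cosine(observed, references[name]))
```

In this sketch `references` maps a model name to a fingerprint collected in advance from a known model with the same probes. Simple word-overlap matching like this would likely be fragile under the varying system prompts and sampling settings the abstract mentions, which is the robustness problem LLMmap is designed to address.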

arXiv information

Authors	Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese
Published	2024-07-24 16:07:00+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
