Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

要約

\ textbf {c} ross-llingual \ textbf {b} ackdoor \ textbf {at}タック（x-bat）を多言語の大手言語モデル（MLLM）で探索し、1つの言語で挿入された背景が共有された埋め込みスペースを介して自動的に他の人に移行する方法を明らかにします。
毒性分類をケーススタディとして使用して、攻撃者が単一の言語でデータを中毒することにより多言語システムを損なうことができることを実証します。
私たちの調査結果は、モデルのアーキテクチャに影響を与える重要な脆弱性を明らかにし、情報の流れ中に隠されたバックドア効果をもたらします。
私たちのコードとデータは、公開されているhttps://github.com/himanshubeniwal/x-batです。

要約(オリジナル)

We explore \textbf{C}ross-lingual \textbf{B}ackdoor \textbf{AT}tacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare and high-occurring tokens serving as specific, effective triggers. Our findings expose a critical vulnerability that influences the model’s architecture, resulting in a concealed backdoor effect during the information flow. Our code and data are publicly available https://github.com/himanshubeniwal/X-BAT.

arxiv情報

著者	Himanshu Beniwal,Sailesh Panda,Birudugadda Srivibhav,Mayank Singh
発行日	2025-05-20 16:45:00+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

最近の投稿

最近のコメント

アーカイブ

カテゴリー