Defending Black-box Classifiers by Bayesian Boundary Correction

Summary

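Classifiers based on deep neural networks have recently been shown to be vulnerable to adversarial attacks, and this widespread vulnerability has motivated research on defending them against potential threats.
Given a vulnerable classifier, most existing defense methods are white-box and often require re-training the victim under a modified loss function or training regime.
Because the victim's model, data, and training details are usually unavailable to the user, and because of constraints such as limited computational resources, re-training is unappealing, if not impossible.
To this end, we propose a new black-box defense framework.
It can turn any pre-trained classifier into a resilient one with little knowledge of the model specifics.
This is achieved by a new joint Bayesian treatment of the clean data, the adversarial examples, and the classifier, which maximizes their joint probability.
It is further equipped with a new post-train strategy that keeps the victim intact.
We name the framework Bayesian Boundary Correction (BBC).
BBC is a general and flexible framework that adapts easily to different data types.
We instantiate BBC for image classification and skeleton-based human activity recognition, covering both static and dynamic data.
Exhaustive evaluation shows that, compared with existing defense methods, BBC is more robust and improves robustness without severely hurting clean accuracy.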
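
To make the "post-train" idea more concrete, here is a minimal sketch of one way a frozen classifier could be wrapped with a few small trainable heads whose predictions are averaged. It is an illustration only, not the authors' implementation: `FrozenVictim`, `PostTrainEnsemble`, the linear heads, and the plain gradient step are all hypothetical, and averaging a handful of heads only gestures at Bayesian model averaging rather than performing the joint Bayesian inference described above. The batch passed to `fit_step` is assumed to mix clean and adversarial examples.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class FrozenVictim:
    """Hypothetical stand-in for a pre-trained classifier that is only queried."""

    def __init__(self, n_features, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self._w = rng.normal(size=(n_features, n_classes))

    def logits(self, x):
        return x @ self._w


class PostTrainEnsemble:
    """Several small linear heads appended after the frozen victim (assumed design)."""

    def __init__(self, victim, n_classes, n_heads=5, seed=1):
        rng = np.random.default_rng(seed)
        self.victim = victim
        # Heads start near the identity, so initial outputs stay close to the victim's.
        self.heads = [
            np.eye(n_classes) + 0.01 * rng.normal(size=(n_classes, n_classes))
            for _ in range(n_heads)
        ]

    def predict_proba(self, x):
        z = self.victim.logits(x)
        # Average the class probabilities of all heads.
        return np.mean([softmax(z @ h) for h in self.heads], axis=0)

    def fit_step(self, x, y, lr=0.1):
        # One cross-entropy gradient step per head; the victim's weights are never touched.
        z = self.victim.logits(x)
        onehot = np.eye(z.shape[1])[y]
        for i, h in enumerate(self.heads):
            p = softmax(z @ h)
            grad = z.T @ (p - onehot) / len(x)
            self.heads[i] = h - lr * grad


victim = FrozenVictim(n_features=4, n_classes=3)
model = PostTrainEnsemble(victim, n_classes=3)
rng = np.random.default_rng(2)
x = rng.normal(size=(8, 4))      # assumed mix of clean and adversarial inputs
y = rng.integers(0, 3, size=8)   # their labels
model.fit_step(x, y)
probs = model.predict_proba(x)
```

The design point the sketch tries to capture is that only the appended heads change during training, so the original classifier can be used without access to its training data or internals.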

Abstract (original)

Classifiers based on deep neural networks have been recently challenged by Adversarial Attack, where the widely existing vulnerability has invoked the research in defending them from potential threats. Given a vulnerable classifier, existing defense methods are mostly white-box and often require re-training the victim under modified loss functions/training regimes. While the model/data/training specifics of the victim are usually unavailable to the user, re-training is unappealing, if not impossible for reasons such as limited computational resources. To this end, we propose a new black-box defense framework. It can turn any pre-trained classifier into a resilient one with little knowledge of the model specifics. This is achieved by new joint Bayesian treatments on the clean data, the adversarial examples and the classifier, for maximizing their joint probability. It is further equipped with a new post-train strategy which keeps the victim intact. We name our framework Bayesian Boundary Correction (BBC). BBC is a general and flexible framework that can easily adapt to different data types. We instantiate BBC for image classification and skeleton-based human activity recognition, for both static and dynamic data. Exhaustive evaluation shows that BBC has superior robustness and can enhance robustness without severely hurting the clean accuracy, compared with existing defense methods.

arXiv information

Authors	He Wang, Yunfeng Diao
Published	2023-06-29 14:33:20+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
