A Note on Implementation Errors in Recent Adaptive Attacks Against Multi-Resolution Self-Ensembles


This note documents an implementation issue in recent adaptive attacks (Zhang et al. [2024]) against the multi-resolution self-ensemble defense (Fort and Lakshminarayanan [2024]). The implementation allowed adversarial perturbations to exceed the standard $L_\infty = 8/255$ bound by up to a factor of 20$\times$, reaching magnitudes of up to $L_\infty = 160/255$. When attacks are properly constrained within the intended bounds, the defense maintains non-trivial robustness. Beyond highlighting the importance of careful validation in adversarial machine learning research, our analysis reveals an intriguing finding: properly bounded adaptive attacks against strong multi-resolution self-ensembles often align with human perception, suggesting the need to reconsider how we measure adversarial robustness.
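The bound discussed above can be checked mechanically. What follows is a minimal sketch of such a check, not the code used by either paper: it assumes images are NumPy arrays with pixel values scaled to $[0, 1]$, and the helper names `linf_norm`, `within_budget`, and `project_linf` are hypothetical, introduced here only for illustration. It measures the $L_\infty$ distance between a clean image and its adversarial counterpart, compares it against the $8/255$ budget, and projects an over-budget example back into the allowed ball.

```python
import numpy as np

# Standard L_inf budget named in the abstract, for pixel values in [0, 1].
EPS = 8 / 255


def linf_norm(x_clean, x_adv):
    """Largest absolute per-pixel difference between the two images."""
    x_clean = np.asarray(x_clean, dtype=np.float64)
    x_adv = np.asarray(x_adv, dtype=np.float64)
    return float(np.max(np.abs(x_adv - x_clean)))


def within_budget(x_clean, x_adv, eps=EPS, tol=1e-6):
    """True if the perturbation stays inside the eps-ball (up to float tolerance)."""
    return linf_norm(x_clean, x_adv) <= eps + tol


def project_linf(x_clean, x_adv, eps=EPS):
    """Clip the perturbation to [-eps, eps], then clip pixels to the valid range [0, 1]."""
    x_clean = np.asarray(x_clean, dtype=np.float64)
    delta = np.clip(np.asarray(x_adv, dtype=np.float64) - x_clean, -eps, eps)
    return np.clip(x_clean + delta, 0.0, 1.0)


if __name__ == "__main__":
    # Illustrative data only: a flat gray image and a perturbation 20x over budget.
    x = np.full((3, 4, 4), 0.2)
    x_bad = x + 20 * EPS  # 20 * 8/255 = 160/255, the magnitude named in the abstract
    print("ratio to budget:", linf_norm(x, x_bad) / EPS)
    print("within budget before projection:", within_budget(x, x_bad))
    print("within budget after projection:", within_budget(x, project_linf(x, x_bad)))
```

Running a check like `within_budget` on the final adversarial examples, rather than only on intermediate attack steps, is one way to catch a perturbation that has drifted outside the intended bound.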


Author: Stanislav Fort
Published: 2025-01-24 13:52:37+00:00

Categories: cs.CR, cs.CV, cs.LG