Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic

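Summary

Encrypted network communication provides confidentiality, integrity, and privacy between endpoints.
However, attackers increasingly exploit encryption to hide malicious behavior.
Detecting previously unseen encrypted malicious traffic without decrypting the payload remains a major challenge.
This study investigates how explainable artificial intelligence (XAI) techniques can be integrated into the detection of malicious network traffic.
The authors use ensemble learning models that identify malicious activity from multi-view features extracted from different aspects of encrypted communication.
To represent malicious communication effectively, they compiled a dataset of 1,127 unique connections spanning 54 malware families, which they report is more than any other available open-source dataset.
The models were benchmarked on the CTU-13 dataset, where they achieved over 99% accuracy, precision, and F1-score.
In addition, the eXtreme Gradient Boosting (XGB) model reached 99.32% accuracy, 99.53% precision, and a 99.43% F1-score on the custom dataset.
Using Shapley Additive Explanations (SHAP), the authors found that the maximum packet size, the mean inter-arrival time of packets, and the Transport Layer Security (TLS) version in use are the most important features for the global model explanation.
Key features were also identified as important for the local explanations of individual traffic samples across both datasets.
These insights give a deeper understanding of the model's decision-making process and improve the transparency and reliability of malicious encrypted traffic detection.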
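To make the SHAP attribution idea concrete, the following is a minimal sketch of exact Shapley values on a toy scoring function. It is not the authors' pipeline: `toy_score`, its thresholds and weights, the feature names, and the baseline and sample values are all hypothetical, chosen only to echo the three features named in the abstract. Libraries such as SHAP estimate the same quantity efficiently for real models; here it is computed by brute force over all feature orderings.

```python
from itertools import permutations
from math import factorial

# Hypothetical feature names, loosely following the features named in the abstract.
FEATURES = ("max_packet_size", "mean_inter_arrival_time", "tls_version")


def toy_score(sample):
    """Hypothetical stand-in for a trained classifier's maliciousness score."""
    score = 0.0
    if sample["max_packet_size"] > 1400:
        score += 0.4
    if sample["mean_inter_arrival_time"] < 0.05:
        score += 0.2
    if sample["tls_version"] < 1.2:
        score += 0.3
    # Interaction term: large packets combined with an old TLS version.
    if sample["max_packet_size"] > 1400 and sample["tls_version"] < 1.2:
        score += 0.1
    return score


def shapley_values(model, sample, baseline):
    """Exact Shapley values: the average marginal contribution of each
    feature over every ordering in which features switch from the
    baseline value to the sample value."""
    totals = {name: 0.0 for name in FEATURES}
    for order in permutations(FEATURES):
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = sample[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    n_orders = factorial(len(FEATURES))
    return {name: total / n_orders for name, total in totals.items()}


# Hypothetical reference point ("typical benign" flow) and flow to explain.
baseline = {"max_packet_size": 600, "mean_inter_arrival_time": 0.5, "tls_version": 1.3}
sample = {"max_packet_size": 1500, "mean_inter_arrival_time": 0.01, "tls_version": 1.0}

phi = shapley_values(toy_score, sample, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:.3f}")
```

The attributions for one sample form a local explanation, and they sum to the difference between the sample's score and the baseline's score. A global ranking like the one reported in the abstract is typically obtained by averaging the absolute attributions over many samples.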

Summary (original)

Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. However, attackers are increasingly exploiting encryption to conceal malicious behavior. Detecting unknown encrypted malicious traffic without decrypting the payloads remains a significant challenge. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication. To effectively represent malicious communication, we compiled a robust dataset with 1,127 unique connections, more than any other available open-source dataset, and spanning 54 malware families. Our models were benchmarked against the CTU-13 dataset, achieving performance of over 99% accuracy, precision, and F1-score. Additionally, the eXtreme Gradient Boosting (XGB) model demonstrated 99.32% accuracy, 99.53% precision, and 99.43% F1-score on our custom dataset. By leveraging Shapley Additive Explanations (SHAP), we identified that the maximum packet size, mean inter-arrival time of packets, and transport layer security version used are the most critical features for the global model explanation. Furthermore, key features were identified as important for local explanations across both datasets for individual traffic samples. These insights provide a deeper understanding of the model decision-making process, enhancing the transparency and reliability of detecting malicious encrypted traffic.
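The abstract reports accuracy, precision, and F1-score as separate figures. As a reminder of how the three differ, here is a small sketch that derives them from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the paper's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives, and
    false negatives, with "positive" meaning a flow flagged as malicious.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy counts every correct decision, precision looks only at the flows flagged as malicious, and F1 is the harmonic mean of precision and recall, so it drops when malicious flows are missed even if accuracy stays high.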

arXiv information

Authors	Sileshi Nibret Zeleke, Amsalu Fentie Jember, Mario Bochicchio
Published	2025-01-09 17:21:00+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
