GenDFIR: Advancing Cyber Incident Timeline Analysis Through Retrieval Augmented Generation and Large Language Models

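Summary

Cyber timeline analysis, also known as forensic timeline analysis, is crucial in Digital Forensics and Incident Response (DFIR).
It examines artefacts and events, particularly timestamps and metadata, to detect anomalies, establish correlations, and reconstruct incident timelines.
Traditional methods rely on structured artefacts, such as logs and filesystem metadata, and use specialised tools for evidence identification and feature extraction.
This paper introduces GenDFIR, a framework that leverages large language models (LLMs), specifically Llama 3.1 8B in zero-shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent.
Incident data is preprocessed into a structured knowledge base, enabling the RAG agent to retrieve relevant events based on user prompts.
The LLM interprets this context and provides semantic enrichment.
Tests on synthetic data in a controlled environment demonstrate GenDFIR's reliability and robustness and show the potential of LLMs to automate timeline analysis and advance threat detection.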

Abstract (original)

Cyber timeline analysis, or forensic timeline analysis, is crucial in Digital Forensics and Incident Response (DFIR). It examines artefacts and events, particularly timestamps and metadata, to detect anomalies, establish correlations, and reconstruct incident timelines. Traditional methods rely on structured artefacts, such as logs and filesystem metadata, using specialised tools for evidence identification and feature extraction. This paper introduces GenDFIR, a framework leveraging large language models (LLMs), specifically Llama 3.1 8B in zero-shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent. Incident data is preprocessed into a structured knowledge base, enabling the RAG agent to retrieve relevant events based on user prompts. The LLM interprets this context, offering semantic enrichment. Tested on synthetic data in a controlled environment, GenDFIR shows reliability and robustness in the reported results, showcasing LLMs' potential to automate timeline analysis and advance threat detection.
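
The retrieve-then-interpret flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event records, field names (`timestamp`, `source`, `message`), and helper functions are hypothetical, and a simple term-frequency cosine similarity stands in for whatever retrieval method the RAG agent actually uses. The LLM call itself is omitted; the sketch stops at the prompt that would be passed to the model.

```python
import math
import re
from collections import Counter
from datetime import datetime

# Hypothetical incident events standing in for the structured knowledge base.
# The schema is an assumption; the abstract does not specify one.
EVENTS = [
    {"timestamp": "2024-05-01T09:15:00", "source": "auth.log",
     "message": "Failed SSH login for user admin from 203.0.113.7"},
    {"timestamp": "2024-05-01T09:17:30", "source": "auth.log",
     "message": "Successful SSH login for user admin from 203.0.113.7"},
    {"timestamp": "2024-05-01T08:00:00", "source": "cron.log",
     "message": "Scheduled backup job completed"},
    {"timestamp": "2024-05-01T09:20:10", "source": "fs.meta",
     "message": "File /etc/passwd modified by user admin"},
]


def tokenize(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, events, k=3):
    """Return up to k events most similar to the query, in chronological order."""
    q = tokenize(query)
    scored = [(cosine(q, tokenize(e["message"])), e) for e in events]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    hits = [event for score, event in ranked if score > 0]
    # Timeline analysis needs events ordered by time, not by relevance.
    return sorted(hits, key=lambda e: datetime.fromisoformat(e["timestamp"]))


def build_prompt(query, retrieved):
    """Assemble the context block and question that would be sent to the LLM."""
    context = "\n".join(
        f'{e["timestamp"]} [{e["source"]}] {e["message"]}' for e in retrieved
    )
    return (
        "Incident events (chronological):\n"
        f"{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the events above."
    )
```

Calling `retrieve("SSH login for user admin", EVENTS)` and passing the result to `build_prompt` yields a chronologically ordered context in which the unrelated backup event is filtered out. Sorting the retrieved events by timestamp before prompting is a design choice made here so that the model sees the incident in the order it unfolded.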

arXiv information

Authors	Fatma Yasmine Loumachi, Mohamed Chahine Ghanem, Mohamed Amine Ferrag
Published	2024-12-27 13:29:14+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
