Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study

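Summary

Although a variety of approaches are used to detect vulnerabilities, the number of reported vulnerabilities has been rising year over year.
This suggests that problems are not being caught before code is released. Possible causes include a lack of awareness, the limited effectiveness of existing vulnerability detection tools, and tools that are not user-friendly.
To address some of the problems with traditional vulnerability detection tools, the authors propose using large language models (LLMs) to help find vulnerabilities in source code.
LLMs have shown a remarkable ability to understand and generate code, which underlines their potential for code-related tasks.
The aim is to test multiple state-of-the-art LLMs and identify the best prompting strategies, so that the most value can be extracted from the LLMs.
The study gives an overview of the strengths and weaknesses of the LLM-based approach and compares its results with those of traditional static analysis tools.
The authors find that LLMs can pinpoint many more issues than traditional static analysis tools, outperforming them in terms of recall and F1 score.
The results should benefit software developers and security analysts who are responsible for ensuring that code is free of vulnerabilities.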

Summary (original)

Despite various approaches being employed to detect vulnerabilities, the number of reported vulnerabilities shows an upward trend over the years. This suggests the problems are not caught before the code is released, which could be caused by many factors, like lack of awareness, limited efficacy of the existing vulnerability detection tools or the tools not being user-friendly. To help combat some issues with traditional vulnerability detection tools, we propose using large language models (LLMs) to assist in finding vulnerabilities in source code. LLMs have shown a remarkable ability to understand and generate code, underlining their potential in code-related tasks. The aim is to test multiple state-of-the-art LLMs and identify the best prompting strategies, allowing extraction of the best value from the LLMs. We provide an overview of the strengths and weaknesses of the LLM-based approach and compare the results to those of traditional static analysis tools. We find that LLMs can pinpoint many more issues than traditional static analysis tools, outperforming traditional tools in terms of recall and F1 scores. The results should benefit software developers and security analysts responsible for ensuring that the code is free of vulnerabilities.
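The abstract compares LLMs and static analysis tools by recall and F1 score. As a reminder of how those metrics follow from detection counts, here is a minimal sketch. The function name `precision_recall_f1` and all counts are hypothetical and chosen only for illustration; they are not results from the paper, and the paper's own evaluation setup may differ.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    tp: vulnerabilities correctly reported
    fp: reports that are not real vulnerabilities
    fn: real vulnerabilities that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (assumed, not from the paper):
# a detector that flags more code finds more real issues (higher recall)
# at the cost of more false alarms (lower precision).
llm_scores = precision_recall_f1(tp=80, fp=40, fn=20)
static_tool_scores = precision_recall_f1(tp=30, fp=10, fn=70)

for name, (p, r, f1) in [("LLM", llm_scores), ("static tool", static_tool_scores)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

With these made-up counts, the tool with lower precision still ends up with the higher F1 because its recall is much higher, which is the kind of trade-off a recall and F1 comparison captures.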

arXiv information

Authors	Karl Tamberg, Hayretdin Bahsi
Published	2024-05-24 14:59:19+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google