CodeCloak: A Method for Evaluating and Mitigating Code Leakage by LLM Code Assistants

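Summary

LLM-based code assistants are becoming increasingly popular among developers.
These tools help developers code more efficiently and make fewer errors by providing real-time suggestions based on the developer's codebase.
Useful as they are, these tools can inadvertently expose the developer's proprietary code to the code assistant service provider during development.
This work proposes a method for mitigating the risk of code leakage when using LLM-based code assistants.
CodeCloak is a novel deep reinforcement learning agent that manipulates prompts before they are sent to the code assistant service.
CodeCloak aims to achieve two conflicting goals: (i) minimizing code leakage while (ii) preserving relevant and useful suggestions for the developer.
The evaluation, which uses the LLM-based code assistant models StarCoder and Code Llama, demonstrates CodeCloak's effectiveness on a diverse set of code repositories of varying sizes, as well as its transferability across different models.
The authors also designed a method for reconstructing the developer's original codebase from the code segments (i.e., prompts) sent to the code assistant service during development, in order to thoroughly analyze code leakage risks and evaluate CodeCloak's effectiveness under practical development scenarios.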
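
The tension between the two goals can be illustrated with a toy scoring function. This is only a sketch under stated assumptions: the abstract does not specify CodeCloak's reward, its leakage metric, or its usefulness metric, so `toy_reward`, `similarity`, and `leakage_weight` are hypothetical names, and character-level similarity from Python's `difflib` stands in for whatever measures the authors actually use.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in metric, not the paper's."""
    return SequenceMatcher(None, a, b).ratio()


def toy_reward(original_prompt: str, cloaked_prompt: str,
               original_suggestion: str, cloaked_suggestion: str,
               leakage_weight: float = 0.5) -> float:
    """Hypothetical score trading off leakage against suggestion usefulness.

    Leakage is approximated by how much of the original prompt survives in
    the manipulated prompt; usefulness by how close the suggestion for the
    manipulated prompt is to the suggestion for the original prompt.
    """
    if not 0.0 <= leakage_weight <= 1.0:
        raise ValueError("leakage_weight must be in [0, 1]")
    leakage = similarity(original_prompt, cloaked_prompt)
    usefulness = similarity(original_suggestion, cloaked_suggestion)
    # Low leakage and high usefulness both push the score up.
    return leakage_weight * (1.0 - leakage) + (1.0 - leakage_weight) * usefulness
```

Under this toy score, sending the prompt unchanged earns nothing on the leakage term, while a heavily rewritten prompt only scores well if the resulting suggestion stays close to the one the original prompt would have produced.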

Abstract (original)

LLM-based code assistants are becoming increasingly popular among developers. These tools help developers improve their coding efficiency and reduce errors by providing real-time suggestions based on the developer’s codebase. While beneficial, the use of these tools can inadvertently expose the developer’s proprietary code to the code assistant service provider during the development process. In this work, we propose a method to mitigate the risk of code leakage when using LLM-based code assistants. CodeCloak is a novel deep reinforcement learning agent that manipulates the prompts before sending them to the code assistant service. CodeCloak aims to achieve the following two contradictory goals: (i) minimizing code leakage, while (ii) preserving relevant and useful suggestions for the developer. Our evaluation, employing StarCoder and Code Llama, LLM-based code assistants models, demonstrates CodeCloak’s effectiveness on a diverse set of code repositories of varying sizes, as well as its transferability across different models. We also designed a method for reconstructing the developer’s original codebase from code segments sent to the code assistant service (i.e., prompts) during the development process, to thoroughly analyze code leakage risks and evaluate the effectiveness of CodeCloak under practical development scenarios.

arXiv information

Authors	Amit Finkman Noah, Avishag Shapira, Eden Bar Kochva, Inbar Maimon, Dudu Mimran, Yuval Elovici, Asaf Shabtai
Published	2024-10-29 13:43:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
