Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

Summary

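Large language models (LLMs) have made remarkable progress in automated code generation.
Integrating LLM-based code generation into real software projects remains difficult, however, because the generated code may misuse APIs, classes, or data structures, or may lack project-specific information.
Most of this project-specific context does not fit into an LLM prompt, so the model needs a way to explore the project-level code context.
To that end, the paper proposes ProCoder, a new approach that iteratively refines the project-level code context, guided by compiler feedback, in order to generate precise code.
Specifically, ProCoder first uses compiler techniques to identify mismatches between the generated code and the project's context.
It then iteratively aligns and fixes the identified errors using information extracted from the code repository.
The authors integrate ProCoder with two representative LLMs, GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation.
Experimental results show that ProCoder improves on the vanilla LLMs by more than 80% when generating code that depends on project context, and that it consistently outperforms existing retrieval-based code generation baselines.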

Summary (original)

Large language models (LLMs) have shown remarkable progress in automated code generation. Yet, incorporating LLM-based code generation into real-life software projects poses challenges, as the generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. To this end, this paper puts forward a novel approach, termed ProCoder, which iteratively refines the project-level code context for precise code generation, guided by the compiler feedback. In particular, ProCoder first leverages compiler techniques to identify a mismatch between the generated code and the project’s context. It then iteratively aligns and fixes the identified errors using information extracted from the code repository. We integrate ProCoder with two representative LLMs, i.e., GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation. Experimental results show that ProCoder significantly improves the vanilla LLMs by over 80% in generating code dependent on project context, and consistently outperforms the existing retrieval-based code generation baselines.
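The loop described in the abstract (generate, check the code against the project, retrieve missing context, regenerate) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the "compiler feedback" can be approximated by a simple `ast`-based check for names unknown to the project, that retrieval can be approximated by fuzzy name matching with `difflib`, and that the LLM is replaced by a stub. The names `find_unknown_names`, `refine`, `fake_llm`, and `PROJECT_SYMBOLS` are all hypothetical.

```python
import ast
import builtins
import difflib
from typing import Callable, Dict, List

# Hypothetical generator signature: (task, retrieved_context) -> code.
Generator = Callable[[str, List[str]], str]


def find_unknown_names(code: str, project_symbols: Dict[str, str]) -> List[str]:
    """Return names the code reads that are not builtins, not defined
    in the code itself, and not part of the project symbol table."""
    tree = ast.parse(code)  # raises SyntaxError on invalid code
    known = set(dir(builtins)) | set(project_symbols)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            known.add(node.name)
        elif isinstance(node, ast.arg):
            known.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            known.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                known.add((alias.asname or alias.name).split(".")[0])
    unknown: List[str] = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
                and node.id not in known and node.id not in unknown):
            unknown.append(node.id)
    return unknown


def refine(generate: Generator, task: str, project_symbols: Dict[str, str],
           max_rounds: int = 3) -> str:
    """Regenerate code until the check passes or max_rounds is reached."""
    context: List[str] = []
    code = generate(task, context)
    for _ in range(max_rounds):
        unknown = find_unknown_names(code, project_symbols)
        if not unknown:
            break  # no mismatch with the project context was detected
        for name in unknown:
            # Assumed retrieval step: fuzzy-match the bad name against
            # project symbols and add the matching signatures as context.
            for match in difflib.get_close_matches(
                    name, list(project_symbols), n=2, cutoff=0.5):
                if project_symbols[match] not in context:
                    context.append(project_symbols[match])
        code = generate(task, context)
    return code


# Toy project symbol table (name -> signature), purely illustrative.
PROJECT_SYMBOLS = {
    "load_config": "def load_config(path: str) -> dict",
    "save_config": "def save_config(cfg: dict, path: str) -> None",
}


def fake_llm(task: str, context: List[str]) -> str:
    """Stand-in for an LLM: guesses a wrong API name until it is shown
    the real signature in the retrieved context."""
    if any("load_config" in sig for sig in context):
        return "cfg = load_config('app.toml')\n"
    return "cfg = load_cfg('app.toml')\n"


print(refine(fake_llm, "load the app config", PROJECT_SYMBOLS))
```

In this toy run the first draft calls a function that does not exist in the project; the check flags it, the closest project signatures are added to the context, and the second draft uses the real API. A real system would replace the stub with an actual model call and the name check with proper static analysis.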

arXiv information

Authors	Zhangqian Bi, Yao Wan, Zheng Wang, Hongyu Zhang, Batu Guan, Fangxin Lu, Zili Zhang, Yulei Sui, Xuanhua Shi, Hai Jin
Published	2024-03-25 14:07:27+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
