Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code

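Summary

Large language models (LLMs) have rapidly transformed software development, particularly code generation.
However, their performance is inconsistent: they are prone to hallucinations and quality problems, which complicates program comprehension and hinders maintainability.
Prior research suggests that prompt engineering, the practice of designing inputs that steer an LLM toward relevant outputs, can help address these challenges.
To that end, researchers have introduced prompt patterns: structured templates intended to guide users in formulating their requests.
The effect of prompt patterns on code quality, however, has not yet been thoroughly investigated.
Understanding this relationship better is essential for advancing collective knowledge of how to use LLMs effectively for code generation, and thereby for improving understandability in modern software development.
Using the Dev-GPT dataset, this paper empirically investigates the effect of prompt patterns on code quality, specifically maintainability, security, and reliability.
The results show that Zero-Shot prompting is the most common pattern, followed by Zero-Shot with Chain-of-Thought and Few-Shot.
An analysis of 7583 code files across the quality metrics revealed minimal issues, and Kruskal-Wallis tests found no significant differences among patterns, suggesting that prompt structure may not substantially affect these quality metrics in ChatGPT-assisted code generation.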

Summary (original)

Large Language Models (LLMs) have rapidly transformed software development, especially in code generation. However, their inconsistent performance, prone to hallucinations and quality issues, complicates program comprehension and hinders maintainability. Research indicates that prompt engineering, the practice of designing inputs to direct LLMs toward generating relevant outputs, may help address these challenges. In this regard, researchers have introduced prompt patterns, structured templates intended to guide users in formulating their requests. However, the influence of prompt patterns on code quality has yet to be thoroughly investigated. An improved understanding of this relationship would be essential to advancing our collective knowledge on how to effectively use LLMs for code generation, thereby enhancing their understandability in contemporary software development. This paper empirically investigates the impact of prompt patterns on code quality, specifically maintainability, security, and reliability, using the Dev-GPT dataset. Results show that Zero-Shot prompting is most common, followed by Zero-Shot with Chain-of-Thought and Few-Shot. Analysis of 7583 code files across quality metrics revealed minimal issues, with Kruskal-Wallis tests indicating no significant differences among patterns, suggesting that prompt structure may not substantially impact these quality metrics in ChatGPT-assisted code generation.
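
The abstract reports Kruskal-Wallis tests across prompt patterns. As a rough illustration of what such a comparison involves, here is a minimal sketch in plain Python. It is not the authors' analysis code: the function name `kruskal_wallis`, the variable names, and the per-file issue counts are all made up for illustration, and the closed-form p-value only holds for exactly three groups (2 degrees of freedom). A statistics library would normally be used instead.

```python
import math
from collections import Counter


def kruskal_wallis(groups):
    """Return (H, p) for a Kruskal-Wallis test with tie correction.

    Illustrative sketch: the p-value uses the chi-square approximation in
    closed form, which is valid here only for exactly three groups (df = 2).
    """
    if len(groups) != 3:
        raise ValueError("this sketch only supports exactly three groups")

    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign mid-ranks: tied values share the average of their rank positions.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j

    # H statistic computed from the rank sum of each group.
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

    # Tie correction; if every value is identical there is nothing to compare.
    ties = Counter(pooled)
    correction = 1 - sum(t**3 - t for t in ties.values()) / (n**3 - n)
    if correction == 0:
        return 0.0, 1.0
    h = max(h / correction, 0.0)  # guard against tiny negative rounding error

    p = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p


# Hypothetical issue counts per generated file, grouped by prompt pattern.
# These numbers are invented for the example and are NOT from the paper.
zero_shot = [0, 0, 1, 0, 2, 0, 0, 1]
zero_shot_cot = [0, 1, 0, 0, 0, 3, 0]
few_shot = [1, 0, 0, 0, 0, 2]

h_stat, p_value = kruskal_wallis([zero_shot, zero_shot_cot, few_shot])
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the test found no evidence that the groups differ, which is the kind of outcome the abstract describes. The test is rank-based, so it suits skewed count data such as issue counts that are mostly zero.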

arXiv information

Authors	Antonio Della Porta, Stefano Lambiase, Fabio Palomba
Published	2025-04-18 12:37:02+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
