Execution-based Code Generation using Deep Reinforcement Learning

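Summary

Programming language (PL) models pretrained on large-scale code corpora have shown considerable potential for automating software engineering processes, streamlining code generation tasks such as code completion, code translation, and program synthesis.
However, current approaches rely mainly on supervised fine-tuning objectives borrowed from text generation and neglect sequence-level properties specific to code, including but not limited to compilability and syntactic and functional correctness.
To address this limitation, we propose PPOCoder, a new framework for code generation that combines pretrained PL models with Proximal Policy Optimization (PPO) deep reinforcement learning and uses execution feedback as an external source of knowledge during model optimization.
PPOCoder is transferable across different code generation tasks and PLs.
Extensive experiments on three code generation tasks demonstrate the effectiveness of the proposed approach compared with SOTA methods, improving the compilation success rate and functional correctness across different PLs.
The code is available at https://github.com/reddy-lab-code-research/PPOCoder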
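
As a rough illustration of what "execution feedback" can mean as a training signal, the sketch below scores a generated Python snippet in two stages: first whether it compiles, then what fraction of a few unit tests it passes. This is not PPOCoder's actual reward; the function name `execution_reward`, the reward values (-1.0, -0.5, pass fraction), and the test format are all assumptions made for this example. In a PPO setup, a scalar like this would typically serve as the sequence-level reward for a sampled program.

```python
from typing import Any, Sequence, Tuple


def execution_reward(
    code: str,
    func_name: str,
    tests: Sequence[Tuple[tuple, Any]],
) -> float:
    """Hypothetical execution-based reward for a generated Python snippet.

    Returns -1.0 if the code does not compile, -0.5 if it compiles but
    fails at run time (or does not define `func_name`), and otherwise
    the fraction of (args, expected) test cases that pass.
    All reward values are illustrative assumptions.
    """
    if not tests:
        raise ValueError("at least one test case is required")

    # Stage 1: compilability check, without running anything.
    try:
        compiled = compile(code, "<generated>", "exec")
    except (SyntaxError, ValueError):
        return -1.0

    # Stage 2: execute the snippet and run the unit tests.
    # NOTE: exec() on untrusted model output is unsafe; a real system
    # would run this step inside a sandboxed process with a timeout.
    namespace: dict = {}
    try:
        exec(compiled, namespace)
        func = namespace[func_name]
        passed = sum(1 for args, expected in tests if func(*args) == expected)
    except Exception:
        # Any run-time error (including a missing function) ends scoring.
        return -0.5

    return passed / len(tests)
```

For example, with the tests `[((1, 2), 3), ((0, 0), 0)]`, a correct `add` function would score 1.0, a snippet with a syntax error would score -1.0, and a function that subtracts instead of adding would pass only the second test and score 0.5.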

Abstract (original)

The utilization of programming language (PL) models, pretrained on large-scale code corpora, as a means of automating software engineering processes has demonstrated considerable potential in streamlining various code generation tasks such as code completion, code translation, and program synthesis. However, current approaches mainly rely on supervised fine-tuning objectives borrowed from text generation, neglecting specific sequence-level features of code, including but not limited to compilability as well as syntactic and functional correctness. To address this limitation, we propose PPOCoder, a new framework for code generation that combines pretrained PL models with Proximal Policy Optimization (PPO) deep reinforcement learning and employs execution feedback as the external source of knowledge into the model optimization. PPOCoder is transferable across different code generation tasks and PLs. Extensive experiments on three code generation tasks demonstrate the effectiveness of our proposed approach compared to SOTA methods, improving the success rate of compilation and functional correctness over different PLs. Our code can be found at https://github.com/reddy-lab-code-research/PPOCoder .

arXiv information

Authors	Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy
Published	2023-02-13 20:43:41+00:00

Source and services used

arxiv.jp, Google
