Learning Deep Semantics for Test Completion

Abstract

Writing tests is a time-consuming yet essential task during software development. We propose to leverage recent advances in deep learning for text and code generation to assist developers in writing tests. We formalize the novel task of test completion: automatically completing the next statement in a test method based on the context of prior statements and the code under test. We develop TeCo, a deep learning model that uses code semantics for test completion. The key insight underlying TeCo is that predicting the next statement in a test method requires reasoning about code execution, which is hard to do with only the syntax-level data that existing code completion models use. TeCo extracts and uses six kinds of code semantics data, including the execution result of prior statements and the execution context of the test method. To provide a testbed for this new task, as well as to evaluate TeCo, we collect a corpus of 130,934 test methods from 1,270 open-source Java projects. Our results show that TeCo achieves an exact-match accuracy of 18, which is 29% higher than the best baseline using syntax-level data only. When measuring the functional correctness of the generated next statement, TeCo generates runnable code in 29% of the cases, compared to 18% for the best baseline. Moreover, TeCo is significantly better than prior work on test oracle generation.
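To make the exact-match metric concrete, here is a minimal Python sketch of how such a score could be computed for predicted next statements. The abstract does not specify how statements are tokenized or normalized before comparison, so the whitespace normalization below is an assumption, and the function names and the four Java statements are hypothetical toy data rather than anything taken from the TeCo corpus.

```python
def normalize(statement: str) -> str:
    """Collapse runs of whitespace so formatting differences are not counted as mismatches.

    Assumption: the paper's actual normalization is not described in the abstract.
    """
    return " ".join(statement.split())


def exact_match_accuracy(predictions, references):
    """Return the percentage of predictions identical to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def relative_improvement(new, old):
    """Return the relative gain of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical gold next statements for four test methods (toy data).
references = [
    "assertEquals(3, stack.size());",
    "stack.push(1);",
    "assertTrue(list.isEmpty());",
    "assertNull(map.get(key));",
]
# Hypothetical model predictions for the same four positions.
predictions = [
    "assertEquals(3,  stack.size());",  # differs only in whitespace
    "stack.push(2);",                   # wrong argument
    "assertTrue(list.isEmpty());",      # identical
    "assertNotNull(map.get(key));",     # wrong assertion
]

accuracy = exact_match_accuracy(predictions, references)
print(f"exact-match accuracy: {accuracy:.1f}")
```

Exact match is a strict measure: a prediction that differs from the reference in any token counts as a miss even if it would run and pass. That is why the abstract also reports functional correctness (whether the generated statement is runnable) as a separate figure. The `relative_improvement` helper shows the kind of arithmetic behind a statement such as "29% higher than the best baseline", which describes a relative gain rather than a difference in percentage points.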

arXiv information

Authors	Pengyu Nie, Rahul Banerjee, Junyi Jessy Li, Raymond J. Mooney, Milos Gligoric
Published	2023-03-04 16:57:44+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
