M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation

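Summary

Repository-level code completion has attracted considerable attention in software engineering, and several benchmark datasets have been introduced.
However, existing repository-level code completion benchmarks typically cover only a small number of languages (fewer than 5), so they cannot evaluate the general code intelligence of existing code large language models (LLMs) across languages.
In addition, existing benchmarks usually report only an overall score averaged over languages, which overlooks fine-grained abilities in different completion scenarios.
Therefore, to facilitate research on code LLMs in multilingual scenarios, we propose M2RC-EVAL, a massively multilingual repository-level code completion benchmark covering 18 programming languages.
It provides two types of fine-grained annotations (bucket-level and semantic-level) for different completion scenarios, which we obtain from the parsed abstract syntax tree.
We also curate M2RC-INSTRUCT, a massively multilingual instruction corpus, to improve the repository-level code completion abilities of existing code LLMs.
Comprehensive experimental results demonstrate the effectiveness of M2RC-EVAL and M2RC-INSTRUCT.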

Summary (original)

Repository-level code completion has drawn great attention in software engineering, and several benchmark datasets have been introduced. However, existing repository-level code completion benchmarks usually focus on a limited number of languages (<5), which cannot evaluate the general code intelligence abilities across different languages for existing code Large Language Models (LLMs). Besides, the existing benchmarks usually report overall average scores of different languages, where the fine-grained abilities in different completion scenarios are ignored. Therefore, to facilitate the research of code LLMs in multilingual scenarios, we propose a massively multilingual repository-level code completion benchmark covering 18 programming languages (called M2RC-EVAL), and two types of fine-grained annotations (i.e., bucket-level and semantic-level) on different completion scenarios are provided, where we obtain these annotations based on the parsed abstract syntax tree. Moreover, we also curate a massively multilingual instruction corpora M2RC-INSTRUCT dataset to improve the repository-level code completion abilities of existing code LLMs. Comprehensive experimental results demonstrate the effectiveness of our M2RC-EVAL and M2RC-INSTRUCT.

arXiv information

Authors	Jiaheng Liu, Ken Deng, Congnan Liu, Jian Yang, Shukai Liu, He Zhu, Peng Zhao, Linzheng Chai, Yanan Wu, Ke Jin, Ge Zhang, Zekun Wang, Guoan Zhang, Bangyu Xiang, Wenbo Su, Bo Zheng
Published	2024-10-28 15:58:41+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
