Exploring Distributional Shifts in Large Language Models for Code Analysis

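Summary

We systematically study how well two large language models for code, CodeT5 and Codex, generalize to out-of-domain data.
The study considers two fundamental applications: code summarization and code generation.
Data is split into domains along its natural boundaries: by organization, by project, and by module within a software project.
This makes it trivial to tell in-domain data from out-of-domain data at deployment time.
We establish that samples from each new domain confront both models with a significant distribution shift.
We study how well various established methods can adapt the models to generalize better to new domains.
Our experiments show that multitask learning alone is a reasonable baseline, but combining it with few-shot finetuning on examples retrieved from the training data achieves very strong performance.
In fact, according to our experiments, this solution can outperform direct finetuning in very low-data scenarios.
Finally, we consider variations of this approach to build a more broadly applicable method that adapts to multiple domains at once.
For code generation, we find that a model adapted to multiple domains simultaneously performs on par with models adapted to each domain individually.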
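
The domain split along organization, project, and module boundaries can be pictured with a small sketch. It assumes, purely for illustration, that each sample carries a repository path of the form `organization/project/module/...`; the path layout, the helper names `domain_key` and `split_by_domain`, and the sample paths are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Assumed layout (hypothetical): "organization/project/module/.../file".
# The number of leading path components defines the domain granularity.
LEVEL_DEPTH = {"organization": 1, "project": 2, "module": 3}


def domain_key(path: str, level: str) -> str:
    """Return the domain identifier of a path at the given granularity."""
    parts = path.split("/")
    return "/".join(parts[: LEVEL_DEPTH[level]])


def split_by_domain(paths, level):
    """Group sample paths into domains at the given granularity."""
    groups = defaultdict(list)
    for path in paths:
        groups[domain_key(path, level)].append(path)
    return dict(groups)


# Made-up sample paths, for illustration only.
paths = [
    "acme/parser/lexer/tokens.py",
    "acme/parser/ast/nodes.py",
    "acme/web/routes/api.py",
    "globex/db/orm/models.py",
]

by_org = split_by_domain(paths, "organization")
by_project = split_by_domain(paths, "project")
by_module = split_by_domain(paths, "module")
```

Because the domain is read directly from the path, deciding whether a new sample is in-domain or out-of-domain reduces to a key lookup, which is the sense in which recognition at deployment time is trivial.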

Summary (original)

We systematically study the capacity of two large language models for code – CodeT5 and Codex – to generalize to out-of-domain data. In this study, we consider two fundamental applications – code summarization, and code generation. We split data into domains following its natural boundaries – by an organization, by a project, and by a module within the software project. This makes recognition of in-domain vs out-of-domain data at the time of deployment trivial. We establish that samples from each new domain present both models with a significant challenge of distribution shift. We study how well different established methods can adapt models to better generalize to new domains. Our experiments show that while multitask learning alone is a reasonable baseline, combining it with few-shot finetuning on examples retrieved from training data can achieve very strong performance. In fact, according to our experiments, this solution can outperform direct finetuning for very low-data scenarios. Finally, we consider variations of this approach to create a more broadly applicable method to adapt to multiple domains at once. We find that in the case of code generation, a model adapted to multiple domains simultaneously performs on par with those adapted to each domain individually.
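
As a rough illustration of "examples retrieved from training data", the sketch below ranks a pool of training snippets by similarity to a query from a new domain and keeps the top k as a few-shot set. The similarity measure here is a plain token-overlap (Jaccard) score chosen only to keep the example self-contained; it is an assumption, not the retrieval method used in the paper, and the names `tokens`, `jaccard`, and `retrieve` are hypothetical.

```python
import re


def tokens(code: str) -> set:
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, train_pool: list, k: int) -> list:
    """Return the k training examples most similar to the query.

    Stand-in similarity: token overlap. A real system would more likely
    use learned embeddings; that choice is outside this sketch.
    """
    q = tokens(query)
    ranked = sorted(train_pool, key=lambda ex: jaccard(q, tokens(ex)), reverse=True)
    return ranked[:k]


# Made-up training pool and query, for illustration only.
train_pool = [
    "def add(a, b): return a + b",
    "def read_file(path): return open(path).read()",
    "def sub(a, b): return a - b",
]
query = "def mul(a, b): return a * b"

few_shot_examples = retrieve(query, train_pool, k=2)
```

The retrieved examples would then serve as the small finetuning set for the new domain, in place of labeled samples from that domain itself.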

arXiv information

Authors	Shushan Arakelyan, Rocktim Jyoti Das, Yi Mao, Xiang Ren
Published	2023-03-16 07:45:46+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
