Breaking the Silence: the Threats of Using LLMs in Software Engineering

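Summary

Large language models (LLMs) have attracted considerable attention within the software engineering (SE) community, affecting a range of SE tasks from code completion to test generation and from program repair to code summarization.
Despite their promise, researchers still need to be careful, because many intricate factors can influence the results of experiments involving LLMs.
This paper opens a public discussion of potential threats to the validity of LLM-based research, including closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings.
In response, the paper proposes a set of guidelines tailored to SE researchers and language model (LM) providers to mitigate these concerns.
The implications of the guidelines are illustrated with existing good practices followed by LLM providers and with a practical example for SE researchers in the context of test case generation.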

Summary (original)

Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs. This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings. In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns. The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical example for SE researchers in the context of test case generation.

arXiv information

Authors	June Sallou, Thomas Durieux, Annibale Panichella
Published	2024-01-08 14:30:14+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
