Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models

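Summary

This work investigates the potential of large language models (LLMs) for generating functional test scripts, a task that requires understanding the dynamically evolving code structure of the target software.
To this end, we propose a case-based reasoning (CBR) system built on a 4R cycle (retrieve, reuse, revise, and retain). The system maintains and leverages a case bank of test intent descriptions and their corresponding test scripts to support LLM-based test script generation.
To further improve the user experience, we introduce Re4, an optimization method for the CBR system that consists of reranking-based retrieval finetuning and reinforced reuse finetuning.
Specifically, we first identify positive examples with high semantic and script similarity, which provide reliable pseudo-labels for finetuning the retriever model without costly labeling.
We then apply supervised finetuning followed by a reinforcement learning finetuning stage to align the LLMs with our production scenarios and ensure faithful reuse of the retrieved cases.
Extensive experimental results on two product development units from Huawei Datacom demonstrate the superiority of the proposed CBR+Re4.
Notably, we also show that the proposed Re4 method helps alleviate the repetitive generation issues of LLMs.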
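
The 4R cycle described above can be pictured with a minimal sketch. Everything here is an illustrative assumption rather than the authors' implementation: the `Case` and `CaseBank` names are hypothetical, lexical similarity from `difflib` stands in for the finetuned retriever, and `generate` and `revise` are placeholder callables standing in for the LLM and the review step.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Callable, List


@dataclass
class Case:
    intent: str  # test intent description
    script: str  # corresponding test script


@dataclass
class CaseBank:
    cases: List[Case] = field(default_factory=list)

    def retrieve(self, intent: str, k: int = 2) -> List[Case]:
        """Retrieve: rank stored cases by lexical similarity to the query intent.

        Assumption: a real system would use a learned retriever here.
        """
        def score(case: Case) -> float:
            return SequenceMatcher(None, intent, case.intent).ratio()

        return sorted(self.cases, key=score, reverse=True)[:k]

    def retain(self, case: Case) -> None:
        """Retain: store the accepted case so later queries can reuse it."""
        self.cases.append(case)


def run_4r_cycle(
    bank: CaseBank,
    intent: str,
    generate: Callable[[str, List[Case]], str],
    revise: Callable[[str], str],
) -> str:
    retrieved = bank.retrieve(intent)        # Retrieve
    draft = generate(intent, retrieved)      # Reuse: generation conditioned on cases
    final = revise(draft)                    # Revise: e.g. human or automated fixes
    bank.retain(Case(intent, final))         # Retain
    return final
```

In this sketch the case bank grows by one entry per completed cycle, which is how the stored cases can follow a code base that keeps changing.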

Summary (original)

In this work, we explore the potential of large language models (LLMs) for generating functional test scripts, which necessitates understanding the dynamically evolving code structure of the target software. To achieve this, we propose a case-based reasoning (CBR) system utilizing a 4R cycle (i.e., retrieve, reuse, revise, and retain), which maintains and leverages a case bank of test intent descriptions and corresponding test scripts to facilitate LLMs for test script generation. To improve user experience further, we introduce Re4, an optimization method for the CBR system, comprising reranking-based retrieval finetuning and reinforced reuse finetuning. Specifically, we first identify positive examples with high semantic and script similarity, providing reliable pseudo-labels for finetuning the retriever model without costly labeling. Then, we apply supervised finetuning, followed by a reinforcement learning finetuning stage, to align LLMs with our production scenarios, ensuring the faithful reuse of retrieved cases. Extensive experimental results on two product development units from Huawei Datacom demonstrate the superiority of the proposed CBR+Re4. Notably, we also show that the proposed Re4 method can help alleviate the repetitive generation issues with LLMs.
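
The pseudo-labeling idea in the abstract (treating pairs with both high semantic similarity and high script similarity as positives) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's procedure: `mine_positive_pairs` is a hypothetical helper, `difflib` string similarity stands in for both similarity measures, and the thresholds are arbitrary placeholders.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Case = Tuple[str, str]  # (test intent description, test script)


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an embedding model could replace this."""
    return SequenceMatcher(None, a, b).ratio()


def mine_positive_pairs(
    cases: List[Case],
    intent_threshold: float = 0.6,  # assumed value, not from the paper
    script_threshold: float = 0.6,  # assumed value, not from the paper
) -> List[Tuple[str, str]]:
    """Return (query intent, positive intent) pairs usable as pseudo-labels.

    A candidate counts as positive only if BOTH its intent and its script
    are similar to the query case, so no manual labeling is needed.
    """
    pairs = []
    for i, (intent_i, script_i) in enumerate(cases):
        for j, (intent_j, script_j) in enumerate(cases):
            if i == j:
                continue
            if (
                similarity(intent_i, intent_j) >= intent_threshold
                and similarity(script_i, script_j) >= script_threshold
            ):
                pairs.append((intent_i, intent_j))
    return pairs
```

Requiring agreement on both signals is the point of the sketch: two cases whose scripts share boilerplate but whose intents differ are not paired.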

arXiv information

Authors	Siyuan Guo, Huiwu Liu, Xiaolong Chen, Yuming Xie, Liang Zhang, Tao Han, Hechang Chen, Yi Chang, Jun Wang
Published	2025-03-26 14:23:59+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
