Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

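Summary

Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task.
However, the lack of a systematic benchmark hinders the design and development of effective, efficient, and economical LLM-based Text-to-SQL solutions.
To address this, the paper first carries out a systematic and extensive comparison of existing prompt engineering methods, covering question representation, example selection, and example organization, and uses the experimental results to explain the strengths and weaknesses of each.
Based on these findings, the authors propose a new integrated solution called DAIL-SQL, which tops the Spider leaderboard with 86.6% execution accuracy and sets a new bar.
To explore the potential of open-source LLMs, they evaluate these models in a variety of scenarios and further improve their performance with supervised fine-tuning.
This exploration highlights the potential of open-source LLMs for Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning.
In addition, with an efficient and economical LLM-based Text-to-SQL solution in mind, the paper emphasizes token efficiency in prompt engineering and compares prior studies under this metric.
The authors hope the work deepens the understanding of Text-to-SQL with LLMs and encourages further investigation and broad application.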

Abstract (original)

Large language models (LLMs) have emerged as a new paradigm for Text-to-SQL task. However, the absence of a systematical benchmark inhibits the development of designing effective, efficient and economic LLM-based Text-to-SQL solutions. To address this challenge, in this paper, we first conduct a systematical and extensive comparison over existing prompt engineering methods, including question representation, example selection and example organization, and with these experimental results, we elaborate their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLM, we investigate them in various scenarios, and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs’ potential in Text-to-SQL, as well as the advantages and disadvantages of the supervised fine-tuning. Additionally, towards an efficient and economic LLM-based Text-to-SQL solution, we emphasize the token efficiency in prompt engineering and compare the prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs, and inspires further investigations and broad applications.
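The abstract reports results as execution accuracy. As a rough illustration only, the sketch below shows one common way to read that metric: a predicted query counts as correct when it returns the same rows as the reference query on the same database. The table, the queries, and the helper names (`run_query`, `execution_accuracy`, `build_demo_db`) are hypothetical, and the official Spider evaluation is more involved than this.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run a query and return its rows as a multiset, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Counter makes the comparison ignore row order but respect duplicates.
    return Counter(rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        if gold_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?, ?)",
        [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 27)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    pairs = [
        # Different SQL text, same result set: counted as correct.
        ("SELECT name FROM singer WHERE age > 28",
         "SELECT name FROM singer WHERE age >= 30"),
        # Missing filter: result differs, counted as wrong.
        ("SELECT name FROM singer",
         "SELECT name FROM singer WHERE age > 28"),
        # Invalid SQL: execution fails, counted as wrong.
        ("SELEC name FROM singer",
         "SELECT name FROM singer"),
    ]
    print(execution_accuracy(conn, pairs))
```

Comparing results rather than SQL strings means two differently written queries can both be counted as correct, which is why this kind of metric is usually preferred over exact string matching for Text-to-SQL.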

arXiv information

Authors	Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou
Published	2023-09-08 10:13:16+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
