Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

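Abstract

Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task.
However, the absence of a systematic benchmark hinders the design of effective, efficient, and economical LLM-based Text-to-SQL solutions.
To address this challenge, this paper first conducts a systematic and extensive comparison of existing prompt engineering methods, covering question representation, example selection, and example organization, and uses the experimental results to detail their strengths and weaknesses.
Based on these findings, we propose a new integrated solution, DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar.
Toward an efficient and economical LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare prior studies under this metric.
Additionally, we investigate open-source LLMs in in-context learning and further enhance their performance with task-specific supervised fine-tuning.
Our explorations highlight the potential of open-source LLMs in Text-to-SQL, as well as the advantages and disadvantages of task-specific supervised fine-tuning.
We hope that our work provides a deeper understanding of Text-to-SQL with LLMs and inspires further investigation and broad application.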
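
To make the three prompt engineering components named above more concrete, here is a minimal Python sketch of how example selection, example organization, and question representation might fit together, with a rough token count for comparing prompts. It is an illustration only and is not the DAIL-SQL method, which the abstract does not describe in detail. The example pool, the word-overlap similarity, the prompt layout, and every function name are assumptions made for this sketch.

```python
import re

# Hypothetical pool of (question, SQL) examples; not taken from the paper.
EXAMPLES = [
    ("How many singers are there?", "SELECT count(*) FROM singer"),
    ("List the names of all stadiums.", "SELECT name FROM stadium"),
    ("What is the average age of singers?", "SELECT avg(age) FROM singer"),
]


def words(text):
    """Lowercased word set; a crude stand-in for a real similarity model."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two questions (assumed metric)."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def select_examples(question, pool, k):
    """Example selection: keep the k examples whose questions overlap most."""
    ranked = sorted(pool, key=lambda ex: jaccard(question, ex[0]), reverse=True)
    return ranked[:k]


def build_prompt(schema, question, examples):
    """Example organization as question/SQL pairs, followed by the
    question representation (schema plus target question)."""
    parts = [f"/* Question: {q} */\n{sql}" for q, sql in examples]
    parts.append(f"/* Schema: */\n{schema}\n/* Question: {question} */\nSELECT")
    return "\n\n".join(parts)


def rough_token_count(prompt):
    """Whitespace split only; a real tokenizer would give different numbers."""
    return len(prompt.split())


schema = "CREATE TABLE singer (singer_id INT, name TEXT, age INT)"
question = "How many singers are older than 30?"
chosen = select_examples(question, EXAMPLES, k=2)
prompt = build_prompt(schema, question, chosen)
print(prompt)
print("approximate tokens:", rough_token_count(prompt))
```

Changing `k` or the prompt layout and re-running `rough_token_count` gives a simple way to see how prompt design choices trade off against prompt length.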

Abstract (original)

Large language models (LLMs) have emerged as a new paradigm for Text-to-SQL task. However, the absence of a systematical benchmark inhibits the development of designing effective, efficient and economic LLM-based Text-to-SQL solutions. To address this challenge, in this paper, we first conduct a systematical and extensive comparison over existing prompt engineering methods, including question representation, example selection and example organization, and with these experimental results, we elaborates their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. Towards an efficient and economic LLM-based Text-to-SQL solution, we emphasize the token efficiency in prompt engineering and compare the prior studies under this metric. Additionally, we investigate open-source LLMs in in-context learning, and further enhance their performance with task-specific supervised fine-tuning. Our explorations highlight open-source LLMs’ potential in Text-to-SQL, as well as the advantages and disadvantages of the task-specific supervised fine-tuning. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs, and inspire further investigations and broad applications.

arXiv information

Authors	Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou
Published	2023-08-29 14:59:54+00:00

Source and services used

arxiv.jp, Google
