A Declarative System for Optimizing AI Workloads

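Summary

A long-standing goal of data management systems has been to build systems that can compute quantitative insights over large volumes of unstructured data in a cost-effective manner.
Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora.
Today's models can perform these tasks with high accuracy.
However, a programmer who wants to answer a substantive AI-powered query must orchestrate a large number of models, prompts, and data operations.
Even for a single query, the programmer has to make a vast number of decisions, including the choice of model, the right inference method, the most cost-effective inference hardware, and the ideal prompt design.
The optimal set of decisions can change as the query changes and as the rapidly evolving technical landscape shifts.
This paper presents Palimpzest, a system that lets anyone process AI-powered analytical queries simply by defining them in a declarative language.
The system uses a cost optimization framework to implement the query plan with the best trade-offs between runtime, financial cost, and output data quality.
The paper describes the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself.
The authors evaluate Palimpzest on tasks in legal discovery, real estate search, and medical schema matching.
They show that even the simple prototype offers a range of appealing plans, including one that is 3.3x faster and 2.9x cheaper than the baseline method while also offering better data quality.
With parallelism enabled, Palimpzest can produce plans with up to a 90.3x speedup at 9.1x lower cost relative to a single-threaded GPT-4 baseline, while obtaining an F1 score within 83.5% of the baseline.
These gains require no additional work by the user.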
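
The trade-off between runtime, financial cost, and output quality can be made concrete with a small sketch. The code below is not Palimpzest's API or its optimizer; it is a minimal illustration, assuming each candidate plan already carries estimated metrics, of how dominated plans could be filtered out and one plan chosen under a cost budget. The `Plan` class, the function names, and every number are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Plan:
    """Hypothetical candidate plan with estimated metrics (not Palimpzest's API)."""
    name: str
    runtime_s: float  # estimated wall-clock time, lower is better
    cost_usd: float   # estimated spend, lower is better
    quality: float    # estimated output quality in [0, 1], higher is better


def dominates(a: Plan, b: Plan) -> bool:
    """True if `a` is no worse than `b` on every metric and strictly better on one."""
    no_worse = (a.runtime_s <= b.runtime_s
                and a.cost_usd <= b.cost_usd
                and a.quality >= b.quality)
    strictly_better = (a.runtime_s < b.runtime_s
                       or a.cost_usd < b.cost_usd
                       or a.quality > b.quality)
    return no_worse and strictly_better


def pareto_frontier(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


def best_quality_within_budget(plans: List[Plan], max_cost_usd: float) -> Optional[Plan]:
    """Pick the highest-quality non-dominated plan that fits the cost budget."""
    affordable = [p for p in pareto_frontier(plans) if p.cost_usd <= max_cost_usd]
    return max(affordable, key=lambda p: p.quality) if affordable else None


# Made-up estimates, for illustration only.
candidates = [
    Plan("large-model", runtime_s=900.0, cost_usd=12.0, quality=0.90),
    Plan("small-model", runtime_s=200.0, cost_usd=1.5, quality=0.70),
    Plan("small-model-slow", runtime_s=400.0, cost_usd=2.0, quality=0.65),
    Plan("mixed", runtime_s=350.0, cost_usd=4.0, quality=0.85),
]

frontier_names = [p.name for p in pareto_frontier(candidates)]
chosen = best_quality_within_budget(candidates, max_cost_usd=5.0)
```

Here `small-model-slow` drops out because `small-model` is faster, cheaper, and higher quality. Other selection policies, such as minimizing cost subject to a quality floor, would follow the same pattern.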

Abstract (original)

A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large corpora of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today’s models can accomplish these tasks with high accuracy. However, a programmer who wants to answer a substantive AI-powered query must orchestrate large numbers of models, prompts, and data operations. For even a single query, the programmer has to make a vast number of decisions such as the choice of model, the right inference method, the most cost-effective inference hardware, the ideal prompt design, and so on. The optimal set of decisions can change as the query changes and as the rapidly-evolving technical landscape shifts. In this paper we present Palimpzest, a system that enables anyone to process AI-powered analytical queries simply by defining them in a declarative language. The system uses its cost optimization framework to implement the query plan with the best trade-offs between runtime, financial cost, and output data quality. We describe the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself. We evaluate Palimpzest on tasks in Legal Discovery, Real Estate Search, and Medical Schema Matching. We show that even our simple prototype offers a range of appealing plans, including one that is 3.3x faster and 2.9x cheaper than the baseline method, while also offering better data quality. With parallelism enabled, Palimpzest can produce plans with up to a 90.3x speedup at 9.1x lower cost relative to a single-threaded GPT-4 baseline, while obtaining an F1-score within 83.5% of the baseline. These require no additional work by the user.

arXiv information

Authors	Chunwei Liu, Matthew Russo, Michael Cafarella, Lei Cao, Peter Baille Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano
Published	2024-05-29 15:27:07+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
