DataPerf: Benchmarks for Data-Centric AI Development

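Summary

Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks regardless of the breadth, difficulty, and fidelity of the underlying problems.
Neglecting the fundamental importance of data leads to inaccuracy, bias, and fragility in real-world applications, and saturation across existing dataset benchmarks hinders research.
In response, we present DataPerf, a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We aim to foster innovation in data-centric AI through competition, comparability, and reproducibility.
We enable the ML community to iterate on datasets, not just architectures, and we provide an open online platform with multiple rounds of challenges to support this iterative development.
The first iteration of DataPerf contains five benchmarks covering a wide range of data-centric techniques, tasks, and modalities in vision, speech, acquisition, debugging, and diffusion prompting, and it supports hosting new benchmarks contributed by the community.
The benchmarks, online evaluation platform, and baseline implementations are open source, and the MLCommons Association will maintain DataPerf to ensure long-term benefits to academia and industry.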

Abstract (original)

Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing dataset benchmarks. In response, we present DataPerf, a community-led benchmark suite for evaluating ML datasets and data-centric algorithms. We aim to foster innovation in data-centric AI through competition, comparability, and reproducibility. We enable the ML community to iterate on datasets, instead of just architectures, and we provide an open, online platform with multiple rounds of challenges to support this iterative development. The first iteration of DataPerf contains five benchmarks covering a wide spectrum of data-centric techniques, tasks, and modalities in vision, speech, acquisition, debugging, and diffusion prompting, and we support hosting new contributed benchmarks from the community. The benchmarks, online evaluation platform, and baseline implementations are open source, and the MLCommons Association will maintain DataPerf to ensure long-term benefits to academia and industry.

arXiv information

Authors	Mark Mazumder,Colby Banbury,Xiaozhe Yao,Bojan Karlaš,William Gaviria Rojas,Sudnya Diamos,Greg Diamos,Lynn He,Alicia Parrish,Hannah Rose Kirk,Jessica Quaye,Charvi Rastogi,Douwe Kiela,David Jurado,David Kanter,Rafael Mosquera,Juan Ciro,Lora Aroyo,Bilge Acun,Lingjiao Chen,Mehul Smriti Raje,Max Bartolo,Sabri Eyuboglu,Amirata Ghorbani,Emmett Goodman,Oana Inel,Tariq Kane,Christine R. Kirkpatrick,Tzu-Sheng Kuo,Jonas Mueller,Tristan Thrush,Joaquin Vanschoren,Margaret Warren,Adina Williams,Serena Yeung,Newsha Ardalani,Praveen Paritosh,Lilith Bat-Leah,Ce Zhang,James Zou,Carole-Jean Wu,Cody Coleman,Andrew Ng,Peter Mattson,Vijay Janapa Reddi
Published	2023-10-13 15:24:24+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
