AQuA: A Benchmarking Tool for Label Quality Assessment

Summary

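Machine learning (ML) models are only as good as the data they are trained on.
However, recent studies have found that datasets widely used to train and evaluate ML models, such as ImageNet, contain pervasive labeling errors.
Incorrect labels in the training set impair an ML model's ability to generalize, and they affect evaluation and model selection on the test set.
Learning in the presence of labeling errors is therefore an active area of research, yet the field lacks a comprehensive benchmark for evaluating these methods.
Most of these methods are evaluated on only a few computer vision datasets, with significant variation in experimental protocols.
With so many methods and such inconsistent evaluation, it is also unclear how ML practitioners can choose the right models to assess the label quality of their data.
To address this, we propose AQuA, a benchmarking environment for rigorously evaluating methods that enable machine learning in the presence of label noise.
We also introduce a design space that delineates the concrete design choices of label error detection models.
We hope that the proposed design space and benchmark will enable practitioners to choose the right tools to improve their label quality, and that the benchmark will enable objective and rigorous evaluation of machine learning tools facing mislabeled data.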

Summary (original)

Machine learning (ML) models are only as good as the data they are trained on. But recent studies have found datasets widely used to train and evaluate ML models, e.g. ImageNet, to have pervasive labeling errors. Erroneous labels on the train set hurt ML models’ ability to generalize, and they impact evaluation and model selection using the test set. Consequently, learning in the presence of labeling errors is an active area of research, yet this field lacks a comprehensive benchmark to evaluate these methods. Most of these methods are evaluated on a few computer vision datasets with significant variance in the experimental protocols. With such a large pool of methods and inconsistent evaluation, it is also unclear how ML practitioners can choose the right models to assess label quality in their data. To this end, we propose a benchmarking environment AQuA to rigorously evaluate methods that enable machine learning in the presence of label noise. We also introduce a design space to delineate concrete design choices of label error detection models. We hope that our proposed design space and benchmark enable practitioners to choose the right tools to improve their label quality and that our benchmark enables objective and rigorous evaluation of machine learning tools facing mislabeled data.

arXiv information

Authors	Mononito Goswami, Vedant Sanil, Arjun Choudhry, Arvind Srinivasan, Chalisa Udompanyawit, Artur Dubrawski
Published	2024-01-16 11:10:28+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
