Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning

Abstract

Machine learning (ML) models are typically optimized for their accuracy on a given dataset. However, this predictive criterion rarely captures all desirable properties of a model, in particular how well it matches a domain expert’s understanding of a task. Underspecification refers to the existence of multiple models that are indistinguishable in their in-domain accuracy, even though they differ in other desirable properties such as out-of-distribution (OOD) performance. Identifying these situations is critical for assessing the reliability of ML models. We formalize the concept of underspecification and propose a method to identify and partially address it. We train multiple models with an independence constraint that forces them to implement different functions. They discover predictive features that are otherwise ignored by standard empirical risk minimization (ERM), which we then distill into a global model with superior OOD performance. Importantly, we constrain the models to align with the data manifold to ensure that they discover meaningful features. We demonstrate the method on multiple datasets in computer vision (collages, WILDS-Camelyon17, GQA) and discuss general implications of underspecification. Most notably, in-domain performance cannot serve for OOD model selection without additional assumptions.
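
The idea of training several models under an independence constraint can be illustrated with a toy sketch. The abstract does not state the form of the constraint, so the version below is an assumption: two linear classifiers are trained jointly, with a penalty on the squared dot product of their input gradients (for a linear model, the input gradient is the weight vector itself). The function name `train_diverse_pair`, the synthetic two-feature data, and all hyperparameters are hypothetical. The manifold alignment and distillation steps described in the abstract are not covered here.

```python
import numpy as np

def train_diverse_pair(X, y, lam=1.0, lr=0.01, steps=5000):
    """Jointly train two linear classifiers on 2-D inputs with a diversity penalty.

    Assumed objective (an illustration, not the paper's exact formulation):
        logistic_loss(w1) + logistic_loss(w2) + lam * (w1 . w2)**2
    For f(x) = w . x the input gradient is w, so the penalty discourages
    the two models from relying on the same input features.
    """
    # Small, distinct starting points so the two models are not identical.
    W = np.array([[0.1, 0.0], [0.0, 0.1]])
    for _ in range(steps):
        margins = y[:, None] * (X @ W.T)                 # shape (n, 2)
        # Gradient of mean log(1 + exp(-margin)) for each weight vector.
        coeff = -y[:, None] / (1.0 + np.exp(margins))    # shape (n, 2)
        grad = coeff.T @ X / len(y)                      # shape (2, 2)
        # Gradient of the penalty lam * (w1 . w2)**2.
        dot = W[0] @ W[1]
        grad[0] += 2.0 * lam * dot * W[1]
        grad[1] += 2.0 * lam * dot * W[0]
        W -= lr * grad
    return W

# Synthetic, underspecified task: both features predict the label equally well.
rng = np.random.default_rng(0)
n = 200
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] + 0.1 * rng.standard_normal((n, 2))

W = train_diverse_pair(X, y)
acc = ((X @ W.T) * y[:, None] > 0).mean(axis=0)
cos = W[0] @ W[1] / (np.linalg.norm(W[0]) * np.linalg.norm(W[1]))

print("weights:", W)
print("in-domain accuracy per model:", acc)
print("cosine similarity of weight vectors:", cos)
```

In this toy setting both models fit the training data while relying on different features, so in-domain accuracy alone does not distinguish them. That is the situation the abstract calls underspecification.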

arXiv information

Authors	Damien Teney, Maxime Peyrard, Ehsan Abbasnejad
Published	2022-07-06 11:20:40+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
