Spuriosity Rankings: Sorting Data to Measure and Mitigate Biases

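Summary

We introduce a simple but effective method for measuring and mitigating model biases caused by reliance on spurious cues.
Rather than requiring costly changes to data or model training, the method makes better use of existing data by sorting it.
Specifically, we rank images within their classes by spuriosity (the degree to which common spurious cues are present), proxied via deep neural features of an interpretable network.
Spuriosity rankings make it easy to identify minority subpopulations (that is, low-spuriosity images) and to assess model bias as the gap in accuracy between high- and low-spuriosity images.
Finetuning the classification head on low-spuriosity images can also efficiently remove a model's bias at little cost to accuracy, so that samples are treated more fairly regardless of spuriosity.
We demonstrate the method on ImageNet, annotating $5000$ class-feature dependencies ($630$ of which turn out to be spurious) and generating a dataset of $325k$ soft segmentations for these features along the way.
After computing spuriosity rankings via the identified spurious neural features, we assess biases for $89$ diverse models and find that class-wise biases are highly correlated across models.
The results suggest that model bias due to spurious feature reliance is influenced far more by what a model is trained on than by how it is trained.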

Abstract (original)

We present a simple but effective method to measure and mitigate model biases caused by reliance on spurious cues. Instead of requiring costly changes to one’s data or model training, our method better utilizes the data one already has by sorting them. Specifically, we rank images within their classes based on spuriosity (the degree to which common spurious cues are present), proxied via deep neural features of an interpretable network. With spuriosity rankings, it is easy to identify minority subpopulations (i.e. low spuriosity images) and assess model bias as the gap in accuracy between high and low spuriosity images. One can even efficiently remove a model’s bias at little cost to accuracy by finetuning its classification head on low spuriosity images, resulting in fairer treatment of samples regardless of spuriosity. We demonstrate our method on ImageNet, annotating $5000$ class-feature dependencies ($630$ of which we find to be spurious) and generating a dataset of $325k$ soft segmentations for these features along the way. Having computed spuriosity rankings via the identified spurious neural features, we assess biases for $89$ diverse models and find that class-wise biases are highly correlated across models. Our results suggest that model bias due to spurious feature reliance is influenced far more by what the model is trained on than how it is trained.
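The ranking and bias measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-image `scores` array (standing in for the activation of a class's annotated spurious neural features), the cutoff `k`, and both function names are assumptions made for illustration, and the toy data is invented.

```python
import numpy as np

def spuriosity_rankings(scores, labels):
    """Per class, return sample indices sorted from lowest to highest spuriosity.

    scores: hypothetical per-image spuriosity proxy (higher = more spurious cues).
    labels: integer class label per image.
    """
    rankings = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)               # images of class c
        order = np.argsort(scores[idx], kind="stable")  # ascending spuriosity
        rankings[int(c)] = idx[order]
    return rankings

def accuracy_gaps(rankings, correct, k):
    """Per class, accuracy on the k most spurious images minus the k least spurious."""
    gaps = {}
    for c, ranked in rankings.items():
        low, high = ranked[:k], ranked[-k:]
        gaps[c] = float(correct[high].mean() - correct[low].mean())
    return gaps

# Toy data (invented): 8 images, 2 classes, and whether a model classified each correctly.
scores = np.array([0.9, 0.1, 0.5, 0.8, 0.2, 0.7, 0.3, 0.6])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
correct = np.array([1, 0, 1, 1, 1, 1, 0, 1], dtype=bool)

rankings = spuriosity_rankings(scores, labels)
gaps = accuracy_gaps(rankings, correct, k=2)
```

A positive gap for a class means the model is more accurate when the spurious cue is present than when it is absent. Under the same assumptions, the low-spuriosity indices (`ranked[:k]`) are the samples one would select when finetuning only the classification head.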

arXiv information

Authors	Mazda Moayeri, Wenxiao Wang, Sahil Singla, Soheil Feizi
Published	2023-10-05 17:59:06+00:00

Source and services used

arxiv.jp, Google
