OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes

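Summary

Images generated by text-to-image (T2I) models often show visual biases and stereotypes for concepts such as culture and profession.
Existing quantitative measures of stereotypes rely on statistical parity, which does not match the sociological definition of a stereotype, and so they incorrectly classify biases as stereotypes.
Rather than reducing stereotypes to biases, we propose a quantitative measure of stereotypes that is consistent with the sociological definition.
We then propose OASIS to measure stereotypes in a generated dataset and to understand where they originate inside the T2I model.
OASIS includes two scores for measuring stereotypes in a generated image dataset: (M1) Stereotype Score, which measures the distributional violation of stereotypical attributes, and (M2) WALS, which measures the spectral variance of the images along a stereotypical attribute.
OASIS also includes two methods for understanding the origins of stereotypes in T2I models: (U1) StOP, which discovers the attributes that the T2I model internally associates with a given concept, and (U2) SPI, which quantifies the emergence of stereotypical attributes in the latent space of the T2I model during image generation.
Using OASIS, we conclude that, despite considerable progress in image fidelity, newer T2I models such as FLUX.1 and SDv3 hold strong stereotypical predispositions about concepts and still generate images with widespread stereotypical attributes.
In addition, the quantity of stereotypes is worse for nationalities with a lower Internet footprint.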

Summary (original)

Images generated by text-to-image (T2I) models often exhibit visual biases and stereotypes of concepts such as culture and profession. Existing quantitative measures of stereotypes are based on statistical parity that does not align with the sociological definition of stereotypes and, therefore, incorrectly categorizes biases as stereotypes. Instead of oversimplifying stereotypes as biases, we propose a quantitative measure of stereotypes that aligns with its sociological definition. We then propose OASIS to measure the stereotypes in a generated dataset and understand their origins within the T2I model. OASIS includes two scores to measure stereotypes from a generated image dataset: (M1) Stereotype Score to measure the distributional violation of stereotypical attributes, and (M2) WALS to measure spectral variance in the images along a stereotypical attribute. OASIS also includes two methods to understand the origins of stereotypes in T2I models: (U1) StOP to discover attributes that the T2I model internally associates with a given concept, and (U2) SPI to quantify the emergence of stereotypical attributes in the latent space of the T2I model during image generation. Despite the considerable progress in image fidelity, using OASIS, we conclude that newer T2I models such as FLUX.1 and SDv3 contain strong stereotypical predispositions about concepts and still generate images with widespread stereotypical attributes. Additionally, the quantity of stereotypes worsens for nationalities with lower Internet footprints.

arXiv information

Authors	Sepehr Dehdashtian, Gautam Sreekumar, Vishnu Naresh Boddeti
Published	2025-02-26 18:04:37+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
