The Casual Conversations v2 Dataset

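Summary

This paper introduces a new, large, consent-driven dataset intended to support the evaluation of algorithmic bias and robustness in computer vision and audio speech models with respect to 11 attributes that are either self-provided or labeled by trained annotators.
The dataset contains 26,467 videos of 5,567 unique paid participants, an average of almost 5 videos per person, recorded in Brazil, India, Indonesia, Mexico, Vietnam, the Philippines, and the USA, and represents diverse demographic characteristics.
The participants agreed to have their data used for assessing the fairness of AI models and provided self-reported age, gender, language/dialect, disability status, physical adornments, physical attributes, and geo-location information.
Trained annotators labeled apparent skin tone using the Fitzpatrick Skin Type and Monk Skin Tone scales, as well as voice timbre.
Annotators also labeled the different recording setups and provided per-second activity annotations.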

Abstract (original)

This paper introduces a new large consent-driven dataset aimed at assisting in the evaluation of algorithmic bias and robustness of computer vision and audio speech models in regards to 11 attributes that are self-provided or labeled by trained annotators. The dataset includes 26,467 videos of 5,567 unique paid participants, with an average of almost 5 videos per person, recorded in Brazil, India, Indonesia, Mexico, Vietnam, Philippines, and the USA, representing diverse demographic characteristics. The participants agreed for their data to be used in assessing fairness of AI models and provided self-reported age, gender, language/dialect, disability status, physical adornments, physical attributes and geo-location information, while trained annotators labeled apparent skin tone using the Fitzpatrick Skin Type and Monk Skin Tone scales, and voice timbre. Annotators also labeled for different recording setups and per-second activity annotations.

arXiv information

Authors	Bilal Porgali,Vítor Albiero,Jordan Ryda,Cristian Canton Ferrer,Caner Hazirbas
Published	2023-03-08 19:17:05+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
