EasyPortrait — Face Parsing and Portrait Segmentation Dataset

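Summary

Video conferencing apps have recently gained computer-vision features such as real-time background removal and face beautification.
Existing portrait segmentation and face parsing datasets offer limited variability in head poses, ethnicity, scenes, and the occlusions typical of video conferencing, which motivated the authors to create EasyPortrait, a new dataset that serves both tasks at once.
It contains 40,000 primarily indoor photos that replicate video meeting scenarios, collected from 13,705 unique users, with fine-grained segmentation masks divided into 9 classes.
Inappropriate annotation masks in other datasets led the authors to revise their annotator guidelines, so that EasyPortrait supports use cases such as teeth whitening and skin smoothing.
The paper also proposes a pipeline for data mining and high-quality mask annotation via crowdsourcing.
Ablation experiments show that both the quantity of data and the diversity of head poses in the dataset matter for training a model effectively.
Cross-dataset evaluation confirmed that EasyPortrait gives the best domain generalization ability among portrait segmentation datasets.
The authors further show that segmentation models can be trained on EasyPortrait simply, without extra training tricks.
The proposed dataset and trained models are publicly available.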
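
Because a single 9-class mask is meant to serve both face parsing and portrait segmentation, the relationship between the two tasks can be illustrated with a minimal sketch. It assumes the masks are integer label maps and that label 0 is the background class; neither detail is stated in the abstract, and the function names are hypothetical, so check the dataset's own documentation before relying on them.

```python
import numpy as np

# Assumptions (not from the abstract): masks are integer label maps,
# label 0 is background, and labels 1..8 are the person/face classes.
BACKGROUND_ID = 0
NUM_CLASSES = 9


def to_portrait_mask(parsing_mask: np.ndarray) -> np.ndarray:
    """Collapse a multi-class parsing mask into a binary person/background mask."""
    return (parsing_mask != BACKGROUND_ID).astype(np.uint8)


def class_pixel_counts(parsing_mask: np.ndarray) -> np.ndarray:
    """Count the pixels of each class, padded to NUM_CLASSES entries."""
    return np.bincount(parsing_mask.ravel(), minlength=NUM_CLASSES)


if __name__ == "__main__":
    # Tiny hand-made mask standing in for a real annotation.
    mask = np.array([[0, 1, 2],
                     [0, 8, 0]], dtype=np.uint8)
    print(to_portrait_mask(mask))
    print(class_pixel_counts(mask))
```

The binary mask is the kind of target a background-removal model would use, while the per-class counts give a quick check of how well each fine-grained class is represented in a sample.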

Abstract (original)

Recently, video conferencing apps have become functional by accomplishing such computer vision-based features as real-time background removal and face beautification. Limited variability in existing portrait segmentation and face parsing datasets, including head poses, ethnicity, scenes, and occlusions specific to video conferencing, motivated us to create a new dataset, EasyPortrait, for these tasks simultaneously. It contains 40,000 primarily indoor photos repeating video meeting scenarios with 13,705 unique users and fine-grained segmentation masks separated into 9 classes. Inappropriate annotation masks from other datasets caused a revision of annotator guidelines, resulting in EasyPortrait’s ability to process cases, such as teeth whitening and skin smoothing. The pipeline for data mining and high-quality mask annotation via crowdsourcing is also proposed in this paper. In the ablation study experiments, we proved the importance of data quantity and diversity in head poses in our dataset for the effective learning of the model. The cross-dataset evaluation experiments confirmed the best domain generalization ability among portrait segmentation datasets. Moreover, we demonstrate the simplicity of training segmentation models on EasyPortrait without extra training tricks. The proposed dataset and trained models are publicly available.

arXiv information

Authors	Karina Kvanchiani, Elizaveta Petrova, Karen Efremyan, Alexander Sautin, Alexander Kapitanov
Published	2024-03-07 15:34:00+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
