UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation

Summary

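Unpaired image-to-image translation has broad applications in art, design, and scientific simulation.
One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains by coupling generative adversarial networks (GANs) with a cycle-consistency constraint, whereas more recent work promotes one-to-many mappings to increase the diversity of the translated images.
Motivated by scientific simulation and the need for one-to-one mappings, this work revisits the classic CycleGAN framework and improves its performance so that it outperforms more contemporary models without relaxing the cycle-consistency constraint.
To achieve this, the authors equip the generator with a Vision Transformer (ViT) and employ the necessary training and regularization techniques.
Compared to the previous best-performing models, their model performs better and retains a strong correlation between the original and translated images.
An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement.
To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.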
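
The cycle-consistency constraint mentioned above can be illustrated with a minimal sketch. It assumes the L1 reconstruction penalty used in the original CycleGAN formulation and replaces the neural-network generators with hypothetical toy functions (`gen_ab`, `gen_ba`, `gen_ba_bad`) acting on flattened lists of pixel values. It is not the UVCGAN implementation, and it omits the adversarial terms, the gradient penalty, and any loss weighting.

```python
from typing import Callable, List

Image = List[float]                    # toy stand-in for a flattened image
Generator = Callable[[Image], Image]   # hypothetical generator signature


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x_a: Image, x_b: Image,
                           gen_ab: Generator, gen_ba: Generator) -> float:
    """Penalize failure to recover an image after a round trip A->B->A and B->A->B."""
    rec_a = gen_ba(gen_ab(x_a))  # translate A to B, then back to A
    rec_b = gen_ab(gen_ba(x_b))  # translate B to A, then back to B
    return l1_distance(x_a, rec_a) + l1_distance(x_b, rec_b)


# Toy generators (assumptions for illustration only, not learned networks).
def gen_ab(img: Image) -> Image:
    return [2.0 * v + 1.0 for v in img]


def gen_ba(img: Image) -> Image:
    return [(v - 1.0) / 2.0 for v in img]   # exact inverse of gen_ab


def gen_ba_bad(img: Image) -> Image:
    return [v / 2.0 for v in img]           # not an inverse of gen_ab


x_a = [0.0, 0.5, 1.0]
x_b = [1.0, 3.0]

loss_consistent = cycle_consistency_loss(x_a, x_b, gen_ab, gen_ba)
loss_inconsistent = cycle_consistency_loss(x_a, x_b, gen_ab, gen_ba_bad)
print(loss_consistent, loss_inconsistent)
```

When the two generators are exact inverses the loss vanishes, which is the one-to-one behavior the constraint encourages. The mismatched pair is penalized in proportion to how far each round trip drifts from its input.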

Summary (original)

Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN that emphasizes one-to-one mappings between two unpaired image domains via generative-adversarial networks (GAN) coupled with the cycle-consistency constraint, while more recent works promote one-to-many mapping to boost diversity of the translated images. Motivated by scientific simulation and one-to-one needs, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated image. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.

arXiv information

Authors	Dmitrii Torbunov, Yi Huang, Haiwang Yu, Jin Huang, Shinjae Yoo, Meifeng Lin, Brett Viren, Yihui Ren
Published	2022-10-18 13:39:39+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
