Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

Summary

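Any-scale image synthesis offers an efficient, scalable way to synthesize photo-realistic images at arbitrary scales, even beyond 2K resolution.
However, existing GAN-based solutions rely too heavily on convolutions and hierarchical architectures, which cause inconsistency and the "texture sticking" problem when the output resolution is scaled.
INR-based generators, by contrast, are scale-equivariant by design, but their huge memory footprint and slow inference prevent their adoption in large-scale or real-time systems.
This work proposes $\textbf{C}$olumn-$\textbf{R}$ow $\textbf{E}$ntangled $\textbf{P}$ixel $\textbf{S}$ynthesis ($\textbf{CREPS}$), a new generative model that is both efficient and scale-equivariant without using spatial convolutions or a coarse-to-fine design.
To reduce the memory footprint and make the system scalable, it employs a novel bi-line representation that decomposes layer-wise feature maps into separate "thick" column and row encodings.
Experiments on various datasets, including FFHQ, LSUN-Church, MetFaces, and Flickr-Scenery, confirm that CREPS can synthesize scale-consistent, alias-free images at arbitrary resolutions with reasonable training and inference speed.
Code is available at https://github.com/VinAIResearch/CREPS.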
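
To make the memory argument behind the bi-line representation concrete, here is a minimal NumPy sketch. It assumes that a feature map can be composed per channel as a product of a "thick" row encoding and a "thick" column encoding. The shapes, the thickness D, and the function name compose_biline are illustrative assumptions, not the authors' implementation; the actual code is in the repository linked above.

```python
import numpy as np


def compose_biline(col_enc, row_enc):
    """Compose a dense feature map from thick column and row encodings.

    Hypothetical layout (an assumption for illustration):
      col_enc: (C, D, W), a D-thick encoding for each column position
      row_enc: (C, H, D), a D-thick encoding for each row position
    Returns an array of shape (C, H, W).
    """
    # For each channel, an (H, D) @ (D, W) product entangles rows and columns.
    return np.matmul(row_enc, col_enc)


rng = np.random.default_rng(0)
C, D, H, W = 8, 4, 64, 64  # channels, thickness, height, width (all assumed)
col = rng.standard_normal((C, D, W))
row = rng.standard_normal((C, H, D))
feat = compose_biline(col, row)

# Values stored in the two encodings versus the dense feature map.
stored = col.size + row.size
dense = feat.size
print(feat.shape, stored, dense)
```

With these assumed sizes the two encodings hold C·D·(H + W) values while the dense map holds C·H·W, so the stored representation grows linearly with resolution rather than quadratically.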

Summary (original)

Any-scale image synthesis offers an efficient and scalable solution to synthesize photo-realistic images at any scale, even going beyond 2K resolution. However, existing GAN-based solutions depend excessively on convolutions and a hierarchical architecture, which introduce inconsistency and the "texture sticking" issue when scaling the output resolution. From another perspective, INR-based generators are scale-equivariant by design, but their huge memory footprint and slow inference hinder these networks from being adopted in large-scale or real-time systems. In this work, we propose $\textbf{C}$olumn-$\textbf{R}$ow $\textbf{E}$ntangled $\textbf{P}$ixel $\textbf{S}$ynthesis ($\textbf{CREPS}$), a new generative model that is both efficient and scale-equivariant without using any spatial convolutions or coarse-to-fine design. To save memory footprint and make the system scalable, we employ a novel bi-line representation that decomposes layer-wise feature maps into separate "thick" column and row encodings. Experiments on various datasets, including FFHQ, LSUN-Church, MetFaces, and Flickr-Scenery, confirm CREPS’ ability to synthesize scale-consistent and alias-free images at any arbitrary resolution with proper training and inference speed. Code is available at https://github.com/VinAIResearch/CREPS.

arXiv information

Authors	Thuan Hoang Nguyen, Thanh Van Le, Anh Tran
Published	2023-03-24 17:12:38+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
