Text-to-Vector Generation with Neural Path Representation

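Summary

Vector graphics are widely used in digital art, and designers favor them for their scalability and layer-wise structure.
However, creating and editing vector graphics takes creativity and design expertise, which makes it time-consuming.
Recent advances in text-to-vector (T2V) generation aim to make this process more accessible.
Existing T2V methods, however, directly optimize the control points of vector graphics paths, and because no geometric constraints are imposed, the resulting paths often intersect or become jagged.
To overcome these limitations, the authors propose a new neural path representation, built by designing a dual-branch Variational Autoencoder (VAE) that learns a path latent space from both the sequence and image modalities.
Optimizing a combination of neural paths makes it possible to incorporate geometric constraints while preserving the expressivity of the generated SVGs.
In addition, they introduce a two-stage path optimization method to improve the visual and topological quality of the generated SVGs.
In the first stage, a pre-trained text-to-image diffusion model guides the initial generation of complex vector graphics through the Variational Score Distillation (VSD) process.
In the second stage, the graphics are refined with a layer-wise image vectorization strategy to obtain clearer elements and structure.
The authors demonstrate the effectiveness of the method through extensive experiments and showcase various applications.
The project page is https://intchous.github.io/T2V-NPR.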

Abstract (original)

Vector graphics are widely used in digital art and highly favored by designers due to their scalability and layer-wise properties. However, the process of creating and editing vector graphics requires creativity and design expertise, making it a time-consuming task. Recent advancements in text-to-vector (T2V) generation have aimed to make this process more accessible. However, existing T2V methods directly optimize control points of vector graphics paths, often resulting in intersecting or jagged paths due to the lack of geometry constraints. To overcome these limitations, we propose a novel neural path representation by designing a dual-branch Variational Autoencoder (VAE) that learns the path latent space from both sequence and image modalities. By optimizing the combination of neural paths, we can incorporate geometric constraints while preserving expressivity in generated SVGs. Furthermore, we introduce a two-stage path optimization method to improve the visual and topological quality of generated SVGs. In the first stage, a pre-trained text-to-image diffusion model guides the initial generation of complex vector graphics through the Variational Score Distillation (VSD) process. In the second stage, we refine the graphics using a layer-wise image vectorization strategy to achieve clearer elements and structure. We demonstrate the effectiveness of our method through extensive experiments and showcase various applications. The project page is https://intchous.github.io/T2V-NPR.
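
The abstract contrasts the proposed neural path representation with methods that optimize path control points directly. As a minimal illustration of what "control points" means here, the sketch below evaluates a single cubic Bézier segment (the curve type used by SVG path commands) from four control points. It is not taken from the paper or its code; the names `cubic_bezier` and `sample_path` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier segment at parameter t in [0, 1] (Bernstein form)."""
    u = 1.0 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_path(control_points: Sequence[Point], n: int = 16) -> List[Point]:
    """Sample n evenly spaced parameter values along one segment (n >= 2)."""
    p0, p1, p2, p3 = control_points
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]


# Hypothetical control points for illustration only.
segment = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
polyline = sample_path(segment, n=5)
```

Because every control point is a free variable, a method that optimizes these coordinates directly has nothing preventing neighboring segments from crossing or kinking, which is the failure mode the abstract attributes to the lack of geometric constraints.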

arXiv information

Authors	Peiying Zhang, Nanxuan Zhao, Jing Liao
Published	2024-05-16 17:59:22+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
