Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

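Summary

Scalable Vector Graphics (SVG) is the de facto standard for vector graphics in digital design: it is resolution-independent and allows precise control over individual elements.
Even so, producing high-quality SVG content remains difficult, because it requires technical expertise with professional editing software and a considerable investment of time to create complex shapes.
Recent text-to-SVG generation methods aim to make vector graphics creation more accessible, but they remain limited in shape regularity, generalization ability, and expressiveness.
To address these challenges, we introduce Chat2SVG, a hybrid framework for text-to-SVG generation that combines the strengths of large language models (LLMs) and image diffusion models.
Our approach first uses an LLM to generate semantically meaningful SVG templates from basic geometric primitives.
Guided by image diffusion models, a dual-stage optimization pipeline then refines the paths in latent space and adjusts point coordinates to increase geometric complexity.
Extensive experiments show that Chat2SVG outperforms existing methods in visual fidelity, path regularity, and semantic alignment.
In addition, our system supports intuitive editing through natural-language instructions, making professional vector graphics creation accessible to all users.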

Abstract (original)

Scalable Vector Graphics (SVG) has become the de facto standard for vector graphics in digital design, offering resolution independence and precise control over individual elements. Despite their advantages, creating high-quality SVG content remains challenging, as it demands technical expertise with professional editing software and a considerable time investment to craft complex shapes. Recent text-to-SVG generation methods aim to make vector graphics creation more accessible, but they still encounter limitations in shape regularity, generalization ability, and expressiveness. To address these challenges, we introduce Chat2SVG, a hybrid framework that combines the strengths of Large Language Models (LLMs) and image diffusion models for text-to-SVG generation. Our approach first uses an LLM to generate semantically meaningful SVG templates from basic geometric primitives. Guided by image diffusion models, a dual-stage optimization pipeline refines paths in latent space and adjusts point coordinates to enhance geometric complexity. Extensive experiments show that Chat2SVG outperforms existing methods in visual fidelity, path regularity, and semantic alignment. Additionally, our system enables intuitive editing through natural language instructions, making professional vector graphics creation accessible to all users.
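
As a rough illustration of what "an SVG template from basic geometric primitives" could look like, the sketch below assembles a small SVG document from a list of primitives using only the Python standard library. It is not the authors' implementation: the `build_template` helper, the "red apple" prompt, and every shape, id, and color are hypothetical, and in the described pipeline an LLM would produce such a template, which the optimization stages would then refine.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_template(primitives, size=512):
    """Assemble an SVG string from (tag, attributes) pairs.

    Hypothetical helper for illustration only; not part of Chat2SVG.
    """
    # Serialize SVG elements without an "ns0:" prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"viewBox": f"0 0 {size} {size}", "width": str(size), "height": str(size)},
    )
    for tag, attrs in primitives:
        # Attribute values must be strings in ElementTree.
        ET.SubElement(root, f"{{{SVG_NS}}}{tag}", {k: str(v) for k, v in attrs.items()})
    return ET.tostring(root, encoding="unicode")


# Assumed template for the prompt "a red apple": each semantic part
# is a single basic primitive with a descriptive id.
apple = [
    ("ellipse", {"id": "body", "cx": 256, "cy": 300, "rx": 150, "ry": 160, "fill": "#d62828"}),
    ("rect", {"id": "stem", "x": 246, "y": 110, "width": 20, "height": 60, "fill": "#6b4226"}),
    ("ellipse", {"id": "leaf", "cx": 310, "cy": 130, "rx": 45, "ry": 18, "fill": "#2a9d8f"}),
]

svg_text = build_template(apple)
print(svg_text)
```

Giving each primitive a semantic id keeps the template easy to edit by name, which is the kind of element-level control the abstract attributes to SVG.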

arXiv information

Authors	Ronghuan Wu, Wanchao Su, Jing Liao
Published	2024-11-25 17:31:57+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
