CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM

Abstract

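This paper aims to design a unified computer-aided design (CAD) generation system that can readily generate CAD models from user input in the form of text descriptions, images, point clouds, or any combination of them.
Toward this goal, we introduce CAD-MLLM, the first system capable of generating parametric CAD models conditioned on multimodal input.
Specifically, within the CAD-MLLM framework, we leverage the command sequences of CAD models and employ advanced large language models (LLMs) to align the feature space across these diverse multimodal data and the vectorized representations of CAD models.
To facilitate model training, we design a comprehensive data construction and annotation pipeline that equips each CAD model with corresponding multimodal data.
The resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains a text description, multi-view images, points, and a command sequence for each CAD model.
It contains approximately 450,000 instances and their CAD construction sequences.
To thoroughly evaluate the quality of the generated CAD models, we go beyond current evaluation metrics, which focus on reconstruction quality, by introducing additional metrics that assess topology quality and surface enclosure extent.
Extensive experimental results show that CAD-MLLM significantly outperforms existing conditional generation methods and remains highly robust to noise and missing points.
The project page and more visualizations are available at: https://cad-mllm.github.io/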
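
To make the idea of a "vectorized" command sequence concrete, here is a minimal Python sketch of how a sketch-and-extrude sequence might be flattened into discrete tokens that a language model could consume. The abstract does not specify the actual tokenization, so everything here is an assumption for illustration: the `Command` class, the `VOCAB` list, the 256-bin quantization, and the `to_tokens` helper are hypothetical and are not taken from CAD-MLLM.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical command vocabulary; the real system's command set may differ.
VOCAB = ["line", "arc", "circle", "extrude", "end"]
N_BINS = 256  # assumed quantization resolution


@dataclass
class Command:
    name: str                   # command type, e.g. "circle" or "extrude"
    params: Tuple[float, ...]   # continuous parameters, assumed normalized to [-1, 1]


def quantize(value: float, n_bins: int = N_BINS) -> int:
    """Map a value in [-1, 1] to an integer bin in [0, n_bins - 1]."""
    clipped = max(-1.0, min(1.0, value))
    return min(n_bins - 1, int((clipped + 1.0) / 2.0 * n_bins))


def to_tokens(sequence: List[Command]) -> List[str]:
    """Flatten a command sequence into one command token plus its quantized parameters."""
    tokens: List[str] = []
    for cmd in sequence:
        if cmd.name not in VOCAB:
            raise ValueError(f"unknown command: {cmd.name}")
        tokens.append(f"<{cmd.name}>")
        tokens.extend(str(quantize(p)) for p in cmd.params)
    return tokens


# A circle (center x, center y, radius) followed by an extrusion (distance).
model = [Command("circle", (0.0, 0.0, 0.5)), Command("extrude", (1.0,))]
print(to_tokens(model))
```

Quantizing the continuous parameters keeps the output vocabulary finite, which is one common way to let a text-oriented model emit geometry; whether CAD-MLLM does this is not stated in the abstract.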

Abstract (original)

This paper aims to design a unified Computer-Aided Design (CAD) generation system that can easily generate CAD models based on the user’s inputs in the form of textual description, images, point clouds, or even a combination of them. Towards this goal, we introduce the CAD-MLLM, the first system capable of generating parametric CAD models conditioned on the multimodal input. Specifically, within the CAD-MLLM framework, we leverage the command sequences of CAD models and then employ advanced large language models (LLMs) to align the feature space across these diverse multi-modalities data and CAD models’ vectorized representations. To facilitate the model training, we design a comprehensive data construction and annotation pipeline that equips each CAD model with corresponding multimodal data. Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains textual description, multi-view images, points, and command sequence for each CAD model. It contains approximately 450K instances and their CAD construction sequences. To thoroughly evaluate the quality of our generated CAD models, we go beyond current evaluation metrics that focus on reconstruction quality by introducing additional metrics that assess topology quality and surface enclosure extent. Extensive experimental results demonstrate that CAD-MLLM significantly outperforms existing conditional generative methods and remains highly robust to noises and missing points. The project page and more visualizations can be found at: https://cad-mllm.github.io/

arXiv information

Authors	Jingwei Xu, Chenyu Wang, Zibo Zhao, Wen Liu, Yi Ma, Shenghua Gao
Published	2024-11-07 18:31:08+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
