Monthly Archive: March 2024

GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond

Posted: March 29, 2024 by jarxiv

Summary: For modeling 3D Gaussian Splatting (3DGS), a new …

Categories: cs.CV

Siamese Vision Transformers are Scalable Audio-visual Learners

Posted: March 29, 2024 by jarxiv

Summary: In conventional audio-visual methods, independent audio and visual backbones …

Categories: cs.CV, cs.SD, eess.AS

Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans

Posted: March 29, 2024 by jarxiv

Summary: For large-scale 3D scans of real-world scenes with easily interpretable shapes, we …

Categories: cs.CV

GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models

Posted: March 29, 2024 by jarxiv

Summary: The rapid progress of image generation models has been driven mainly by diffusion models, and text …

Categories: cs.CV

Direct Superpoints Matching for Robust Point Cloud Registration

Posted: March 29, 2024 by jarxiv

Summary: Deep neural networks, downsampled superpoints …

Categories: cs.CV

Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning

Posted: March 29, 2024 by jarxiv

Summary: For understanding natural processes and human impacts, monitoring changes in the Earth's surface is very …

Categories: cs.CV

GraspXL: Generating Grasping Motions for Diverse Objects at Scale

Posted: March 29, 2024 by jarxiv

Summary: Human hands, for example by grasping specific parts of an object or approaching it from a desired direction, …

Categories: cs.CV, cs.RO

ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models

Posted: March 29, 2024 by jarxiv

Summary: Diffusion models excel at image generation, but because they denoise step by step, generation …

Categories: cs.CV

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

Posted: March 29, 2024 by jarxiv

Summary: In image retrieval, that is, finding a target image from a reference image, there are inherently rich and …

Categories: cs.AI, cs.CL, cs.CV, cs.IR, cs.MM

InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction

Posted: March 29, 2024 by jarxiv

Summary: Text-conditioned human motion generation, extensive motion capture data …

Categories: cs.AI, cs.CV
