"cs.CV" Category Archive

DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting

Posted on December 2, 2024 by jarxiv

Abstract: Owing to recent advances in scene synthesis, rendering-objective-based hyperpri …

Categories: cs.CV

SIMS: Simulating Human-Scene Interactions with Real World Script Planning

Posted on December 2, 2024 by jarxiv

Abstract: Simulating long-term human-scene interactions …

Categories: cs.AI, cs.CL, cs.CV, cs.GR

On Domain-Specific Post-Training for Multimodal Large Language Models

Posted on December 2, 2024 by jarxiv

Abstract: In recent years, the rapid development of general multimodal large language models (MLLMs) …

Categories: cs.CL, cs.CV, cs.LG

MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds

Posted on December 2, 2024 by jarxiv

Abstract: Novel views of dynamic scenes from monocular videos casually captured in the wild …

Categories: cs.CV, cs.GR

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Posted on December 2, 2024 by jarxiv

Abstract: Safety concerns about multimodal large language models (MLLMs) …

Categories: cs.AI, cs.CL, cs.CR, cs.CV

Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark

Posted on December 2, 2024 by jarxiv

Abstract: Following the success of the 2023 edition, benchmarking and measuring state-of-the-art video models …

Categories: cs.CL, cs.CV, cs.LG

Free-form Generation Enhances Challenging Clothed Human Modeling

Posted on December 2, 2024 by jarxiv

Abstract: To achieve realistic animated human avatars, pose-dependent clothing …

Categories: cs.CV, cs.GR, cs.LG

Reanimating Images using Neural Representations of Dynamic Stimuli

Posted on December 2, 2024 by jarxiv

Abstract: Computer vision models have made remarkable progress in static image recognition …

Categories: cs.AI, cs.CV, q-bio.NC

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Posted on December 2, 2024 by jarxiv

Abstract: Recent advances in dataset distillation have led to solutions in two main directions …

Categories: cs.AI, cs.CV, cs.LG

AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

Posted on December 2, 2024 by jarxiv

Abstract: AlphaTablets, featuring continuous 3D surfaces and precise boundary delineation …

Categories: cs.CV, cs.LG
