"cs.CV" Category Archive

Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models

Posted: December 24, 2024 | Author: jarxiv

Abstract: Foundation Vision Language Models (VL …

Categories: cs.CV, cs.RO

Editing Implicit and Explicit Representations of Radiance Fields: A Survey

Posted: December 24, 2024 | Author: jarxiv

Abstract: Neural Radiance Fields (NeRF) are compact …

Categories: cs.CV

Detail-Preserving Latent Diffusion for Stable Shadow Removal

Posted: December 24, 2024 | Author: jarxiv

Abstract: In scenes with complex global illumination, strongly generalizable, high …

Categories: cs.CV

ANID: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance

Posted: December 24, 2024 | Author: jarxiv

Abstract: In the rapidly evolving field of Artificial Intelligence Generated Content (AIGC), an important …

Categories: cs.AI, cs.CV, cs.MM

Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering

Posted: December 24, 2024 | Author: jarxiv

Abstract: Text-to-Image (TTI) generation models have achieved remarkable success …

Categories: cs.AI, cs.CV

LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding

Posted: December 24, 2024 | Author: jarxiv

Abstract: Applying Gaussian Splatting to perception tasks for 3D scene understanding …

Categories: cs.CV

SCBench: A Sports Commentary Benchmark for Video LLMs

Posted: December 24, 2024 | Author: jarxiv

Abstract: Recently, in both academia and industry, video large language models (Video LLMs) …

Categories: cs.AI, cs.CV

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Posted: December 24, 2024 | Author: jarxiv

Abstract: This work addresses unsupervised temporal action segmentation. …

Categories: cs.CV

DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder

Posted: December 24, 2024 | Author: jarxiv

Abstract: For garment-centric human generation from text or image prompts, diffusion …

Categories: cs.CV

Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions

Posted: December 24, 2024 | Author: jarxiv

Abstract: Spiking Neural Networks (SNNs) process spatiotemporal information …

Categories: cs.AI, cs.CV, cs.NE
