Category archive: cs.CV

InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models

Posted: December 19, 2024 | Author: jarxiv

Summary: Driven by multimodal large language models (MLLMs), image and … Continue reading →

Categories: cs.CV | Comments off

Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation

Posted: December 19, 2024 | Author: jarxiv

Summary: In unleashing the power of language and vision foundation models for specific tasks, prompts … Continue reading →

Categories: cs.CV | Comments off

SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation

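Posted: December 19, 2024 | Author: jarxiv

Summary: In medical video generation, through accurate and controllable visual representations, surgical understanding and pathology … Continue reading →

Categories: cs.AI, cs.CV, cs.MM, cs.RO | Comments off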

Restore Anything Model via Efficient Degradation Adaptation

Posted: December 19, 2024 | Author: jarxiv

Summary: With the proliferation of mobile devices, efficient models for restoring degraded images … Continue reading →

Categories: cs.CV | Comments off

CAD-Recode: Reverse Engineering CAD Code from Point Clouds

Posted: December 19, 2024 | Author: jarxiv

Summary: In computer-aided design (CAD) models, parametric sketches are typically … Continue reading →

Categories: cs.CV | Comments off

A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future

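Posted: December 19, 2024 | Author: jarxiv

Summary: Artificial intelligence (AI), driven by advances in computing power and the growth of large datasets, … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.MM | Comments off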

Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models

Posted: December 19, 2024 | Author: jarxiv

Summary: Foundation Vision Language Models (VL … Continue reading →

Categories: cs.CV, cs.RO | Comments off

Joint Perception and Prediction for Autonomous Driving: A Survey

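Posted: December 19, 2024 | Author: jarxiv

Summary: Perception and prediction modules are critical components of autonomous driving systems, and vehicles … Continue reading →

Categories: cs.CV, cs.RO | Comments off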

Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts

Posted: December 19, 2024 | Author: jarxiv

Summary: Advances in foundation models (FMs) have brought about a paradigm shift in machine learning … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off

Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset

Posted: December 19, 2024 | Author: jarxiv

Summary: In medical imaging, automating brain tumor segmentation with deep learning methods … Continue reading →

Categories: cs.CV, cs.LG, eess.IV | Comments off
