Category archive: cs.CV

ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images

Posted: January 8, 2025 | Author: jarxiv

Abstract: Advances in medical imaging technology make it possible to scan the same patient repeatedly over a long period …

Categories: cs.CV, cs.LG, eess.IV

NeuralSVG: An Implicit Representation for Text-to-Vector Generation

Posted: January 8, 2025 | Author: jarxiv

Abstract: Vector graphics are essential to design, being resolution-independent and highly …

Categories: cs.CV

RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance

Posted: January 8, 2025 | Author: jarxiv

Abstract: Retrieval-augmented generation (RAG) uses external knowledge to guide response generation, thereby …

Categories: cs.CV, cs.IR, cs.IT, cs.LG, math.IT

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Posted: January 8, 2025 | Author: jarxiv

Abstract: This work targets dense grounded understanding of both images and videos with the first …

Categories: cs.CV

Extraction Of Cumulative Blobs From Dynamic Gestures

Posted: January 8, 2025 | Author: jarxiv

Abstract: Gesture recognition allows computers to interpret human motion as commands …

Categories: 68T45, 68U10, cs.CV, H.5.2

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

Posted: January 8, 2025 | Author: jarxiv

Abstract: Following recent advances in vision-language models (VLMs), their use in autonomous driving, particularly natural …

Categories: cs.CV, cs.RO

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

Posted: January 8, 2025 | Author: jarxiv

Abstract: LiDAR data pretraining with large-scale, readily available datasets …

Categories: cs.CV, cs.LG, cs.RO

LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving

Posted: January 8, 2025 | Author: jarxiv

Abstract: Recent advances in vision foundation models (VFMs) have revolutionized 2D visual perception …

Categories: cs.CV, cs.LG, cs.RO

VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control

Posted: January 8, 2025 | Author: jarxiv

Abstract: Video generation has advanced considerably, but inserting a specific object into a video …

Categories: cs.CV

SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild

Posted: January 8, 2025 | Author: jarxiv

Abstract: Generating visual text in natural scene images poses many unsolved problems …

Categories: cs.CV
