Category archive: cs.CV

Sim2Real in endoscopy segmentation with a novel structure aware image translation

Posted: May 6, 2025 | Author: jarxiv

Abstract: Automatic segmentation of anatomical landmarks in endoscopic images, physicians and …

Categories: cs.CV, I.2.10

Grasp the Graph (GtG) 2.0: Ensemble of GNNs for High-Precision Grasp Pose Detection in Clutter

Posted: May 6, 2025 | Author: jarxiv

Abstract: Grasp pose detection in cluttered real-world environments, with noisy and incomplete sensory data and …

Categories: cs.CV, cs.LG, cs.RO

Multimodal Deep Learning for Stroke Prediction and Detection using Retinal Imaging and Clinical Data

Posted: May 6, 2025 | Author: jarxiv

Abstract: Stroke is a major public health problem, affecting millions of people worldwide. …

Categories: cs.CV

Enhancing person re-identification via Uncertainty Feature Fusion Method and Auto-weighted Measure Combination

Posted: May 6, 2025 | Author: jarxiv

Abstract: Person re-identification (Re-ID) in surveillance systems, across different camera views …

Categories: cs.CV

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Posted: May 6, 2025 | Author: jarxiv

Abstract: Knowledge distillation (KD) for compressing large models into smaller models …

Categories: cs.CV, cs.LG

Dance of Fireworks: An Interactive Broadcast Gymnastics Training System Based on Pose Estimation

Posted: May 6, 2025 | Author: jarxiv

Abstract: In this study, by strengthening engagement with broadcast gymnastics, sedentary health …

Categories: cs.CV

Structure Causal Models and LLMs Integration in Medical Visual Question Answering

Posted: May 6, 2025 | Author: jarxiv

Abstract: Medical visual question answering (Medical Visual Question A …

Categories: cs.CV

Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery

Posted: May 6, 2025 | Author: jarxiv

Abstract: This study proposes a robust method for monocular depth scale recovery. Monocular depth estimation …

Categories: cs.CV

Multi-View Learning with Context-Guided Receptance for Image Denoising

Posted: May 6, 2025 | Author: jarxiv

Abstract: Image denoising, in low-level vision applications such as photography and autonomous driving …

Categories: cs.CV, eess.IV

A Rate-Quality Model for Learned Video Coding

Posted: May 6, 2025 | Author: jarxiv

Abstract: Learned video coding (LVC) has recently achieved superior coding performance. This paper …

Categories: cs.CV
