Category archive: cs.CV

JPEG Inspired Deep Learning

Posted: October 10, 2024 | Author: jarxiv

Abstract: Traditionally, lossy image compression such as JPEG compression … deep neural networks …

Categories: cs.CV

Comprehensive Performance Evaluation of YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

Posted: October 10, 2024 | Author: jarxiv

Abstract: In this study, for the detection of green fruit in commercial orchards, YOLOv8, Y …

Categories: cs.AI, cs.CV

Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology

Posted: October 10, 2024 | Author: jarxiv

Abstract: Known as vision-language navigation (VLN), based on language instructions and visual information …

Categories: cs.CV, cs.RO

LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning

Posted: October 10, 2024 | Author: jarxiv

Abstract: Language plays an important role in the domain of human motion. Existing methods …

Categories: cs.CV

Topologically Faithful Multi-class Segmentation in Medical Images

Posted: October 10, 2024 | Author: jarxiv

Abstract: Topological accuracy in medical image segmentation … network analysis and …

Categories: cs.CV, cs.LG, eess.IV

Continual Learning: Less Forgetting, More OOD Generalization via Adaptive Contrastive Replay

Posted: October 10, 2024 | Author: jarxiv

Abstract: Machine learning models, when learning new classes, catastrophically … previously learned knowledge …

Categories: cs.CV, cs.LG

VHELM: A Holistic Evaluation of Vision Language Models

Posted: October 10, 2024 | Author: jarxiv

Abstract: Current benchmarks for evaluating vision-language models (VLMs), in many …

Categories: cs.AI, cs.CV

Personalized Visual Instruction Tuning

Posted: October 10, 2024 | Author: jarxiv

Abstract: Recent advances in multimodal large language models (MLLMs) … remarkable progress …

Categories: cs.CV

Thing2Reality: Transforming 2D Content into Conditioned Multiviews and 3D Gaussian Objects for XR Communication

Posted: October 10, 2024 | Author: jarxiv

Abstract: During remote communication, to enhance mutual understanding, participants … product designs …

Categories: cs.AI, cs.CV, cs.HC

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Posted: October 10, 2024 | Author: jarxiv

Abstract: With recent advances in generative models, remarkable capabilities in generating impressive content …

Categories: cs.CV
