"cs.CV" Category Archive

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

Posted: April 14, 2025 | Author: jarxiv

Abstract: The field of vision-and-language (VL) understanding, with end-to-end large-scale pretrained V … Continue reading →

Categories: cs.CL, cs.CV

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations

Posted: April 14, 2025 | Author: jarxiv

Abstract: Visual understanding is inherently contextual: what we focus on in an image … Continue reading →

Categories: cs.CL, cs.CV, cs.LG

Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation

Posted: April 14, 2025 | Author: jarxiv

Abstract: Generating synthetic images is, for training computer vision models, a label … Continue reading →

Category: cs.CV

Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model

Posted: April 14, 2025 | Author: jarxiv

Abstract: Generating and inserting new objects into 3D content is a versatile … Continue reading →

Categories: cs.AI, cs.CV, cs.GR

ODverse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11

Posted: April 14, 2025 | Author: jarxiv

Abstract: Widely used for building real-time object detectors across a variety of domains … Continue reading →

Category: cs.CV

TACO: Adversarial Camouflage Optimization on Trucks to Fool Object Detectors

Posted: April 14, 2025 | Author: jarxiv

Abstract: Adversarial attacks, in critical applications such as autonomous vehicles and defense systems, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG

A Hybrid Fully Convolutional CNN-Transformer Model for Inherently Interpretable Medical Image Classification

Posted: April 14, 2025 | Author: jarxiv

Abstract: In many medical imaging tasks, convolutional neural networks (CNNs … Continue reading →

Categories: cs.AI, cs.CV

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Posted: April 14, 2025 | Author: jarxiv

Abstract: With a Video Variational Autoencoder (VAE), videos are … Continue reading →

Categories: cs.AI, cs.CV

Open-CD: A Comprehensive Toolbox for Change Detection

Posted: April 14, 2025 | Author: jarxiv

Abstract: We present Open-CD. Along with related components and modules, it … Continue reading →

Category: cs.CV

Multi-head Ensemble of Smoothed Classifiers for Certified Robustness

Posted: April 14, 2025 | Author: jarxiv

Abstract: Randomized smoothing (RS) is a promising technique for certified robustness, and … Continue reading →

Categories: cs.CV, cs.LG

