"cs.CV" Category Archive

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Posted: October 23, 2024 | Author: jarxiv

Summary: Vision-language models (VLMs), recent visual question … that evaluate complex vision-language reasoning …

Categories: cs.CL, cs.CV

AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

Posted: October 23, 2024 | Author: jarxiv

Summary: Understanding the behavior of non-human primates, improving animal welfare, social behavior …

Categories: cs.CV

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Posted: October 23, 2024 | Author: jarxiv

Summary: Visual data, from small icons of just a few pixels to long … spanning several hours …

Categories: cs.CV

YOLO-TS: Real-Time Traffic Sign Detection with Enhanced Accuracy Using Optimized Receptive Fields and Anchor-Free Fusion

Posted: October 23, 2024 | Author: jarxiv

Summary: Ensuring safety in both autonomous driving and advanced driver-assistance systems (ADAS) …

Categories: cs.CV

LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging

Posted: October 23, 2024 | Author: jarxiv

Summary: Across a variety of tasks, large pre-trained models, excellent zero …

Categories: cs.CV, cs.LG

Are Visual-Language Models Effective in Action Recognition? A Comparative Study

Posted: October 23, 2024 | Author: jarxiv

Summary: Current vision-language foundation models such as CLIP, recently, various downstream tasks …

Categories: cs.CV

KANICE: Kolmogorov-Arnold Networks with Interactive Convolutional Elements

Posted: October 23, 2024 | Author: jarxiv

Summary: Convolutional neural networks (CNNs) and Kolmogorov-Arnold …

Categories: cs.AI, cs.CV

AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results

Posted: October 23, 2024 | Author: jarxiv

Summary: Video quality assessment (VQA), directly affecting the viewer's experience …

Categories: cs.CV, cs.MM, eess.IV

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Posted: October 23, 2024 | Author: jarxiv

Summary: Dataset distillation, CIFAR, MNIST, TinyImageNet …

Categories: cs.AI, cs.CV

EPContrast: Effective Point-level Contrastive Learning for Large-scale Point Cloud Understanding

Posted: October 23, 2024 | Author: jarxiv

Summary: Acquiring inductive bias through point-level contrastive learning, in point cloud pre-training …

Categories: cs.CV
