"cs.CV" Category Archive

Integrating Features for Recognizing Human Activities through Optimized Parameters in Graph Convolutional Networks and Transformer Architectures

Posted on: August 30, 2024 | Author: jarxiv

Summary: Regarding human activity recognition, computer vision, machine vision, deep …

Categories: cs.AI, cs.CV, cs.RO

CogVLM2: Visual Language Models for Image and Video Understanding

Posted on: August 30, 2024 | Author: jarxiv

Summary: Beginning with VisualGLM and CogVLM, enhanced vision-language fusion, …

Categories: cs.CV

UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation

Posted on: August 30, 2024 | Author: jarxiv

Summary: Reliably detecting and geolocating objects of various classes in soft real time …

Categories: cs.CV, cs.RO

Locally Grouped and Scale-Guided Attention for Dense Pest Counting

Posted on: August 30, 2024 | Author: jarxiv

Summary: In this study, to predict densely distributed pests captured by digital traps …

Categories: cs.CV

A Simple and Generalist Approach for Panoptic Segmentation

Posted on: August 30, 2024 | Author: jarxiv

Summary: Generalist vision models address a variety of vision tasks …

Categories: cs.CV

Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation

Posted on: August 30, 2024 | Author: jarxiv

Summary: Character animation, in computer graphics and vision, …

Categories: cs.CV

Learning to Detect and Segment for Open Vocabulary Object Detection

Posted on: August 30, 2024 | Author: jarxiv

Summary: Open-vocabulary object detection, vision-language pretrained …

Categories: cs.CV

VideoMambaPro: A Leap Forward for Mamba in Video Understanding

Posted on: August 30, 2024 | Author: jarxiv

Summary: Understanding video requires extracting rich spatiotemporal representations. This …

Categories: cs.CV

On Feasibility of Intent Obfuscating Attacks

Posted on: August 30, 2024 | Author: jarxiv

Summary: Intent obfuscation is a common tactic in adversarial situations, where an attacker, a target …

Categories: cs.CR, cs.CV

Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment

Posted on: August 30, 2024 | Author: jarxiv

Summary: Label-efficient segmentation, with training on sparse and limited grou …

Categories: cs.CV
