Category archive: "cs.CV"

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Existing datasets for task-oriented hand-object interaction video generation … Continue reading →

Categories: cs.CV, cs.RO | Comments are closed

Aligning First, Then Fusing: A Novel Weakly Supervised Multimodal Violence Detection Method

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Weakly supervised violence detection refers to using only video-level labels to … Continue reading →

Categories: cs.CV | Comments are closed

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Posted on: March 17, 2025 | Author: jarxiv

Abstract: In cell instance segmentation (CIS), individual cells in histopathological images … Continue reading →

Categories: cs.CV | Comments are closed

Multi-modal Vision Pre-training for Medical Image Analysis

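Posted on: March 17, 2025 | Author: jarxiv

Abstract: Self-supervised learning, by reducing the training data requirements of real-world applications, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments are closed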

Remote Photoplethysmography in Real-World and Extreme Lighting Scenarios

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Physiological activity can be revealed by subtle changes in facial imaging … Continue reading →

Categories: cs.CV | Comments are closed

Visual Adaptive Prompting for Compositional Zero-Shot Learning

Posted on: March 17, 2025 | Author: jarxiv

Abstract: For Vision-Language Models (VLMs), visual data and … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Recent text-to-image generative models have achieved impressive performance … Continue reading →

Categories: cs.CV | Comments are closed

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning

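Posted on: March 17, 2025 | Author: jarxiv

Abstract: Humans process video reasoning with a sequential spatio-temporal reasoning logic, first … the relevant frames … Continue reading →

Categories: cs.CV | Comments are closed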

Cognitive Disentanglement for Referring Multi-Object Tracking

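Posted on: March 17, 2025 | Author: jarxiv

Abstract: As an important application of multi-source information fusion in intelligent transportation perception systems, … Continue reading →

Categories: cs.CV | Comments are closed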

Cloud2BIM: An open-source automatic pipeline for efficient conversion of large-scale point clouds into IFC format

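Posted on: March 17, 2025 | Author: jarxiv

Abstract: Building Information Modeling (BIM), for the sustainable reconstruction and revitalization of aging structures, … Continue reading →

Categories: cs.CV, cs.SE | Comments are closed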
