Monthly Archives: March 2025

MTV-Inpaint: Multi-Task Long Video Inpainting

Posted on March 17, 2025 by jarxiv

Summary: Video inpainting modifies local regions within a video, with spatial and temporal consistency …

Categories: cs.CV

Category Prompt Mamba Network for Nuclei Segmentation and Classification

Posted on March 17, 2025 by jarxiv

Summary: Nuclei segmentation and classification provide an essential foundation for tumor immune microenvironment analysis …

Categories: cs.CV

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

Posted on March 17, 2025 by jarxiv

Summary: In long-form video understanding, the high redundancy of video data and information related to the query …

Categories: cs.AI, cs.CL, cs.CV

AQUA-SLAM: Tightly-Coupled Underwater Acoustic-Visual-Inertial SLAM with Sensor Calibration

Posted on March 17, 2025 by jarxiv

Summary: Underwater environments, because of limited visibility, inadequate lighting, and the structural features of images …

Categories: cs.CV, cs.RO

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

Posted on March 17, 2025 by jarxiv

Summary: Existing datasets for task-oriented hand-object interaction video generation …

Categories: cs.CV, cs.RO

Aligning First, Then Fusing: A Novel Weakly Supervised Multimodal Violence Detection Method

Posted on March 17, 2025 by jarxiv

Summary: Weakly supervised violence detection refers to using only video-level labels for a video's …

Categories: cs.CV

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Posted on March 17, 2025 by jarxiv

Summary: Cell instance segmentation (CIS) of individual cells in histopathological images …

Categories: cs.CV

Multi-modal Vision Pre-training for Medical Image Analysis

Posted on March 17, 2025 by jarxiv

Summary: Self-supervised learning, by reducing the training data requirements of real-world applications …

Categories: cs.AI, cs.CV, cs.LG

Remote Photoplethysmography in Real-World and Extreme Lighting Scenarios

Posted on March 17, 2025 by jarxiv

Summary: Physiological activity can be revealed by subtle changes in facial imaging …

Categories: cs.CV

Visual Adaptive Prompting for Compositional Zero-Shot Learning

Posted on March 17, 2025 by jarxiv

Summary: Vision-Language Models (VLMs), with visual data and text …

Categories: cs.CV, cs.LG
