Category archive: cs.CV

AI-Driven Diabetic Retinopathy Screening: Multicentric Validation of AIDRSS in India

Posted: January 13, 2025 | Author: jarxiv

Abstract: Purpose: Diabetic retinopathy (DR) is a leading cause of vision loss, particularly in India …

Categories: cs.AI, cs.CV, eess.IV

Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine

Posted: January 13, 2025 | Author: jarxiv

Abstract: In recent years, multimodal large language models (MLLMs) have made remarkable progress, …

Categories: cs.AI, cs.CV

MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning

Posted: January 13, 2025 | Author: jarxiv

Abstract: Image captioning is an important task at the intersection of computer vision and natural language processing …

Categories: cs.AI, cs.CV

Long Story Short: Story-level Video Understanding from 20K Short Films

Posted: January 13, 2025 | Author: jarxiv

Abstract: Recent developments in vision-language models have significantly advanced video understanding. However, …

Categories: cs.AI, cs.CL, cs.CV

VLM-driven Behavior Tree for Context-aware Task Planning

Posted: January 13, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) for generating behavior trees (BTs) …

Categories: cs.AI, cs.CV, cs.HC, cs.RO

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Posted: January 13, 2025 | Author: jarxiv

Abstract: Retrieval-augmented generation (RAG) retrieves external knowledge relevant to a query and brings it into the generation …

Categories: cs.AI, cs.CL, cs.CV, cs.IR, cs.LG

Gender Bias in Text-to-Video Generation Models: A case study of Sora

Posted: January 13, 2025 | Author: jarxiv

Abstract: The advent of text-to-video generation models, turning text prompts into high-quality …

Categories: cs.AI, cs.CV, cs.CY, cs.LG

EDNet: Edge-Optimized Small Target Detection in UAV Imagery — Faster Context Attention, Better Feature Fusion, and Hardware Acceleration

Posted: January 13, 2025 | Author: jarxiv

Abstract: Because of low resolution, complex backgrounds, and dynamic scenes, small targets in drone imagery …

Categories: cs.AI, cs.CV, cs.LG

Backdoor Attacks against No-Reference Image Quality Assessment Models via a Scalable Trigger

Posted: January 13, 2025 | Author: jarxiv

Abstract: No-reference image quality assessment (NR-IQA), working from a single input image without a reference, …

Categories: cs.CR, cs.CV

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Posted: January 13, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs) are proficient at general tasks …

Categories: cs.CL, cs.CV
