"cs.CV" Category Archive

Boltzmann Attention Sampling for Image Analysis with Small Objects

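Posted: March 5, 2025 | Author: jarxiv

Abstract: Detecting and segmenting small objects, such as lung nodules and tumor lesions, in image analysis … Continue reading →

Categories: cs.CV | Comments are closed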

Multimodal Deep Learning for Subtype Classification in Breast Cancer Using Histopathological Images and Gene Expression Data

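Posted: March 5, 2025 | Author: jarxiv

Abstract: Molecular subtyping of breast cancer is essential for personalized treatment and prognosis. Conventional … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed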

CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors

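Posted: March 5, 2025 | Author: jarxiv

Abstract: Monitoring and predicting in-class student activities, for understanding engagement and educational effectiveness … Continue reading →

Categories: cs.CV, cs.HC | Comments are closed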

Deepfake-Eval-2024: A Multi-Modal In-the-Wild Benchmark of Deepfakes Circulated in 2024

Posted: March 5, 2025 | Author: jarxiv

Abstract: In an era of increasingly realistic generative AI, mitigating fraud and disinformation … Continue reading →

Categories: cs.AI, cs.CV, cs.CY | Comments are closed

VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning

Posted: March 5, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs) integrate visual and textual information … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed

VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge

Posted: March 5, 2025 | Author: jarxiv

Abstract: Generalist vision language models (VLM … Continue reading →

Categories: cs.CV | Comments are closed

SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models

Posted: March 5, 2025 | Author: jarxiv

Abstract: Advancing AI in computational pathology requires large, high-quality, and diverse datasets … Continue reading →

Categories: cs.CV, eess.IV | Comments are closed

ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models

Posted: March 5, 2025 | Author: jarxiv

Abstract: Existing autoregressive (AR) image generation models use a token-by-token generation schema … Continue reading →

Categories: cs.CV | Comments are closed

A Survey on Vision-Language-Action Models for Embodied AI

Posted: March 5, 2025 | Author: jarxiv

Abstract: In embodied AI, to perform tasks in the physical world, embodied agents … Continue reading →

Categories: cs.CL, cs.CV, cs.RO | Comments are closed

3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds

Posted: March 5, 2025 | Author: jarxiv

Abstract: 3D affordance detection, with a wide range of applications across various robotic tasks … Continue reading →

Categories: cs.CV, cs.RO | Comments are closed
