Monthly Archive: May 2024

A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts

Posted: May 17, 2024 | Author: jarxiv

Abstract: Brain lesion segmentation plays an important role in neurological research and diagnosis …

Categories: cs.CV, eess.IV

PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology

Posted: May 17, 2024 | Author: jarxiv

Abstract: For foundation models in computational pathology, new clinical decision support systems for precision medicine and …

Categories: cs.CV, cs.LG, eess.IV

When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

Posted: May 17, 2024 | Author: jarxiv

Abstract: As large language models (LLMs) evolve, 3D spatial data (3D …

Categories: cs.CV, cs.RO

Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis

Posted: May 17, 2024 | Author: jarxiv

Abstract: Deep learning models, particularly convolutional neural networks (CNNs), skin …

Categories: cs.CV

Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features

Posted: May 17, 2024 | Author: jarxiv

Abstract: In this paper, deep neural network (DNN) learning interactions …

Categories: cs.AI, cs.CV, cs.LG

A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision

Posted: May 17, 2024 | Author: jarxiv

Abstract: In this work, our goals are twofold: large-vocabulary continuous sign language recognition (CSL …

Categories: cs.CL, cs.CV

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Posted: May 17, 2024 | Author: jarxiv

Abstract: The goal of this work is to simultaneously generate natural talking faces and speech output from text …

Categories: cs.AI, cs.CV, cs.SD, eess.AS, eess.IV

FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models

Posted: May 17, 2024 | Author: jarxiv

Abstract: Noise and caption quality, which affect vision-language contrastive pre-training, are important …

Categories: cs.AI, cs.CV

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Posted: May 17, 2024 | Author: jarxiv

Abstract: Large vision-language models fine-tuned on specialized visual instruction-following data …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Grounding DINO 1.5: Advance the ‘Edge’ of Open-Set Object Detection

Posted: May 17, 2024 | Author: jarxiv

Abstract: In this paper, IDEA Research's suite of advanced open …

Categories: cs.CV
