Monthly Archives: January 2024

Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation

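Posted on: January 1, 2024 | Author: jarxiv

Summary: In visual grounding, the visual information of a specific image region and the corresponding natural language expression … Continue reading →

Categories: cs.AI, cs.CV | Comments closed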

Multi-stage feature decorrelation constraints for improving CNN classification performance

Posted on: January 1, 2024 | Author: jarxiv

Summary: In the case of convolutional neural networks (CNNs) used for pattern classification … Continue reading →

Categories: cs.CV | Comments closed

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

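Posted on: January 1, 2024 | Author: jarxiv

Summary: In contrast to the extensive research on general vision, for scalable visual autonomous driving … Continue reading →

Categories: cs.CV | Comments closed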

Revealing the Underlying Patterns: Investigating Dataset Similarity, Performance, and Generalization

Posted on: January 1, 2024 | Author: jarxiv

Summary: For supervised deep learning models, achieving acceptable performance on a specific task … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models

Posted on: January 1, 2024 | Author: jarxiv

Summary: OpenAI's GPT-4V(ision) and other multimodal large language … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments closed

CoreDeep: Improving Crack Detection Algorithms Using Width Stochasticity

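Posted on: January 1, 2024 | Author: jarxiv

Summary: When cracks in images are automatically detected or segmented, the costs of maintenance and operation … Continue reading →

Categories: cs.AI, cs.CV | Comments closed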

Shape-IoU: More Accurate Metric considering Bounding Box Shape and Scale

Posted on: January 1, 2024 | Author: jarxiv

Summary: Bounding box regression loss, as an important component of the detector's localization branch, … Continue reading →

Categories: cs.CV | Comments closed

Exploring Iterative Refinement with Diffusion Models for Video Grounding

Posted on: January 1, 2024 | Author: jarxiv

Summary: In video grounding, corresponding to a given sentence query, an untrimmed … Continue reading →

Categories: cs.CV | Comments closed

Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning

Posted on: January 1, 2024 | Author: jarxiv

Summary: Vision-language models (VLMs), reasoning based on commonsense knowledge as a human would, … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA

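Posted on: January 1, 2024 | Author: jarxiv

Summary: The Circle of Willis (CoW) is an important arterial network connecting the major circulations of the brain … Continue reading →

Categories: cs.CV, cs.LG, q-bio.QM, q-bio.TO | Comments closed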
