Monthly Archives: March 2024

MeaCap: Memory-Augmented Zero-shot Image Captioning

Posted: March 7, 2024 | Author: jarxiv

Abstract: Zero-shot image captioning without properly paired image-text data …

Categories: cs.CV

Transformer-based nowcasting of radar composites from satellite images for severe weather

Posted: March 7, 2024 | Author: jarxiv

Abstract: Weather radar data are critical for nowcasting, and numerical weather prediction …

Categories: cs.CV, cs.LG, eess.IV, physics.ao-ph

Robust Quantification of Percent Emphysema on CT via Domain Attention: the Multi-Ethnic Study of Atherosclerosis (MESA) Lung Study

Posted: March 7, 2024 | Author: jarxiv

Abstract: For robust quantification of emphysema by computed tomography (CT), various …

Categories: cs.CV

Multimodal Transformer for Comics Text-Cloze

Posted: March 7, 2024 | Author: jarxiv

Abstract: In this work, comics, a medium in which visual and textual elements are intricately intertwined, …

Categories: cs.CV

CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection

Posted: March 7, 2024 | Author: jarxiv

Abstract: Recent LiDAR-based 3D object detection (3DOD) methods show promising …

Categories: cs.CV, I.2.10

Bridging Diversity and Uncertainty in Active learning with Self-Supervised Pre-Training

Posted: March 7, 2024 | Author: jarxiv

Abstract: This study, particularly in the context of self-supervised pre-trained models, …

Categories: cs.AI, cs.CV, cs.LG

Learning 3D object-centric representation through prediction

Posted: March 7, 2024 | Author: jarxiv

Abstract: As part of human core knowledge, the representation of objects, high-level concepts and symbolic …

Categories: cs.AI, cs.CV, cs.LG, I.2.10

Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models

Posted: March 7, 2024 | Author: jarxiv

Abstract: Because medical experts make decisions based on a set of visual patterns of the lesion, concept-based …

Categories: cs.CV

Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer

Posted: March 7, 2024 | Author: jarxiv

Abstract: Recent advances in generative compression techniques have significantly improved the perceptual quality of compressed data. …

Categories: cs.CV, cs.LG, eess.IV

Self-supervised Photographic Image Layout Representation Learning

Posted: March 7, 2024 | Author: jarxiv

Abstract: In the domain of image layout representation learning, converting image layouts into concise vector forms …

Categories: cs.CV, cs.MM
