Monthly Archives: June 2024

CarLLaVA: Vision language models for camera-only closed-loop driving

Posted on June 17, 2024 by jarxiv

Abstract: In this technical report, developed for the CARLA Autonomous Driving Challenge 2.0 …

Categories: cs.CV, cs.RO

4DRecons: 4D Neural Implicit Deformable Objects Reconstruction from a single RGB-D Camera with Geometrical and Topological Regularizations

Posted on June 17, 2024 by jarxiv

Abstract: In this paper, taking as input a single-camera RGB-D sequence of a dynamic subject …

Categories: cs.CV

Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights

Posted on June 17, 2024 by jarxiv

Abstract: Web-scale vision-language datasets naturally exhibit severe data …

Categories: cs.CL, cs.CV, cs.LG

Enhancing Incomplete Multi-modal Brain Tumor Segmentation with Intra-modal Asymmetry and Inter-modal Dependency

Posted on June 17, 2024 by jarxiv

Abstract: Deep learning-based brain tumor segmentation for multi-modal MRI images …

Categories: cs.CV

MeshPose: Unifying DensePose and 3D Body Mesh reconstruction

Posted on June 17, 2024 by jarxiv

Abstract: DensePose, with its pixel-accurate association of images and 3D mesh coordinates …

Categories: 68, cs.CV, I.2.10

Detecting and Evaluating Medical Hallucinations in Large Vision Language Models

Posted on June 17, 2024 by jarxiv

Abstract: Large Vision Language Models (LVLMs) …

Categories: cs.CV

A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

Posted on June 17, 2024 by jarxiv

Abstract: A novel use of Transformers to make image classification interpretable …

Categories: cs.AI, cs.CV

Crafting Parts for Expressive Object Composition

Posted on June 17, 2024 by jarxiv

Abstract: Large generative models such as Stable Diffusion and DALLE-2 …

Categories: cs.AI, cs.CV, cs.LG

SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation

Posted on June 17, 2024 by jarxiv

Abstract: Because polyps are indicators of early cancer, assessing the occurrence of polyps and their removal …

Categories: cs.AI, cs.CV, cs.MM

Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Posted on June 17, 2024 by jarxiv

Abstract: Recently, in graphic design images, Glyph-ByT5's highly accurate …

Categories: cs.CV
