Category archive: cs.CV

Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP

Posted on October 22, 2024 by jarxiv

Abstract: Recent work, by leveraging CLIP's shared image-text representation space, …

Categories: cs.CV, cs.LG, I.5.1

Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving

Posted on October 22, 2024 by jarxiv

Abstract: Conventional wisdom holds that critical real-time control systems, such as autonomous vehicles, …

Categories: cs.CV, cs.NI, cs.SY, eess.SY

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Posted on October 22, 2024 by jarxiv

Abstract: With the success of large language models (LLMs), researchers … unified visual and …

Categories: cs.CV

Deep Radiomics Detection of Clinically Significant Prostate Cancer on Multicenter MRI: Initial Comparison to PI-RADS Assessment

Posted on October 22, 2024 by jarxiv

Abstract: Objective: clinically significant prostate cancer (csPCa, grade group >= …

Categories: cs.CV, eess.IV

MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report

Posted on October 22, 2024 by jarxiv

Abstract: In this paper, X-rays, electrocardiograms (ECGs), and radiology/cardiology reports …

Categories: cs.AI, cs.CV, cs.LG

Revisiting Deep Feature Reconstruction for Logical and Structural Industrial Anomaly Detection

Posted on October 22, 2024 by jarxiv

Abstract: Industrial anomaly detection is crucial for quality control and predictive maintenance, but training data …

Categories: cs.CV, cs.LG

Elucidating the design space of language models for image generation

Posted on October 22, 2024 by jarxiv

Abstract: With the success of autoregressive (AR) language models in text generation, …

Categories: cs.CV

Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos

Posted on October 22, 2024 by jarxiv

Abstract: From casual longitudinal video collections, we … 3D agents …

Categories: cs.CV, cs.GR, cs.RO

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Posted on October 22, 2024 by jarxiv

Abstract: Multimodal large language models (MLLMs), across a wide range of domains, …

Categories: cs.CV

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Posted on October 22, 2024 by jarxiv

Abstract: Novel view synthesis, from multiple input images or videos, new views of a scene …

Categories: cs.AI, cs.CV
