"cs.CV" Category Archive

Generation of synthetic gait data: application to multiple sclerosis patients’ gait patterns

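Posted on November 18, 2024 by jarxiv

Summary: Multiple sclerosis (MS) is the leading cause of severe non-traumatic disability in young adults, and … Continue reading →

Categories: cs.CV, stat.AP | Comments are closed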

Deep Learning for Micro-Scale Crack Detection on Imbalanced Datasets Using Key Point Localization

Posted on November 18, 2024 by jarxiv

Summary: Internal crack detection has been a focus of structural health monitoring. Structural data … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments are closed

On the Foundation Model for Cardiac MRI Reconstruction

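Posted on November 18, 2024 by jarxiv

Summary: In recent years, machine learning (ML) based reconstruction has been widely studied, and cardiac magnetic resonance (C … Continue reading →

Categories: cs.CV, eess.IV | Comments are closed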

Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Posted on November 18, 2024 by jarxiv

Summary: Interactive point-prompt-based image segmentation … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed

Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations

Posted on November 18, 2024 by jarxiv

Summary: For human-AI conversations that involve image understanding, a multimodal LLM-based … Continue reading →

Categories: cs.CL, cs.CV | Comments are closed

Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See

Posted on November 18, 2024 by jarxiv

Summary: In multimodal large language models (MLLMs), the visual encoder … Continue reading →

Categories: cs.CV | Comments are closed

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation

Posted on November 18, 2024 by jarxiv

Summary: In computer vision, a new autoregressive paradigm for image generation … Continue reading →

Categories: cs.CV | Comments are closed

Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization

Posted on November 18, 2024 by jarxiv

Summary: Multimodal large language models (MLLMs) are known to hallucinate … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.MM | Comments are closed

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Posted on November 18, 2024 by jarxiv

Summary: As demonstrated by models such as OpenAI's o1, large language models … Continue reading →

Categories: cs.CV | Comments are closed

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Posted on November 18, 2024 by jarxiv

Summary: Existing open-source multimodal large language models (MLLMs) typically … Continue reading →

Categories: cs.CL, cs.CV | Comments are closed
