"cs.CV" Category Archive

Latent Inversion with Timestep-aware Sampling for Training-free Non-rigid Editing

Posted: October 17, 2024 | Author: jarxiv

Summary: Text-guided non-rigid editing involves changing the surrounding motion or composition, among other things, in the input …

Categories: cs.CV

Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models

Posted: October 17, 2024 | Author: jarxiv

Summary: Through vision-language alignment in large vision-language models (LVLMs), LLM …

Categories: cs.AI, cs.CL, cs.CV

3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation

Posted: October 17, 2024 | Author: jarxiv

Summary: Owing to the growing demand for controllable outputs in text-to-image generation, …

Categories: cs.CV

MambaBEV: An efficient 3D detection model with Mamba2

Posted: October 17, 2024 | Author: jarxiv

Summary: A stable 3D object detection model based on the BEV paradigm with temporal information …

Categories: cs.CV

Understanding Figurative Meaning through Explainable Visual Entailment

Posted: October 17, 2024 | Author: jarxiv

Summary: Large vision-language models (VLMs), in tasks such as visual question answering and visual entailment, …

Categories: cs.AI, cs.CL, cs.CV

Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2

Posted: October 17, 2024 | Author: jarxiv

Summary: Anatomical landmarks, in medical imaging for navigation and anomaly detection, …

Categories: cs.AI, cs.CV, cs.LG

Machine Learning Approach to Brain Tumor Detection and Classification

Posted: October 17, 2024 | Author: jarxiv

Summary: Brain tumor detection and classification is an important task in medical image analysis, particularly for early diagnosis, and …

Categories: cs.CV, cs.LG

AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Posted: October 17, 2024 | Author: jarxiv

Summary: Anomaly detection in manufacturing pipelines remains a critical challenge, and industrial environments' …

Categories: cs.CV

VividMed: Vision Language Model with Versatile Visual Grounding for Medicine

Posted: October 17, 2024 | Author: jarxiv

Summary: With recent advances in vision-language models (VLMs), visually grounded responses …

Categories: cs.CL, cs.CV

MultiCamCows2024 — A Multi-view Image Dataset for AI-driven Holstein-Friesian Cattle Re-Identification on a Working Farm

Posted: October 17, 2024 | Author: jarxiv

Summary: We exploit the distinctive black-and-white coat patterns of Holstein-Friesian cattle …

Categories: cs.CV
