"cs.CV" Category Archive

Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions

Posted on October 16, 2024 by jarxiv

Summary: For practical applications, hallucination in multimodal large language models (MLLMs) …

Categories: cs.AI, cs.CL, cs.CV, cs.MM

It’s Just Another Day: Unique Video Captioning by Discriminative Prompting

Posted on October 16, 2024 by jarxiv

Summary: Long videos contain many repeated actions, events, and shots …

Categories: cs.CV

Robotic Arm Platform for Multi-View Image Acquisition and 3D Reconstruction in Minimally Invasive Surgery

Posted on October 16, 2024 by jarxiv

Summary: Minimally invasive surgery (MIS), with shorter recovery times, minimal patient trauma, and other major …

Categories: cs.CV, cs.RO

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations

Posted on October 16, 2024 by jarxiv

Summary: In downstream tasks such as image classification and object detection, contrastive instance discrimination methods …

Categories: cs.CV

RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation

Posted on October 16, 2024 by jarxiv

Summary: With the emergence of Segment Anything (SAM), particularly in image editing …

Categories: cs.AI, cs.CV, cs.HC, I.4.6

MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images

Posted on October 16, 2024 by jarxiv

Summary: In this paper, generating high-quality 3D lung CT images based on textual information …

Categories: cs.CV, eess.IV

Learning Truncated Causal History Model for Video Restoration

Posted on October 16, 2024 by jarxiv

Summary: One key challenge in video restoration is the motion-governed video …

Categories: cs.AI, cs.CV, cs.LG

YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection

Posted on October 16, 2024 by jarxiv

Summary: Existing detection methods for identifying insulator defects from unmanned aerial vehicles (UAVs) …

Categories: cs.CV

MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes

Posted on October 16, 2024 by jarxiv

Summary: Talking face generation (TFG), animating the face of a target identity …

Categories: cs.CV

Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems

Posted on October 16, 2024 by jarxiv

Summary: Because diffusion models can learn strong image priors, they excel at solving inverse problems …

Categories: cs.AI, cs.CV, eess.IV
