"cs.CV" Category Archive

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Posted: April 25, 2025 | Author: jarxiv

Summary: Multimodal autoregressive models capable of various vision and language tasks …

Categories: cs.CV

DDU-Net: A Domain Decomposition-Based CNN for High-Resolution Image Segmentation on Multiple GPUs

Posted: April 25, 2025 | Author: jarxiv

Summary: In ultra-high-resolution image segmentation, loss of spatial information, computational inefficiency, and other …

Categories: 65N55, 68T07, 68U10, 68W10, 68W15, cs.CV, cs.DC, cs.LG, I.2.6

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

Posted: April 25, 2025 | Author: jarxiv

Summary: Contrastive Language-Image Pretraining (CLIP), in cross-modal information retrieval and …

Categories: 68T50, cs.CL, cs.CV, cs.IR, I.2.10

CasualHDRSplat: Robust High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos

Posted: April 25, 2025 | Author: jarxiv

Summary: Recently, neural radiance fields (NeRF) and 3D Gaussian Splatting …

Categories: cs.CV, cs.GR, cs.MM

DPMambaIR: All-in-One Image Restoration via Degradation-Aware Prompt State Space Model

Posted: April 25, 2025 | Author: jarxiv

Summary: All-in-one image restoration uses a single model for multiple image degradation problems …

Categories: cs.CV, I.4.4

EgoCHARM: Resource-Efficient Hierarchical Activity Recognition using an Egocentric IMU Sensor

Posted: April 25, 2025 | Author: jarxiv

Summary: For human activity recognition (HAR) on smartglasses, health/fitness …

Categories: cs.CV, cs.LG

Step1X-Edit: A Practical Framework for General Image Editing

Posted: April 25, 2025 | Author: jarxiv

Summary: In recent years, image editing models have seen remarkably rapid development. GPT-4o and …

Categories: cs.CV

ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images

Posted: April 25, 2025 | Author: jarxiv

Summary: With advances in medical imaging technology, disease progression is monitored through repeated …

Categories: cs.CV, cs.LG, eess.IV

DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images

Posted: April 25, 2025 | Author: jarxiv

Summary: Driven by advances in automated whole-slide scanning, digital microscopy …

Categories: cs.CV

HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding

Posted: April 25, 2025 | Author: jarxiv

Summary: Despite advances in multimodal large language models (MLLMs), current …

Categories: cs.CV

