"cs.CV" Category Archive

Controllable Image Colorization with Instance-aware Texts and Masks

Posted: May 14, 2025 | Author: jarxiv

Abstract: Recently, applying deep learning to image colorization has attracted widespread attention. …

Categories: cs.AI, cs.CV

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Posted: May 14, 2025 | Author: jarxiv

Abstract: Compared with image-text pair data, interleaved corpora, vision-language mo …

Categories: cs.CL, cs.CV, cs.LG

Unsupervised Urban Land Use Mapping with Street View Contrastive Clustering and a Geographical Prior

Posted: May 14, 2025 | Author: jarxiv

Abstract: Urban land use classification and mapping are essential for urban planning, resource management, and environmental monitoring …

Categories: cs.AI, cs.CV

RT-GAN: Recurrent Temporal GAN for Adding Lightweight Temporal Consistency to Frame-Based Domain Translation Approaches

Posted: May 14, 2025 | Author: jarxiv

Abstract: Although 14 million colonoscopies are performed in the United States each year, these colon …

Categories: cs.CV, eess.IV

TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series

Posted: May 14, 2025 | Author: jarxiv

Abstract: Satellite image time series (SITS) provide continuous observation of the Earth's surface, and environmental management …

Category: cs.CV

Hierarchical and Multimodal Data for Daily Activity Understanding

Posted: May 14, 2025 | Author: jarxiv

Abstract: Daily Activity Recordings for Artificial Intelligence (DARai, pronounced "Dahr-ree"), in real-world settings …

Categories: cs.AI, cs.CV

Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving

Posted: May 14, 2025 | Author: jarxiv

Abstract: Large vision-language models (LVLMs) have made significant progress in image understanding. …

Category: cs.CV

No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves

Posted: May 14, 2025 | Author: jarxiv

Abstract: In recent studies, learning meaningful internal representations accelerates generative training and …

Category: cs.CV

Deep Representation Learning for Unsupervised Clustering of Myocardial Fiber Trajectories in Cardiac Diffusion Tensor Imaging

Posted: May 14, 2025 | Author: jarxiv

Abstract: Understanding the complex myocardial architecture is essential for the diagnosis and treatment of heart disease. …

Categories: cs.CV, cs.LG, eess.IV

Visual Imitation Enables Contextual Humanoid Control

Posted: May 14, 2025 | Author: jarxiv

Abstract: To have humanoids climb stairs and sit on chairs using the context of the surrounding environment …

Categories: cs.CV, cs.RO
