"cs.CV" Category Archive

ImageFolder: Autoregressive Image Generation with Folded Tokens

Posted on October 16, 2024 by jarxiv

Abstract: Image tokenizers construct the latent representation used for modeling, so diffusion models ( …

Categories: cs.CV

OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation

Posted on October 16, 2024 by jarxiv

Abstract: By imitating a single video demonstration, we … humanoid robot manipulation skills …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Posted on October 16, 2024 by jarxiv

Abstract: Diffusion models, as one of the most popular generative models of recent years, many …

Categories: cs.CV, I.4.9

Active Label Refinement for Robust Training of Imbalanced Medical Image Classification Tasks in the Presence of High Label Noise

Posted on October 16, 2024 by jarxiv

Abstract: With label noise, the robustness of supervised deep learning-based medical image classification is significantly …

Categories: cs.CV, cs.LG

VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing

Posted on October 16, 2024 by jarxiv

Abstract: Video editing, from entertainment and education to professional communication …

Categories: cs.AI, cs.CV, cs.MM

Advancements in Road Lane Mapping: Comparative Fine-Tuning Analysis of Deep Learning-based Semantic Segmentation Methods Using Aerial Imagery

Posted on October 16, 2024 by jarxiv

Abstract: This study focuses on road lane information obtained from aerial imagery, and autonomous vehicles (A …

Categories: cs.CV

SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing

Posted on October 16, 2024 by jarxiv

Abstract: Scene graphs … nodes and edges that represent objects and the relationships between them …

Categories: cs.CV

Jigsaw++: Imagining Complete Shape Priors for Object Reassembly

Posted on October 16, 2024 by jarxiv

Abstract: Because of the complex challenges involving 3D representations, the automatic assembly problem is attracting growing interest …

Categories: cs.CV

Improving Long-Text Alignment for Text-to-Image Diffusion Models

Posted on October 16, 2024 by jarxiv

Abstract: With the rapid advancement of text-to-image (T2I) diffusion models, given …

Categories: cs.CV, cs.LG, cs.MM

KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities

Posted on October 16, 2024 by jarxiv

Abstract: With recent advances in text-to-image generation, the quality of synthesized images has significantly improved …

Categories: cs.CV
