Monthly Archive: March 2024

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

Posted: March 20, 2024 | Author: jarxiv

Abstract: Open-domain 3D object synthesis, where data is limited and the computation is … Continue reading →

Categories: cs.CV, cs.GR | Comments closed

MEDBind: Unifying Language and Multimodal Medical Data Embeddings

Posted: March 20, 2024 | Author: jarxiv

Abstract: Medical vision-language pretraining models (VLPM), chest X-rays (C … Continue reading →

Categories: cs.CV | Comments closed

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

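Posted: March 20, 2024 | Author: jarxiv

Abstract: Structural information, in understanding the semantics of text-rich images such as documents, tables, and charts, … Continue reading →

Categories: cs.CV | Comments closed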

SynCDR : Training Cross Domain Retrieval Models with Synthetic Data

Posted: March 20, 2024 | Author: jarxiv

Abstract: In cross-domain retrieval, from the same semantic category across two visual domains … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation

Posted: March 20, 2024 | Author: jarxiv

Abstract: Because properly unwrapped UVs are difficult to obtain, semantic UV … Continue reading →

Categories: cs.CV | Comments closed

Self-Supervised Learning for Image Super-Resolution and Deblurring

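Posted: March 20, 2024 | Author: jarxiv

Abstract: Across various imaging inverse problems, self-supervised methods, relative to supervised methods, nearly … Continue reading →

Categories: cs.CV, eess.IV | Comments closed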

Vertical Federated Image Segmentation

Posted: March 20, 2024 | Author: jarxiv

Abstract: With the spread of AI solutions for image-based problems, data … Continue reading →

Categories: cs.AI, cs.CV, cs.DC, cs.LG, I.2.8 | Comments closed

Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model

Posted: March 20, 2024 | Author: jarxiv

Abstract: A new architecture designed for ultra-high-resolution image synthesis, the Pyramid Diffusion … Continue reading →

Categories: cs.CV | Comments closed

Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition

Posted: March 20, 2024 | Author: jarxiv

Abstract: Large-scale vision-language pretrained models, on a variety of video tasks, … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling

Posted: March 20, 2024 | Author: jarxiv

Abstract: Facial expression recognition (FER) plays an important role in computer vision … Continue reading →

Categories: cs.AI, cs.CV | Comments closed
