Monthly Archives: May 2024

PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control

Posted on: May 27, 2024 | Author: jarxiv

Summary: This paper, on generating personalized video that follows flexible pose control … Continue reading →

Categories: cs.AI, cs.CV

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

Posted on: May 27, 2024 | Author: jarxiv

Summary: Building on the remarkable success of CLIP (Contrastive Language-Image Pre-training) … Continue reading →

Categories: cs.CV

Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions

Posted on: May 27, 2024 | Author: jarxiv

Summary: Visual grounding (VG), matching a given natural-language expression … Continue reading →

Categories: cs.CV

Domain Generalisation for Object Detection under Covariate and Concept Shift

Posted on: May 27, 2024 | Author: jarxiv

Summary: Domain generalisation, while suppressing domain-specific features, the learning of domain-invariant features … Continue reading →

Categories: cs.CV

Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

Posted on: May 27, 2024 | Author: jarxiv

Summary: Self-supervised features are the foundation of modern machine learning systems. Typically, for data collection … Continue reading →

Categories: cs.AI, cs.CV, cs.LG

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

Posted on: May 27, 2024 | Author: jarxiv

Summary: Monocular camera calibration, for many 3D vision applications … Continue reading →

Categories: cs.CV

Align as Ideal: Cross-Modal Alignment Binding for Federated Medical Vision-Language Pre-training

Posted on: May 27, 2024 | Author: jarxiv

Summary: Vision-language pre-training (VLP), the efficiency of multimodal representation learning … Continue reading →

Categories: cs.CL, cs.CV, cs.LG

LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image

Posted on: May 27, 2024 | Author: jarxiv

Summary: Large reconstruction models, automated 3D conte… from single or multiple input images … Continue reading →

Categories: cs.CV

Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning

Posted on: May 27, 2024 | Author: jarxiv

Summary: Prompt tuning, task-specific parameters (or prompts) … Continue reading →

Categories: cs.AI, cs.CV

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation

Posted on: May 27, 2024 | Author: jarxiv

Summary: In recent years, owing to realistic generation results and a wide range of personalized applications, … Continue reading →

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS
