Monthly Archives: March 2023

National-scale 1-m resolution land-cover mapping for the entire China based on a low-cost solution and open-access data

Posted on March 10, 2023 by jarxiv

Abstract: Many large-scale land-cover (LC) products have been released to date, but currently …

Categories: cs.CV, eess.IV

SpyroPose: Importance Sampling Pyramids for Object Pose Distribution Estimation in SE(3)

Posted on March 10, 2023 by jarxiv

Abstract: Object pose estimation is a core problem in computer vision, …

Categories: cs.CV, cs.RO

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Posted on March 10, 2023 by jarxiv

Abstract: Multimedia communication facilitates global interaction among people. However, …

Categories: cs.CL, cs.CV

3D Video Loops from Asynchronous Input

Posted on March 10, 2023 by jarxiv

Abstract: Looping videos can be looped endlessly without visible seams or artifacts …

Categories: cs.CV, cs.GR

Replacement as a Self-supervision for Fine-grained Vision-language Pre-training

Posted on March 10, 2023 by jarxiv

Abstract: Fine-grained supervision based on object annotations, in vision-language pre-training, …

Categories: cs.CL, cs.CV

WASD: A Wilder Active Speaker Detection Dataset

Posted on March 10, 2023 by jarxiv

Abstract: For current active speaker detection (ASD) models, only audio and facial features …

Categories: cs.CV, cs.SD, eess.AS, eess.IV

Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE

Posted on March 10, 2023 by jarxiv

Abstract: Videos represent the change of complex dynamical systems over time in the form of discrete image sequences …

Categories: cs.CV

BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset

Posted on March 10, 2023 by jarxiv

Abstract: Over the past decade, deep-learning-based Bengali optical character recognition (OCR) …

Categories: cs.CV

Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds

Posted on March 10, 2023 by jarxiv

Abstract: Masked autoencoding, for text, images, and, more recently, point-cloud Tra …

Categories: cs.CV, cs.LG

Tucker Bilinear Attention Network for Multi-scale Remote Sensing Object Detection

Posted on March 10, 2023 by jarxiv

Abstract: Object detection in VHR remote sensing images, in urban planning, land resource …

Categories: cs.CV
