Archive of posts by "jarxiv"

XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis

Posted on May 8, 2025 by jarxiv

Abstract: Ensuring the safety of autonomous vehicles requires a simulation-based, comprehensive …

Categories: cs.CV

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Posted on May 8, 2025 by jarxiv

Abstract: In customized video generation, under flexible user-defined conditions, specific …

Categories: cs.CV

Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model

Posted on May 8, 2025 by jarxiv

Abstract: Generating 3D CT volumes from descriptive free-text input, for diagnosis and …

Categories: cs.CV

Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration

Posted on May 8, 2025 by jarxiv

Abstract: Specialized for real-time, accurate face detection and recognition in public places, a cost-effective …

Categories: cs.AR, cs.CV, cs.LG, eess.IV

DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once

Posted on May 8, 2025 by jarxiv

Abstract: Visible and infrared fusion is one of the most important tasks in the field of image fusion, and …

Categories: cs.AI, cs.CV

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Posted on May 8, 2025 by jarxiv

Abstract: In this work, we systematically study music generation conditioned solely on video. …

Categories: cs.CV, cs.LG, cs.MM, cs.SD

RAFT: Robust Augmentation of FeaTures for Image Segmentation

Posted on May 8, 2025 by jarxiv

Abstract: For scene understanding, image segmentation is a powerful computer vision …

Categories: cs.CV

Registration of 3D Point Sets Using Exponential-based Similarity Matrix

Posted on May 8, 2025 by jarxiv

Abstract: Point cloud registration is a fundamental problem in computer vision and robotics …

Categories: cs.CV

LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation

Posted on May 8, 2025 by jarxiv

Abstract: With CLIP, through contrastive learning on large-scale image-text pairs, image and text features …

Categories: cs.CL, cs.CV

Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling

Posted on May 8, 2025 by jarxiv

Abstract: Given an isolated garment image in a standard product view and a separate image of a person, virtual try-on …

Categories: cs.CV

