Category archive: cs.CV

OpenLKA: An Open Dataset of Lane Keeping Assist from Recent Car Models under Real-world Driving Conditions

Posted: May 15, 2025 by jarxiv

Abstract: Lane Keeping Assist (LKA) is widely adopted in modern vehicles …

Categories: cs.CV, cs.RO

FoldNet: Learning Generalizable Closed-Loop Policy for Garment Folding via Keypoint-Driven Asset and Demonstration Synthesis

Posted: May 15, 2025 by jarxiv

Abstract: Because garments are deformable, generating large amounts of high-quality data for robotic garment manipulation tasks …

Categories: cs.CV, cs.RO

AdaWorld: Learning Adaptable World Models with Latent Actions

Posted: May 15, 2025 by jarxiv

Abstract: World models aim to learn action-controlled future prediction and …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

METDrive: Multi-modal End-to-end Autonomous Driving with Temporal Guidance

Posted: May 15, 2025 by jarxiv

Abstract: Multi-modal end-to-end autonomous driving has shown promising progress in recent work …

Categories: cs.CV, cs.RO

TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving

Posted: May 15, 2025 by jarxiv

Abstract: In recent years, diffusion models, across diverse domains from vision generation to language modeling, …

Categories: cs.CV, cs.LG, cs.RO

Behind Maya: Building a Multilingual Vision Language Model

Posted: May 15, 2025 by jarxiv

Abstract: Recent years have seen rapid development of large vision-language models (VLMs). …

Categories: cs.CL, cs.CV

Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training

Posted: May 15, 2025 by jarxiv

Abstract: In standard pre-training of large vision-language models (LVLMs), the model …

Categories: cs.CL, cs.CV, cs.LG

BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis

Posted: May 15, 2025 by jarxiv

Abstract: Scaling model and data size has brought impressive performance across a wide range of tasks …

Categories: cs.AI, cs.CV

DCSNet: A Lightweight Knowledge Distillation-Based Model with Explainable AI for Lung Cancer Diagnosis from Histopathological Images

Posted: May 15, 2025 by jarxiv

Abstract: Early detection and accurate diagnosis are critical to improving survival rates for lung cancer, which worldwide is a cancer …

Categories: cs.CV, eess.IV

Unsupervised Multiview Contrastive Language-Image Joint Learning with Pseudo-Labeled Prompts Via Vision-Language Model for 3D/4D Facial Expression Recognition

Posted: May 15, 2025 by jarxiv

Abstract: In this paper, unsupervised contrastive … of facial emotion from 3D/4D data …

Categories: cs.CV
