Category archive: cs.CV

BHViT: Binarized Hybrid Vision Transformer

Posted on March 7, 2025 by jarxiv

Abstract: Model binarization, for convolutional neural networks (CNNs), …

Categories: cs.CV

LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant

Posted on March 7, 2025 by jarxiv

Abstract: First-person video assistants, through online video dialogue, our daily lives …

Categories: cs.CV

DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms

Posted on March 7, 2025 by jarxiv

Abstract: Dongba pictographs are the only pictographs still in use in the world. They …

Categories: cs.CV

Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection

Posted on March 7, 2025 by jarxiv

Abstract: Humans, based on object-conditioned physical knowledge, perceive, interact, and …

Categories: cs.AI, cs.CV

Reasoning to Attend: Try to Understand How Token Works

Posted on March 7, 2025 by jarxiv

Abstract: Current large multimodal models (LMMs), vision-language models (LLAV …

Categories: cs.CV

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-scale Reinforcement Learning in Autonomous Driving

Posted on March 6, 2025 by jarxiv

Abstract: Trajectory planning is essential for autonomous driving, with safe and efficient navigation in complex environments …

Categories: cs.CV, cs.LG, cs.RO

Floorplan-SLAM: A Real-Time, High-Accuracy, and Long-Term Multi-Session Point-Plane SLAM for Efficient Floorplan Reconstruction

Posted on March 6, 2025 by jarxiv

Abstract: Floorplan reconstruction, for reliable indoor robot navigation and high-level …

Categories: cs.CV, cs.RO

Trajectory Prediction for Autonomous Driving: Progress, Limitations, and Future Directions

Posted on March 6, 2025 by jarxiv

Abstract: As the potential for large-scale integration of autonomous vehicles into modern traffic systems continues to grow, …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

Posted on March 6, 2025 by jarxiv

Abstract: In the era of AIGC, low-budget or on-device applications of diffusion models …

Categories: cs.CV

Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation

Posted on March 6, 2025 by jarxiv

Abstract: Referring video object segmentation, using natural language prompts, …

Categories: cs.CV
