"cs.CV" Category Archive

GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization

Posted: March 21, 2025 | Author: jarxiv

Abstract: Scene coordinate regression, camera pose regression, and various other visual localiza … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

Posted: March 21, 2025 | Author: jarxiv

Abstract: LiDAR data pretraining, to enhance data utilization, large-scale … Continue reading →

Categories: cs.CV, cs.LG, cs.RO

OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection

Posted: March 21, 2025 | Author: jarxiv

Abstract: The reliance on artificial intelligence (AI) in critical domains such as healthcare, particularly … Continue reading →

Categories: cs.CV, cs.LG

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

Posted: March 21, 2025 | Author: jarxiv

Abstract: Recent multi-modal large language models (MLLMs), with large-scale video frames … Continue reading →

Categories: cs.AI, cs.CL, cs.CV

LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

Posted: March 21, 2025 | Author: jarxiv

Abstract: Despite impressive progress in video understanding, most efforts are coarse-grained or … Continue reading →

Categories: cs.CL, cs.CV, cs.LG, cs.MM

Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers

Posted: March 21, 2025 | Author: jarxiv

Abstract: Transformer-based models generate hidden states that are difficult to interpret. In this work, … Continue reading →

Categories: cs.CL, cs.CV, cs.LG

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Posted: March 21, 2025 | Author: jarxiv

Abstract: Scaling architectures is effective for improving scene text recognition (STR) … Continue reading →

Categories: cs.AI, cs.CL, cs.CV

Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis

Posted: March 21, 2025 | Author: jarxiv

Abstract: Representation learning and generative modeling both seek to understand visual data, but both domains … Continue reading →

Categories: cs.AI, cs.CV, I.2.10

EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition

Posted: March 21, 2025 | Author: jarxiv

Abstract: Existing multimodal-based human action recognition approaches are computationally intensive … Continue reading →

Categories: cs.AI, cs.CV

RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility in Autonomous Vehicles

Posted: March 21, 2025 | Author: jarxiv

Abstract: Autonomous vehicles (AVs), to enhance perception models while preserving privacy, … Continue reading →

Categories: cs.CV, cs.DC, cs.ET, cs.LG

