Monthly Archives: March 2025

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Posted on: March 17, 2025 | Author: jarxiv

Abstract: State-of-the-art transformer-based large multimodal models (LMMs), causal self …

Categories: cs.CV

Towards Few-Call Model Stealing via Active Self-Paced Knowledge Distillation and Diffusion-Based Image Generation

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Diffusion models have shown strong capabilities in image synthesis, and many computer vision …

Categories: cs.CV, cs.LG

Pathology Image Compression with Pre-trained Autoencoders

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Because the volume of high-resolution whole slide images in digital histopathology is growing, significant …

Categories: cs.CV, eess.IV

Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information

Posted on: March 17, 2025 | Author: jarxiv

Abstract: We present a novel framework for enha …

Categories: cs.CV

Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages

Posted on: March 17, 2025 | Author: jarxiv

Abstract: An old-school recipe for training a c …

Categories: cs.CV, cs.LG, cs.MM

TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing

Posted on: March 17, 2025 | Author: jarxiv

Abstract: We introduce TreeMeshGPT. TreeMeshGPT, input point …

Categories: cs.CV, cs.GR, cs.MM

Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Transparent objects are common in daily life, and the transparent surface and the objects behind it …

Categories: cs.CV

Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration

Posted on: March 17, 2025 | Author: jarxiv

Abstract: The quadratic complexity of multimodal large language models (MLLMs) with respect to sequence length …

Categories: cs.CV

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Posted on: March 17, 2025 | Author: jarxiv

Abstract: Camera control has been actively studied in text- or image-conditioned video generation tasks …

Categories: cs.CV

Centaur: Robust End-to-End Autonomous Driving with Test-Time Training

Posted on: March 17, 2025 | Author: jarxiv

Abstract: How can we rely on an end-to-end autonomous vehicle's complex decision-making system during deployment …

Categories: cs.AI, cs.CV, cs.LG, cs.RO
