Category archive: cs.CV

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Posted: March 31, 2025 | Author: jarxiv

Abstract: Video large language models (VLLMs) have recently, in processing complex video content, …

Categories: cs.CV, cs.LG

SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations

Posted: March 31, 2025 | Author: jarxiv

Abstract: Semantic correspondence has, through recent advances in large vision models (LVMs), …

Categories: cs.CV

Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba

Posted: March 31, 2025 | Author: jarxiv

Abstract: Event cameras draw inspiration from biological systems, and minimal …

Categories: cs.CV

Adaptive Weighted Parameter Fusion with CLIP for Class-Incremental Learning

Posted: March 31, 2025 | Author: jarxiv

Abstract: With class-incremental learning (CIL), models can, from new classes, …

Categories: cs.CV

Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior

Posted: March 31, 2025 | Author: jarxiv

Abstract: Dichotomous image segmentation (DIS) is, for high-resolution natural images, a high-precision obj …

Categories: cs.CV

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Posted: March 31, 2025 | Author: jarxiv

Abstract: Initial traffic scenes comprising a lane graph and agent bounding boxes, and closed-l …

Categories: cs.CV, cs.RO

UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models

Posted: March 31, 2025 | Author: jarxiv

Abstract: Designed to enhance the control and efficiency of training adapters for large-scale diffusion models, …

Categories: cs.CV

Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets

Posted: March 31, 2025 | Author: jarxiv

Abstract: Self-supervised learning, to improve model performance across various domains, …

Categories: cs.AI, cs.CV, cs.LG

Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities

Posted: March 31, 2025 | Author: jarxiv

Abstract: In this work, while satisfying two core constraints, with multimodal generative capabilities …

Categories: cs.AI, cs.CL, cs.CV

Evaluating the evaluators: Towards human-aligned metrics for missing markers reconstruction

Posted: March 31, 2025 | Author: jarxiv

Abstract: Animation data, using numerous cameras to establish the positions of optical markers, …

Categories: cs.CV, cs.HC, cs.LG
