"cs.CV" Category Archive

Visual Product Graph: Bridging Visual Products And Composite Images For End-to-End Style Recommendations

Posted on May 28, 2025 by jarxiv

Abstract: Retrieving content that is semantically similar but visually distinct is, for visual search …

Categories: cs.CV

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

Posted on May 28, 2025 by jarxiv

Abstract: Active vision, also known as active perception, gathers task-relevant information …

Categories: cs.AI, cs.CV

LazyVLM: Neuro-Symbolic Approach to Video Analytics

Posted on May 28, 2025 by jarxiv

Abstract: Current video analytics approaches face a fundamental trade-off between flexibility and efficiency …

Categories: cs.AI, cs.CV, cs.DB, cs.IR, cs.MM

ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models

Posted on May 28, 2025 by jarxiv

Abstract: Currently, for enhancing the performance of vision-language models (VLMs), a common …

Categories: cs.CL, cs.CV

Bringing Objects to Life: training-free 4D generation from 3D objects through view consistent noise

Posted on May 28, 2025 by jarxiv

Abstract: Owing to recent advances in generative models, applications in virtual worlds, media, and games …

Categories: cs.CV

When Are Concepts Erased From Diffusion Models?

Posted on May 28, 2025 by jarxiv

Abstract: Concept erasure, the ability to selectively prevent a model from generating certain concepts, … interest …

Categories: cs.CV, cs.LG

DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction

Posted on May 28, 2025 by jarxiv

Abstract: In this paper, … modeling images through a novel next-detail prediction strategy …

Categories: cs.CV

Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration

Posted on May 28, 2025 by jarxiv

Abstract: On multimodal tasks, large vision-language models (LVLMs) … impressive …

Categories: cs.CL, cs.CV

Policy Optimized Text-to-Image Pipeline Design

Posted on May 28, 2025 by jarxiv

Abstract: Text-to-image generation goes beyond single monolithic models to complex multi …

Categories: cs.AI, cs.CV

MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation

Posted on May 28, 2025 by jarxiv

Abstract: Object compositing … augmented reality (AR) and embodied intelligence …

Categories: cs.CV
