Category Archives: cs.CV

RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations

Posted on December 30, 2024 by jarxiv

Abstract: With recent advances in vision transformers (ViTs), global … Continue reading →

Categories: cs.CV

ReNeg: Learning Negative Embedding with Reward Guidance

Posted on December 30, 2024 by jarxiv

Abstract: In Text-to-Image (T2I) generation applications, negative … Continue reading →

Categories: cs.CV

VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models

Posted on December 30, 2024 by jarxiv

Abstract: Zero-shot customized video generation, owing to its significant application potential, … Continue reading →

Categories: cs.CV

Chimera: A Block-Based Neural Architecture Search Framework for Event-Based Object Detection

Posted on December 30, 2024 by jarxiv

Abstract: Event-based cameras are sensors that simulate the human eye, with high-speed, robust … Continue reading →

Categories: cs.AI, cs.CV

Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues

Posted on December 30, 2024 by jarxiv

Abstract: In Vision-Language Tracking (VLT), visual … Continue reading →

Categories: cs.CV, cs.MM

Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP

Posted on December 30, 2024 by jarxiv

Abstract: Contrastive language-image pre-training in weakly supervised semantic segmentation (WSSS) … Continue reading →

Categories: cs.CV, cs.LG

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Posted on December 30, 2024 by jarxiv

Abstract: Customized video generation, based on text prompts and reference images of the subject, … Continue reading →

Categories: cs.CV

CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs

Posted on December 30, 2024 by jarxiv

Abstract: In computer-aided design (CAD), accurate 2D and 3D modeling … Continue reading →

Categories: cs.AI, cs.CV, cs.GR

Baichuan-Omni Technical Report

Posted on December 30, 2024 by jarxiv

Abstract: The remarkable multimodal capabilities and interactive experience of GPT-4o … Continue reading →

Categories: cs.AI, cs.CL, cs.CV

DLScanner: A parameter space scanner package assisted by deep learning methods

Posted on December 30, 2024 by jarxiv

Abstract: In this paper, a scanner package enhanced by deep learning (DL) techniques … Continue reading →

Categories: cs.CV, hep-ex, hep-ph, hep-th
