"cs.CV" Category Archive

OmniCaptioner: One Captioner to Rule Them All

Posted: April 10, 2025 | Author: jarxiv

Abstract: We propose OmniCaptioner, which, across a variety of visual domains …

Categories: cs.CL, cs.CV

Are We Done with Object-Centric Learning?

Posted: April 10, 2025 | Author: jarxiv

Abstract: Object-centric learning (OCL), with respect to other objects in the scene or background …

Categories: cs.AI, cs.CV, cs.LG

FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution

Posted: April 10, 2025 | Author: jarxiv

Abstract: A versatile video depth estimation model (1) is accurate and consistent across frames, …

Categories: cs.CV

Privacy Attacks on Image AutoRegressive Models

Posted: April 10, 2025 | Author: jarxiv

Abstract: In autoregressive image generation, image autoregressive models (IARs), in terms of image quality (FID: 1 …

Categories: cs.CV, cs.LG

Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation

Posted: April 10, 2025 | Author: jarxiv

Abstract: Parameter-efficient fine-tuning (PEFT) preserves and unlocks inherent capabilities …

Categories: cs.CV

DuoSpaceNet: Leveraging Both Bird’s-Eye-View and Perspective View Representations for 3D Object Detection

Posted: April 9, 2025 | Author: jarxiv

Abstract: In multi-view camera-only 3D object detection, mainly two primary paradigms …

Categories: cs.CV, cs.RO

EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction via Polynomial Representations

Posted: April 9, 2025 | Author: jarxiv

Abstract: As the prediction horizon increases, owing to the multimodal nature of agent motion, …

Categories: cs.CV, cs.LG, cs.RO

SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions — An EndoVis’24 Challenge

Posted: April 9, 2025 | Author: jarxiv

Abstract: In surgical data science, for surgical video analysis, end-to-end deep …

Categories: cs.CV, cs.RO

ProtoGS: Efficient and High-Quality Rendering with 3D Gaussian Prototypes

Posted: April 9, 2025 | Author: jarxiv

Abstract: 3D Gaussian Splatting (3DGS) has made significant strides in novel view synthesis …

Categories: cs.AI, cs.CV

AVP-AP: Self-supervised Automatic View Positioning in 3D cardiac CT via Atlas Prompting

Posted: April 9, 2025 | Author: jarxiv

Abstract: Automatic view positioning, including for disease diagnosis and surgical planning, in cardiac computed …

Categories: cs.CV, eess.IV
