Category archive: cs.CV

Keypoint Abstraction using Large Models for Object-Relative Imitation Learning

Posted: October 31, 2024 | Author: jarxiv

Abstract: Generalization to novel object configurations and instances across diverse tasks and environments is …

Categories: cs.AI, cs.CV, cs.RO

DisC-GS: Discontinuity-aware Gaussian Splatting

Posted: October 31, 2024 | Author: jarxiv

Abstract: Recently, a method that represents a 3D scene as a collection of Gaussian distributions, Gaussian Sp …

Categories: cs.CV

EMMA: End-to-End Multimodal Model for Autonomous Driving

Posted: October 31, 2024 | Author: jarxiv

Abstract: We introduce EMMA, an end-to-end multimodal model for autonomous driving …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

Certified Robustness to Data Poisoning in Gradient-Based Training

Posted: October 31, 2024 | Author: jarxiv

Abstract: Modern machine learning pipelines leverage large amounts of public data, so the data …

Categories: cs.CR, cs.CV, cs.LG

TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models

Posted: October 31, 2024 | Author: jarxiv

Abstract: In existing benchmarks, when leveraging temporal context for video understanding …

Categories: cs.AI, cs.CL, cs.CV

Multi-student Diffusion Distillation for Better One-step Generators

Posted: October 31, 2024 | Author: jarxiv

Abstract: Diffusion models, at the cost of a lengthy multi-step inference procedure, … high-quality …

Categories: cs.AI, cs.CV, cs.LG

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Posted: October 31, 2024 | Author: jarxiv

Abstract: In humans, the slow learning of general world dynamics and episodic memory from new experiences …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction

Posted: October 31, 2024 | Author: jarxiv

Abstract: In this paper, for large-scale map construction, a fine-grained, high-resolution satellite data …

Categories: cs.CV

RelationBooth: Towards Relation-Aware Customized Object Generation

Posted: October 31, 2024 | Author: jarxiv

Abstract: Customized image generation, based on user-provided image prompts, …

Categories: cs.CV

ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Posted: October 31, 2024 | Author: jarxiv

Abstract: For segmenting a wide range of concepts in videos that can be described through natural language, a fra …

Categories: cs.CV
