"cs.CV" Category Archive

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Posted: November 12, 2024 | Author: jarxiv

Abstract: The process of understanding how brain activity corresponds to various stimuli, neural … Continue reading →

Category: cs.CV

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Posted: November 12, 2024 | Author: jarxiv

Abstract: Capable of generating photorealistic image content with pixel-perfect accuracy, diffusion … Continue reading →

Categories: cs.CV, cs.LG

Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

Posted: November 12, 2024 | Author: jarxiv

Abstract: Text-to-image (T2I) models show excellent generative capabilities, but … Continue reading →

Categories: cs.AI, cs.CV

ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition

Posted: November 12, 2024 | Author: jarxiv

Abstract: Our world is full of diverse actions, which we humans identify, understand, … Continue reading →

Category: cs.CV

Edify 3D: Scalable High-Quality 3D Asset Generation

Posted: November 12, 2024 | Author: jarxiv

Abstract: An advanced solution designed for high-quality 3D asset generation, … Continue reading →

Categories: cs.AI, cs.CV, cs.GR

Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Posted: November 12, 2024 | Author: jarxiv

Abstract: Most datasets in the field of document analysis use highly standardized labels … Continue reading →

Category: cs.CV

Lost in Tracking Translation: A Comprehensive Analysis of Visual SLAM in Human-Centered XR and IoT Ecosystems

Posted: November 12, 2024 | Author: jarxiv

Abstract: With advances in tracking algorithms, from piloting autonomous vehicles to guiding robots to user … Continue reading →

Categories: cs.CV, cs.RO

VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding

Posted: November 12, 2024 | Author: jarxiv

Abstract: To advance the development of general-purpose large vision-language models for remote sensing images, … Continue reading →

Category: cs.CV

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

Posted: November 12, 2024 | Author: jarxiv

Abstract: A new family of autoencoder models for accelerating high-resolution diffusion models … Continue reading →

Categories: cs.AI, cs.CV

FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models

Posted: November 12, 2024 | Author: jarxiv

Abstract: Human motion synthesis is a fundamental task in computer animation. … Continue reading →

Category: cs.CV
