"cs.CV" Category Archive

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Posted: January 7, 2025 | Author: jarxiv

Summary: In latent diffusion models with Transformer architectures, high-fidelity …

Categories: cs.CV, cs.LG

A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks

Posted: January 7, 2025 | Author: jarxiv

Summary: In deep neural networks, millions or even billions of weights after training …

Categories: cs.CV, cs.NE

MVP: Multimodal Emotion Recognition based on Video and Physiological Signals

Posted: January 7, 2025 | Author: jarxiv

Summary: Human emotions entail complex behavioral, physiological, and cognitive changes. Current state-of-the-art …

Categories: 68T05, 68T10, cs.CV, I.5

Task-Agnostic Federated Learning

Posted: January 7, 2025 | Author: jarxiv

Summary: In medical imaging, leveraging large-scale datasets from various institutions …

Categories: cs.AI, cs.CV, cs.DC

ETO: Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses

Posted: January 7, 2025 | Author: jarxiv

Summary: We tackle the efficiency problem in learning local feature matching. With recent advances, pure …

Categories: cs.CV

CAT: Content-Adaptive Image Tokenization

Posted: January 7, 2025 | Author: jarxiv

Summary: With most existing image tokenizers, images into a fixed number of tokens or patches …

Categories: cs.CV

Normalizing Batch Normalization for Long-Tailed Recognition

Posted: January 7, 2025 | Author: jarxiv

Summary: In real-world scenarios, the number of training samples across classes is usually long-t …

Categories: cs.CV

SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark

Posted: January 7, 2025 | Author: jarxiv

Summary: Traditionally, 3D indoor datasets have generally, in order to improve generalization, groun …

Categories: cs.CV

Geometry Restoration and Dewarping of Camera-Captured Document Images

Posted: January 7, 2025 | Author: jarxiv

Summary: In this study, detection, segmentation, geometry restoration, and dewarping algorith …

Categories: cs.AI, cs.CV, cs.LG

Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches

Posted: January 7, 2025 | Author: jarxiv

Summary: Vision-language models, large language models (LLMs), diffusion models, vision-language-action ( …

Categories: cs.AI, cs.CV, cs.LG

