Monthly Archives: May 2024

Text-to-Vector Generation with Neural Path Representation

Posted on: May 21, 2024 | Author: jarxiv

Abstract: Vector graphics are widely used in digital art, and their … Continue reading →

Categories: cs.CV, cs.GR | Comments off

Enhancing Explainable AI: A Hybrid Approach Combining GradCAM and LRP for CNN Interpretability

Posted on: May 21, 2024 | Author: jarxiv

Abstract: Using a combination of the GradCAM and LRP methods, CNN-based … Continue reading →

Categories: cs.CV, I.4.0 | Comments off

Continual Learning of Diffusion Models with Generative Distillation

Posted on: May 21, 2024 | Author: jarxiv

Abstract: Diffusion models achieve state-of-the-art performance in image synthesis and are powerful generative … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off

FashionEngine: Interactive 3D Human Generation and Editing via Multimodal Controls

Posted on: May 21, 2024 | Author: jarxiv

Abstract: We … natural language, visual perception, hand-drawn sketches, and other user-friendly … Continue reading →

Categories: cs.CV | Comments off

Multi-View Attentive Contextualization for Multi-View 3D Object Detection

Posted on: May 21, 2024 | Author: jarxiv

Abstract: In query-based multi-view 3D (MV3D) object detection, … Continue reading →

Categories: cs.CV | Comments off

Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution

Posted on: May 21, 2024 | Author: jarxiv

Abstract: In this work, for enhancing the resolution of scientific data, arbitrary-scale super-resolution ( … Continue reading →

Categories: cs.AI, cs.CV | Comments off

Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Posted on: May 21, 2024 | Author: jarxiv

Abstract: Text-to-image (T2I) diffusion models, in image synthesis and editing, … Continue reading →

Categories: cs.CV | Comments off

Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning

Posted on: May 21, 2024 | Author: jarxiv

Abstract: In recent studies, large multimodal models (LMMs) … natural distribution shifts … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off

Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

Posted on: May 21, 2024 | Author: jarxiv

Abstract: We … that can efficiently reconstruct unseen scenes, multi-view stereo (M … Continue reading →

Categories: cs.CV | Comments off

Images that Sound: Composing Images and Sounds on a Single Canvas

Posted on: May 21, 2024 | Author: jarxiv

Abstract: Spectrograms are … of sound that differ greatly from the images in our visual world … Continue reading →

Categories: cs.CV, cs.LG, cs.MM, cs.SD, eess.AS | Comments off

