Monthly archive: January 2025

NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields

Posted: January 29, 2025 | Author: jarxiv

Abstract: Sound plays a major role in human perception. Along with vision, …

Categories: cs.CV, cs.SD, eess.AS

Text-to-Image Generation for Vocabulary Learning Using the Keyword Method

Posted: January 29, 2025 | Author: jarxiv

Abstract: The "keyword method" is an effective technique for learning vocabulary in a foreign language. It involves …

Categories: cs.CV, cs.GR, cs.HC, cs.LG

Scenario Understanding of Traffic Scenes Through Large Visual Language Models

Posted: January 29, 2025 | Author: jarxiv

Abstract: Deep learning models for autonomous driving, encompassing perception, planning, and control, achieve high performance …

Categories: cs.CV

LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes

Posted: January 29, 2025 | Author: jarxiv

Abstract: Vision foundation models such as DINO, SAM, and CLIP …

Categories: cs.CV

IC-Portrait: In-Context Matching for View-Consistent Personalized Portrait

Posted: January 29, 2025 | Author: jarxiv

Abstract: Existing diffusion models show great potential for identity-preserving generation …

Categories: cs.CV

A Hybrid Deep Learning CNN Model for Enhanced COVID-19 Detection from Computed Tomography (CT) Scan Images

Posted: January 29, 2025 | Author: jarxiv

Abstract: Early detection of COVID-19 is crucial for effective treatment and for controlling its spread …

Categories: cs.AI, cs.CV, eess.IV

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Posted: January 29, 2025 | Author: jarxiv

Abstract: Supervised fine-tuning (SFT) and reinforcement learning (RL), in the training of foundation models, …

Categories: cs.AI, cs.CV, cs.LG

CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation

Posted: January 29, 2025 | Author: jarxiv

Abstract: To generate 360° panoramas from text prompts or images, …

Categories: cs.CV, cs.LG

Distilling foundation models for robust and efficient models in digital pathology

Posted: January 29, 2025 | Author: jarxiv

Abstract: In recent years, the emergence of foundation models (FMs) for digital pathology … pre-training data …

Categories: 68T45, cs.CV, I.4.9

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Posted: January 29, 2025 | Author: jarxiv

Abstract: In this paper, we argue that spatial understanding is the key point in robot manipulation, and robot …

Categories: cs.AI, cs.RO
