Monthly Archives: August 2024

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Posted on August 7, 2024 by jarxiv

Summary: Large-scale multimodal datasets [...] the success of large video-language models …

Categories: cs.CV, cs.MM, cs.SD, eess.AS

EMO: Emote Portrait Alive — Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Posted on August 7, 2024 by jarxiv

Summary: This work focuses on the dynamic and nuanced relationship between audio cues and facial movements …

Categories: cs.CV

SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection

Posted on August 7, 2024 by jarxiv

Summary: The aim of surface defect detection is to identify and localize abnormal regions on the surfaces of captured objects …

Categories: cs.CV

Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization

Posted on August 7, 2024 by jarxiv

Summary: With the rapid growth of multimedia data, integrating both text and related images …

Categories: cs.CL, cs.CV

When a Relation Tells More Than a Concept: Exploring and Evaluating Classifier Decisions with CoReX

Posted on August 7, 2024 by jarxiv

Summary: Based on the relevance of input pixels, convolutional neural network (CNN) …

Categories: cs.CV, cs.LG

DiffX: Guide Your Layout to Cross-Modal Generative Modeling

Posted on August 7, 2024 by jarxiv

Summary: Diffusion models have made great progress in language-driven and layout-driven image generation …

Categories: cs.CV

Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models

Posted on August 7, 2024 by jarxiv

Summary: Image-generating AI has attracted considerable attention in recent years. In particular, the core of recent generative AI …

Categories: cs.CV, cs.LG, physics.med-ph

Dilated Convolution with Learnable Spacings makes visual models more aligned with humans: a Grad-CAM study

Posted on August 7, 2024 by jarxiv

Summary: Dilated Convolution with Learnable …

Categories: cs.AI, cs.CV

IMAGDressing-v1: Customizable Virtual Dressing

Posted on August 7, 2024 by jarxiv

Summary: With the latest advances, through localized garment inpainting using latent diffusion models …

Categories: cs.CV

SimEndoGS: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians

Posted on August 7, 2024 by jarxiv

Summary: Surgical scene simulation, in surgical education and simulator-based robot learning …

Categories: cs.CV, cs.GR, cs.RO
