Monthly Archives: July 2024

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models

Posted on July 24, 2024 by jarxiv

Abstract: Image-based 3D virtual try-on (VTON), guided by images of a person and clothing, …

Categories: cs.CV

Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?

Posted on July 24, 2024 by jarxiv

Abstract: This paper presents a comprehensive study, and 4D and/or 3D tensor …

Categories: cs.AI, cs.CV

Laplacian Segmentation Networks Improve Epistemic Uncertainty Quantification

Posted on July 24, 2024 by jarxiv

Abstract: Image segmentation, particularly when making predictions on out-of-distribution (OOD) images, …

Categories: cs.CV, cs.LG

Defending Our Privacy With Backdoors

Posted on July 24, 2024 by jarxiv

Abstract: On uncurated, often sensitive web-scraped data, …

Categories: cs.CL, cs.CR, cs.CV, cs.LG

Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models

Posted on July 24, 2024 by jarxiv

Abstract: Vision-language models (VLMs), in visual question answering and image captioning, …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Position: AI/ML Influencers Have a Place in the Academic Process

Posted on July 24, 2024 by jarxiv

Abstract: As the number of papers accepted at AI and ML conferences reaches into the thousands, …

Categories: cs.AI, cs.CL, cs.CV, cs.DL, cs.LG, cs.SI

A Diffusion Model for Simulation Ready Coronary Anatomy with Morpho-skeletal Control

Posted on July 24, 2024 by jarxiv

Abstract: With virtual interventions, physics-based simulation of device deployment in coronary arteries …

Categories: cs.CV, eess.IV

QPT V2: Masked Image Modeling Advances Visual Scoring

Posted on July 24, 2024 by jarxiv

Abstract: Quality assessment and aesthetics assessment, evaluating the perceived quality and aesthetics of visual content, …

Categories: cs.CV, cs.MM

End-to-End Video Question Answering with Frame Scoring Mechanisms and Adaptive Sampling

Posted on July 24, 2024 by jarxiv

Abstract: Video Question Answering (VideoQA) …

Categories: cs.CL, cs.CV

MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues

Posted on July 24, 2024 by jarxiv

Abstract: Multimodal large language models (MLLMs), given the visual, acoustic, and … in a video

Categories: cs.CV, cs.MM
