Monthly Archives: May 2023

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

Posted on May 25, 2023 by jarxiv

Abstract: This paper concerns efficiently interpreting medical images that contain important clinically relevant information, for which an important …

Categories: cs.CV

Rethinking the Evaluation Protocol of Domain Generalization

Posted on May 25, 2023 by jarxiv

Abstract: Domain generalization draws on common knowledge learned from multiple training domains …

Categories: cs.AI, cs.CV, cs.LG

PopulAtion Parameter Averaging (PAPA)

Posted on May 25, 2023 by jarxiv

Abstract: Ensemble methods combine the predictions of multiple models to improve performance …

Categories: cs.CV, cs.LG

Reversible Graph Neural Network-based Reaction Distribution Learning for Multiple Appropriate Facial Reactions Generation

Posted on May 25, 2023 by jarxiv

Abstract: Generating facial reactions in a human-human dyadic interaction is complex, and for a speaker's behavior, multiple …

Categories: 68T40, cs.CV

ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers

Posted on May 25, 2023 by jarxiv

Abstract: Recently, plain vision transformers (ViTs), with their strong modeling …

Categories: cs.CV

CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language

Posted on May 25, 2023 by jarxiv

Abstract: In recent work, the ability to generate and edit 3D shapes using natural language …

Categories: cs.AI, cs.CV

High Speed Human Action Recognition using a Photonic Reservoir Computer

Posted on May 25, 2023 by jarxiv

Abstract: Recognition of human actions in videos is among the most active research areas in computer vision …

Categories: cs.CV, cs.ET, physics.optics

Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective

Posted on May 25, 2023 by jarxiv

Abstract: In medical image segmentation, semantically similar and dissimilar samples …

Categories: cs.AI, cs.CV, cs.LG, eess.IV

ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors

Posted on May 25, 2023 by jarxiv

Abstract: Image inpainting involves filling in the missing regions of a corrupted image. Recently, remarkable …

Categories: cs.CV

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Posted on May 25, 2023 by jarxiv

Abstract: The recent popularity of text-to-image diffusion models (DMs) owes to the way DMs … users …

Categories: cs.AI, cs.CV, cs.LG
