"cs.AI" Category Archive

Exploring Perceptual Limitation of Multimodal Large Language Models

Posted: February 13, 2024 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs) have recently, in answering visual questions, …

Categories: cs.AI, cs.CV, cs.LG

StyleLipSync: Style-based Personalized Lip-sync Video Generation

Posted: February 13, 2024 | Author: jarxiv

Abstract: In this paper, identity-agnostic lip-sync from arbitrary audio …

Categories: cs.AI, cs.CV, cs.LG

TriAug: Out-of-Distribution Detection for Robust Classification of Imbalanced Breast Lesion in Ultrasound

Posted: February 13, 2024 | Author: jarxiv

Abstract: The incidence of different diseases, such as the histological subtypes of breast lesions, varies widely …

Categories: cs.AI, cs.CV, cs.LG

SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks

Posted: February 13, 2024 | Author: jarxiv

Abstract: To improve the efficiency of state-of-the-art methods in semantic segmentation, …

Categories: cs.AI, cs.CV

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models

Posted: February 13, 2024 | Author: jarxiv

Abstract: With recent advances in large vision-language models (LVLMs), human language …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

AYDIV: Adaptable Yielding 3D Object Detection via Integrated Contextual Vision Transformer

Posted: February 13, 2024 | Author: jarxiv

Abstract: By combining LiDAR and camera data, in autonomous driving systems …

Categories: cs.AI, cs.CV, cs.RO

PBADet: A One-Stage Anchor-Free Approach for Part-Body Association

Posted: February 13, 2024 | Author: jarxiv

Abstract: Detecting human body parts (hands, faces, etc.) and correctly associating them with individuals is …

Categories: cs.AI, cs.CV

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

Posted: February 13, 2024 | Author: jarxiv

Abstract: Visually-conditioned language models (VLMs), in visual dialogue, scene understanding, robo …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Detection of Spider Mites on Labrador Beans through Machine Learning Approaches Using Custom Datasets

Posted: February 13, 2024 | Author: jarxiv

Abstract: As demand for food production grows, early detection of plant diseases to protect crops …

Categories: cs.AI, cs.CV, cs.RO

Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss

Posted: February 13, 2024 | Author: jarxiv

Abstract: We … improve the efficiency of few-shot policy learning in sequential decision-making tasks …

Categories: cs.AI, cs.LG, cs.RO
