Monthly Archives: April 2024

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Posted on: April 1, 2024 | Author: jarxiv

Summary: In this paper, multi-view reflectance obtained through photometric stereo …

Categories: cs.CV

Learn ‘No’ to Say ‘Yes’ Better: Improving Vision-Language Models via Negations

Posted on: April 1, 2024 | Author: jarxiv

Summary: Existing vision-language models (VLMs) treat text descriptions as a single unit …

Categories: cs.CV

LipSim: A Provably Robust Perceptual Similarity Metric

Posted on: April 1, 2024 | Author: jarxiv

Summary: Recent years have seen growing interest in developing and applying perceptual similarity metrics. Research …

Categories: cs.CV, cs.LG

Learning to Count without Annotations

Posted on: April 1, 2024 | Author: jarxiv

Summary: Recent supervised methods for reference-based object counting, benchmark …

Categories: cs.CV

Convolutional Prompting meets Language Models for Continual Learning

Posted on: April 1, 2024 | Author: jarxiv

Summary: With continual learning (CL), even when no data from old tasks is available, new …

Categories: cs.CV

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Posted on: April 1, 2024 | Author: jarxiv

Summary: Monocular 3D detectors achieve strong performance on cars and small objects …

Categories: cs.AI, cs.CV

SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks

Posted on: April 1, 2024 | Author: jarxiv

Summary: To improve the efficiency of state-of-the-art methods in semantic segmentation …

Categories: cs.AI, cs.CV

MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning

Posted on: April 1, 2024 | Author: jarxiv

Summary: Models pretrained on large-scale datasets, to a variety of downstream tasks …

Categories: cs.AI, cs.CV, cs.LG

Language Model Beats Diffusion — Tokenizer is Key to Visual Generation

Posted on: April 1, 2024 | Author: jarxiv

Summary: Large language models (LLMs) are the dominant models for generative tasks in language, but …

Categories: cs.AI, cs.CV, cs.MM

Gromov-Wasserstein-like Distances in the Gaussian Mixture Models Space

Posted on: April 1, 2024 | Author: jarxiv

Summary: The Gromov-Wasserstein (GW) distance, for distributions across different metric spaces …

Categories: cs.CV, cs.LG, stat.ML

