"cs.AI" Category Archive

Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Recently, Large Vision-Language Models (LVLMs) …

Categories: cs.AI, cs.CV

Abstracted Gaussian Prototypes for One-Shot Concept Learning

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Based on one-shot learning inspired by the Omniglot challenge …

Categories: cs.AI, cs.CV

VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Foundation models have emerged as a promising approach to time series forecasting (TSF) …

Categories: cs.AI, cs.CV

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

Posted on: September 2, 2024 | Author: jarxiv

Abstract: In recent evaluations of large multimodal models (LMMs), various domains …

Categories: cs.AI, cs.CV

CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Depth estimation is critical for accurately interpreting 3D scenes in autonomous driving. …

Categories: cs.AI, cs.CV, eess.SP

Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering

Posted on: September 2, 2024 | Author: jarxiv

Abstract: The use of transformer-based models is growing rapidly across society. With this growth, …

Categories: (Primary), 68T50, cs.AI, cs.CL, cs.CV, cs.LG, I.2.4

A Permuted Autoregressive Approach to Word-Level Recognition for Urdu Digital Text

Posted on: September 2, 2024 | Author: jarxiv

Abstract: In this research paper, specifically designed for digital Urdu text, a new …

Categories: cs.AI, cs.CV

Open-vocabulary Temporal Action Localization using VLMs

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Video action localization deals with specific actions in long videos …

Categories: cs.AI, cs.CV, cs.RO

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane

Posted on: September 2, 2024 | Author: jarxiv

Abstract: We can generate semantic-compositional 3D scenes in a single pass with …

Categories: cs.AI, cs.CV, cs.GR

Bridging Episodes and Semantics: A Novel Framework for Long-Form Video Understanding

Posted on: September 2, 2024 | Author: jarxiv

Abstract: Existing research often treats long-form videos as extended short videos …

Categories: cs.AI, cs.CL, cs.CV
