Archive of posts by "jarxiv"

Multi-view Structural Convolution Network for Domain-Invariant Point Cloud Recognition of Autonomous Vehicles

Posted: May 1, 2025 | Author: jarxiv

Summary: Point cloud representation has recently become a research hotspot in the field of computer vision …

Categories: cs.CV

Explorations of the Softmax Space: Knowing When the Neural Network Doesn’t Know

Posted: May 1, 2025 | Author: jarxiv

Summary: As artificial intelligence systems are deployed more widely in critical settings, neural network …

Categories: cs.CV, cs.LG

Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields

Posted: May 1, 2025 | Author: jarxiv

Summary: With the rapid development of AIGC foundation models, the image compression paradigm …

Categories: cs.CV

Early Exit and Multi Stage Knowledge Distillation in VLMs for Video Summarization

Posted: May 1, 2025 | Author: jarxiv

Summary: We introduce DEEVISum (Distilled Early Exit Vision language model for Summarization) …

Categories: cs.AI, cs.CV

ObjectFinder: An Open-Vocabulary Assistive System for Interactive Object Search by Blind People

Posted: May 1, 2025 | Author: jarxiv

Summary: Searching for objects in unfamiliar scenarios is a challenging task for blind people …

Categories: cs.CV, cs.HC

Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games

Posted: May 1, 2025 | Author: jarxiv

Summary: Video games serve as useful benchmarks for the decision-making community …

Categories: cs.AI, cs.CV, cs.LG

3D Stylization via Large Reconstruction Model

Posted: May 1, 2025 | Author: jarxiv

Summary: With the growing success of text- or image-guided 3D generators, …

Categories: cs.CV

Active Light Modulation to Counter Manipulation of Speech Visual Content

Posted: May 1, 2025 | Author: jarxiv

Summary: High-profile speech videos, because of their accessibility and influence, are prime targets for forgery …

Categories: cs.AI, cs.CR, cs.CV

Differentiable Room Acoustic Rendering with Multi-View Vision Priors

Posted: May 1, 2025 | Author: jarxiv

Summary: In creating realistic virtual environments, immersive acoustic experiences enabled by spatial audio …

Categories: cs.CV, cs.SD

COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning

Posted: May 1, 2025 | Author: jarxiv

Summary: Multimodal large language models (MLLMs) excel at simple vision-language tasks …

Categories: cs.CV
