Category Archive: cs.CV

SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields

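Posted: August 14, 2024 | Author: jarxiv

Summary: As for the ability to extract object-centric abstractions from complex visual scenes, human-level … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments are closed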

Automatic Spatial Calibration of Near-Field MIMO Radar With Respect to Optical Depth Sensors

Posted: August 14, 2024 | Author: jarxiv

Summary: Despite growing interest in MIMO radar, optical depth sensors … Continue reading →

Categories: cs.CV, cs.RO | Comments are closed

Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions

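Posted: August 14, 2024 | Author: jarxiv

Summary: With the rapid evolution of deep learning and its integration with autonomous driving systems, multimo … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments are closed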

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Posted: August 14, 2024 | Author: jarxiv

Summary: Multimodal large language models (MLLMs), on various single-image … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments are closed

Enhancing Visual Dialog State Tracking through Iterative Object-Entity Alignment in Multi-Round Conversations

Posted: August 14, 2024 | Author: jarxiv

Summary: In Visual Dialog (VD), an agent, over multiple rounds of dialo … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed

Sumotosima: A Framework and Dataset for Classifying and Summarizing Otoscopic Images

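Posted: August 14, 2024 | Author: jarxiv

Summary: Otoscopy is a diagnostic procedure that uses an otoscope to examine the ear canal and eardrum. Infections, … Continue reading →

Categories: cs.CL, cs.CV | Comments are closed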

InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection

Posted: August 14, 2024 | Author: jarxiv

Summary: Sarcasm on social media conveyed through combinations of text and images … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed

Visual Neural Decoding via Improved Visual-EEG Semantic Consistency

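Posted: August 14, 2024 | Author: jarxiv

Summary: Visual neural decoding extracts and interprets the original visual experience from human brain activity … Continue reading →

Categories: cs.CV, cs.HC | Comments are closed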

DA-BEV: Unsupervised Domain Adaptation for Bird’s Eye View Perception

Posted: August 14, 2024 | Author: jarxiv

Summary: Camera-only bird's-eye view (BEV), in environmental perception in 3D space, great … Continue reading →

Categories: cs.CV | Comments are closed

Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

Posted: August 14, 2024 | Author: jarxiv

Summary: Token compression, by removing inattentive tokens or merging similar tokens, … Continue reading →

Categories: cs.CV | Comments are closed
