"cs.AI" Category Archive

Building-road Collaborative Extraction from Remotely Sensed Images via Cross-Interaction

Posted on April 11, 2024 by jarxiv

Abstract: Buildings are the fundamental carriers of social production and human life. Roads are social net …

Categories: cs.AI, cs.CV

Data-Efficient Multimodal Fusion on a Single GPU

Posted on April 11, 2024 by jarxiv

Abstract: The goal of multimodal alignment is a single … shared across multimodal inputs

Categories: cs.AI, cs.CV, cs.LG

Location-guided Head Pose Estimation for Fisheye Image

Posted on April 11, 2024 by jarxiv

Abstract: Cameras equipped with a fisheye or ultra-wide-angle lens … that cannot be modeled by perspective projection

Categories: cs.AI, cs.CV

Understanding Video Transformers via Universal Concept Discovery

Posted on April 11, 2024 by jarxiv

Abstract: This paper studies the problem of concept-based interpretability of transformer representations for video …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Measuring proximity to standard planes during fetal brain ultrasound scanning

Posted on April 11, 2024 by jarxiv

Abstract: In this paper, more effective navigation to standard planes (SPs) in the fetal brain …

Categories: cs.AI, cs.CV, I.2.0

Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces

Posted on April 11, 2024 by jarxiv

Abstract: By generating explanations for predictions, Explainable AI … neu …

Categories: cs.AI, cs.CV, cs.LG

RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion

Posted on April 11, 2024 by jarxiv

Abstract: Rea …, a technique for generating general forward-facing 3D scenes from text descriptions

Categories: cs.AI, cs.CV, cs.GR, cs.LG

UMBRAE: Unified Multimodal Decoding of Brain Signals

Posted on April 11, 2024 by jarxiv

Abstract: We … that in the literature accurate spatial information is rarely recovered, and subject-specific mod …

Categories: cs.AI, cs.CL, cs.CV

BRAVE: Broadening the visual encoding of vision-language models

Posted on April 11, 2024 by jarxiv

Abstract: Vision-language models (VLMs) typically consist of a vision encoder …

Categories: cs.AI, cs.CV, cs.LG

GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models

Posted on April 11, 2024 by jarxiv

Abstract: In this paper, G …, a novel approach that improves the stability and image quality of drag editing

Categories: cs.AI, cs.CV, cs.GR, cs.LG, cs.MM

