Archive of posts by "jarxiv"

ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos

Posted: April 18, 2025 | Author: jarxiv

Abstract: In the perception of a human-centric 3D world, from a single monocular in-the-wild video …

Categories: cs.CV, I.4.5

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Posted: April 18, 2025 | Author: jarxiv

Abstract: Vision-language models (VLMs) excel at visual understanding, yet visual hallucination …

Categories: cs.CV

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Posted: April 18, 2025 | Author: jarxiv

Abstract: For spatial expressions in situated communication, speakers and listeners adopt …

Categories: cs.CL, cs.CV

IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design

Posted: April 18, 2025 | Author: jarxiv

Abstract: This paper concerns high-fidelity … with precise control over silhouette, color, and logo placement …

Categories: cs.CV

Single-Shot Shape and Reflectance with Spatial Polarization Multiplexing

Posted: April 18, 2025 | Author: jarxiv

Abstract: Spatial polarization multiplexing for reconstructing the shape and reflectance of an object from a single polarimetric image …

Categories: cs.CV

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Posted: April 18, 2025 | Author: jarxiv

Abstract: Vision-language models are integral to computer vision research, yet many high-performing …

Categories: cs.AI, cs.CV, cs.LG

ViTa-Zero: Zero-shot Visuotactile Object 6D Pose Estimation

Posted: April 18, 2025 | Author: jarxiv

Abstract: Object 6D pose estimation is a critical challenge in robotics, particularly for manipulation tasks …

Categories: cs.CV, cs.RO

Perception Encoder: The best visual embeddings are not at the output of the network

Posted: April 18, 2025 | Author: jarxiv

Abstract: Trained through simple vision-language learning for image and video understanding, the most …

Categories: cs.CV

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Posted: April 18, 2025 | Author: jarxiv

Abstract: This paper concerns the category-level … of articulated objects in robotic manipulation tasks …

Categories: cs.CV, cs.RO

Bayesian dynamic borrowing considering semantic similarity between outcomes for disproportionality analysis in FAERS

Posted: April 18, 2025 | Author: jarxiv

Abstract: Enhancing the quantitative identification of adverse events (AEs) in spontaneous reporting systems (SRSs) …

Categories: cs.CL, G.3
