Archive of posts by "jarxiv"

Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait

Posted: May 8, 2025 | Author: jarxiv

Abstract: We address the problem of whole-body person recognition in unconstrained environments. This problem, altitude …

Categories: cs.CV

Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation

Posted: May 8, 2025 | Author: jarxiv

Abstract: Vision is well known for its use in manipulation, especially using visual servoing. …

Categories: cs.CV, cs.LG, cs.RO

On Path to Multimodal Generalist: General-Level and General-Bench

Posted: May 8, 2025 | Author: jarxiv

Abstract: The Multimodal Large Language Model (MLLM), L …

Categories: cs.CV

PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer

Posted: May 8, 2025 | Author: jarxiv

Abstract: Decomposing complex 3D shapes into simple geometric elements, which in human visual cognition plays an important …

Categories: cs.CV, cs.GR

EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning

Posted: May 8, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), text, vision, audio …

Categories: cs.AI, cs.CV, cs.MM, cs.SD, eess.AS

Vision-Language Models Create Cross-Modal Task Representations

Posted: May 8, 2025 | Author: jarxiv

Abstract: Autoregressive vision-language models (VLMs) handle many tasks within a single model …

Categories: cs.CL, cs.CV, cs.LG

Anant-Net: Breaking the Curse of Dimensionality with Scalable and Interpretable Neural Surrogate for High-Dimensional PDEs

Posted: May 8, 2025 | Author: jarxiv

Abstract: High-dimensional partial differential equations (PDEs), in diverse scientific and engineering applications …

Categories: cs.LG

Learning Survival Distributions with the Asymmetric Laplace Distribution

Posted: May 8, 2025 | Author: jarxiv

Abstract: Probabilistic survival analysis models, given a set of covariates, the future occurrence of an event (time …

Categories: cs.LG, math.ST, stat.TH

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Posted: May 8, 2025 | Author: jarxiv

Abstract: Reinforcement learning with verifiable rewards (RLVR), learning directly from outcome-based rewards …

Categories: cs.AI, cs.CL, cs.LG

Enhancing Target-unspecific Tasks through a Features Matrix

Posted: May 8, 2025 | Author: jarxiv

Abstract: Owing to recent developments in prompt learning for large vision-language models, target-specific …

Categories: cs.CL, cs.CV
