Category archive: cs.AI

StereoVAE: A lightweight stereo-matching system using embedded GPUs

Posted: June 11, 2025 | Author: jarxiv

Abstract: We present a lightweight system for stereo matching on embedded GPUs. …

Categories: cs.AI, cs.CV, cs.MM, cs.RO

Spatial Transcriptomics Expression Prediction from Histopathology Based on Cross-Modal Mask Reconstruction and Contrastive Learning

Posted: June 11, 2025 | Author: jarxiv

Abstract: Spatial transcriptomics captures gene expression levels at various spatial locations …

Categories: cs.AI, cs.CV

Product of Experts for Visual Generation

Posted: June 11, 2025 | Author: jarxiv

Abstract: Modern neural models capture rich priors across shared data domains …

Categories: cs.AI, cs.CV

Inherently Faithful Attention Maps for Vision Transformers

Posted: June 11, 2025 | Author: jarxiv

Abstract: Using learned binary attention masks, only the attended image regions influence the prediction …

Categories: cs.AI, cs.CV

Socratic-MCTS: Test-Time Visual Reasoning by Asking the Right Questions

Posted: June 11, 2025 | Author: jarxiv

Abstract: Recent research on vision-language models (VLMs) has, through distillation and reinforcement learning, language …

Categories: cs.AI, cs.CL, cs.CV

Segment Concealed Objects with Incomplete Supervision

Posted: June 11, 2025 | Author: jarxiv

Abstract: Incompletely-Supervised Concealed Object Segmentation (ISCOS) …

Categories: cs.AI, cs.CV, cs.LG

Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models

Posted: June 11, 2025 | Author: jarxiv

Abstract: Medical vision-language alignment through cross-modal contrastive learning enables retrieval and …

Categories: cs.AI, cs.CV, cs.LG

Diffuse and Disperse: Image Generation with Representation Regularization

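Posted: June 11, 2025 | Author: jarxiv

Abstract: Over the past decade, the development of diffusion-based generative models has proceeded largely independently of advances in representation learning …

Categories: cs.AI, cs.CV, cs.LG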

Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better

Posted: June 11, 2025 | Author: jarxiv

Abstract: For typical large vision-language models (LVLMs), the visual modality in learning …

Categories: cs.AI, cs.CL, cs.CV

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Posted: June 11, 2025 | Author: jarxiv

Abstract: In artificial intelligence, coordinating multiple embodied agents in dynamic environments is a central …

Categories: cs.AI, cs.CV, cs.RO

