Category archive: cs.AI

PIF: Anomaly detection via preference embedding

Posted on May 16, 2025 by jarxiv

Abstract: We address the problem of detecting anomalies with respect to structured patterns. To this end … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, stat.ML | Comments closed

Vision language models have difficulty recognizing virtual objects

Posted on May 16, 2025 by jarxiv

Abstract: Vision Language Models (VLMs) are Multimo … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

SEAL: Searching Expandable Architectures for Incremental Learning

Posted on May 16, 2025 by jarxiv

Abstract: Incremental learning, in which a model learns from a continuous stream of tasks, is a machine learning … Continue reading →

Categories: 68T07, cs.AI, cs.CV, cs.LG | Comments closed

UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation

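Posted on May 16, 2025 by jarxiv

Abstract: The emergence of unified multimodal understanding and generation models, while minimizing model redundancy … Continue reading →

Categories: cs.AI, cs.CV | Comments closed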

Multi-Token Prediction Needs Registers

Posted on May 16, 2025 by jarxiv

Abstract: For improving language model pretraining, multi-token prediction is a promising … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments closed

Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data

Posted on May 16, 2025 by jarxiv

Abstract: With the development of photorealistic diffusion models, training partially or fully on synthetic data … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Posted on May 16, 2025 by jarxiv

Abstract: Widely used for training large multimodal models, natural language image … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments closed

FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering

Posted on May 16, 2025 by jarxiv

Abstract: In this paper, \underline{a}ncial \underl … Continue reading →

Categories: cs.AI, cs.CL | Comments closed

Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model

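Posted on May 16, 2025 by jarxiv

Abstract: Knowledge in materials science is widely dispersed across the extensive scientific literature, and for new materials the efficient … Continue reading →

Categories: cs.AI, cs.CL | Comments closed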

RT-cache: Efficient Robot Trajectory Retrieval System

Posted on May 15, 2025 by jarxiv

Abstract: In this paper, by leveraging big-data retrieval to learn from experience … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments closed
