"cs.AI" Category Archive

Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion

Posted: May 27, 2025 | Author: jarxiv

Abstract: Diffusion models have become the mainstream architecture for text-to-image generation, and …

Categories: cs.AI, cs.CL, cs.CV, cs.MM

AdaTP: Attention-Debiased Token Pruning for Video Large Language Models

Posted: May 27, 2025 | Author: jarxiv

Abstract: Video large language models (Video LLMs) have achieved remarkable results on video understanding tasks …

Categories: cs.AI, cs.CV

Improvement Strategies for Few-Shot Learning in OCT Image Classification of Rare Retinal Diseases

Posted: May 27, 2025 | Author: jarxiv

Abstract: This paper uses few-shot learning to classify OCT diagnostic images into major and …

Categories: cs.AI, cs.CV, eess.IV

Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models

Posted: May 27, 2025 | Author: jarxiv

Abstract: Benefiting from visual encoders contrastively trained on large-scale natural scene images, large …

Categories: cs.AI, cs.CL, cs.CV

EVM-Fusion: An Explainable Vision Mamba Architecture with Neural Algorithmic Fusion

Posted: May 27, 2025 | Author: jarxiv

Abstract: Medical image classification is important for clinical decision-making, but accuracy, interpretability, and generalization …

Categories: cs.AI, cs.CV

Open the Eyes of MPNN: Vision Enhances MPNN in Link Prediction

Posted: May 27, 2025 | Author: jarxiv

Abstract: Message-passing graph neural networks (MPNNs) and structural features (SF …

Categories: cs.AI, cs.CV, cs.LG

DTRT: Enhancing Human Intent Estimation and Role Allocation for Physical Human-Robot Collaboration

Posted: May 27, 2025 | Author: jarxiv

Abstract: In physical human-robot collaboration (PHRC), accurate human intent …

Categories: cs.AI, cs.RO

LiloDriver: A Lifelong Learning Framework for Closed-loop Motion Planning in Long-tail Autonomous Driving Scenarios

Posted: May 26, 2025 | Author: jarxiv

Abstract: Recent advances in autonomous driving research toward robust, safe, and adaptive motion planners …

Categories: 68T05, cs.AI, cs.RO, I.2.6

Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets

Posted: May 26, 2025 | Author: jarxiv

Abstract: Imitation learning, as a promising approach for building generalist robots, …

Categories: cs.AI, cs.LG, cs.RO

Bootstrapping Imitation Learning for Long-horizon Manipulation via Hierarchical Data Collection Space

Posted: May 26, 2025 | Author: jarxiv

Abstract: Imitation learning (IL) with human demonstrations is a promising method for robotic manipulation tasks. …

Categories: cs.AI, cs.RO
