"cs.CV" Category Archive

A Survey of Embodied Learning for Object-Centric Robotic Manipulation

Posted: August 22, 2024 | Author: jarxiv

Abstract: In embodied AI, embodied learning for object-centric robotic manipulation …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

RaNDT SLAM: Radar SLAM Based on Intensity-Augmented Normal Distributions Transform

Posted: August 22, 2024 | Author: jarxiv

Abstract: In rescue robotics, unstructured and potentially vision-denied …

Categories: cs.CV, cs.RO, eess.SP

ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2

Posted: August 22, 2024 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs) have drawn considerable attention for their versatility …

Categories: cs.AI, cs.CL, cs.CV

Self-Supervised Visual Preference Alignment

Posted: August 22, 2024 | Author: jarxiv

Abstract: Toward unsupervised preference alignment in vision-language models (VLMs), this paper …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Posted: August 22, 2024 | Author: jarxiv

Abstract: Model merging is an efficient empowerment technique in the machine learning community …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition

Posted: August 22, 2024 | Author: jarxiv

Abstract: Grounded Multimodal Named Entity Reco …

Categories: cs.AI, cs.CL, cs.CV, cs.IR

FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance

Posted: August 22, 2024 | Author: jarxiv

Abstract: CLIP, on a large-scale dataset consisting of paired image-text data, …

Categories: cs.CV

Hierarchical Salient Patch Identification for Interpretable Fundus Disease Localization

Posted: August 22, 2024 | Author: jarxiv

Abstract: With the growing application of deep learning techniques in medical image analysis, model predictions …

Categories: cs.CV

A New Chinese Landscape Paintings Generation Model based on Stable Diffusion using DreamBooth

Posted: August 22, 2024 | Author: jarxiv

Abstract: This study focuses mainly on the Stable Diffusion Model (SDM) for generating Chinese landscape paintings …

Categories: cs.CV

ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context

Posted: August 22, 2024 | Author: jarxiv

Abstract: In visual storytelling, while maintaining consistency of characters and scenes, …

Categories: cs.AI, cs.CV, cs.MM
