"cs.CV" Category Archive

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Posted: August 16, 2024 | Author: jarxiv

Summary: Model merging is an efficient empowerment technique in the machine learning community …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Enhanced Scale-aware Depth Estimation for Monocular Endoscopic Scenes with Geometric Modeling

Posted: August 15, 2024 | Author: jarxiv

Summary: Scale-aware monocular depth estimation in computer-assisted endoscopic navigation …

Categories: cs.CV, cs.RO

On the Hidden Mystery of OCR in Large Multimodal Models

Posted: August 15, 2024 | Author: jarxiv

Summary: In natural language processing and multimodal vision-language learning, large models have recently taken a major …

Categories: cs.CL, cs.CV

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties

Posted: August 15, 2024 | Author: jarxiv

Summary: To build strong, interpretable segmentation models, this paper …

Categories: cs.CL, cs.CV, cs.LG

Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion

Posted: August 15, 2024 | Author: jarxiv

Summary: Visual Question Answering (VQA) …

Categories: cs.CL, cs.CV, cs.LG

ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2

Posted: August 15, 2024 | Author: jarxiv

Summary: Multimodal large language models (MLLMs) have drawn great attention for their versatility …

Categories: cs.AI, cs.CL, cs.CV

GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution

Posted: August 15, 2024 | Author: jarxiv

Summary: In previous work, Transformer-based single image super-resolution (SISR …

Categories: cs.CV, eess.IV

OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection

Posted: August 15, 2024 | Author: jarxiv

Summary: This paper proposes a new algorithm for video lane detection. First, …

Categories: cs.CV

Attention-Guided Perturbation for Unsupervised Image Anomaly Detection

Posted: August 15, 2024 | Author: jarxiv

Summary: Reconstruction-based methods have significantly advanced modern unsupervised anomaly detection. …

Categories: cs.CV

Integrating Representational Gestures into Automatically Generated Embodied Explanations and its Effects on Understanding and Interaction Quality

Posted: August 15, 2024 | Author: jarxiv

Summary: In human interaction, gestures mark the rhythm of the conversation, key elements …

Categories: cs.CV, cs.GR, cs.HC, cs.SD, eess.AS
