"cs.CV" Category Archive

S2S-Net: Addressing the Domain Gap of Heterogeneous Sensor Systems in LiDAR-Based Collective Perception

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Collective perception (CP), intended to overcome the limitations of individual perception in the context of autonomous driving, …

Categories: cs.CV, cs.RO

M-MRE: Extending the Mutual Reinforcement Effect to Multimodal Information Extraction

Posted on: April 25, 2025 | Author: jarxiv

Abstract: The Mutual Reinforcement Effect (MRE), at the intersection of information extraction and model interpretability, is an emerging …

Categories: cs.CL, cs.CV, cs.MM

TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Soccer is a globally popular sporting event, typically with long matches and distinctive …

Categories: cs.CL, cs.CV

FRAG: Frame Selection Augmented Generation for Long Video and Long Document Understanding

Posted on: April 25, 2025 | Author: jarxiv

Abstract: There has been impressive progress in large multimodal models (LMMs). Recent …

Categories: cs.AI, cs.CV

Enhanced Sample Selection with Confidence Tracking: Identifying Correctly Labeled yet Hard-to-Learn Samples in Noisy Data

Posted on: April 25, 2025 | Author: jarxiv

Abstract: A novel sample selection method for image classification in the presence of noisy labels …

Categories: cs.AI, cs.CV

Disentangling Visual Transformers: Patch-level Interpretability for Image Classification

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Visual transformers have achieved remarkable performance in image classification tasks, but …

Categories: cs.AI, cs.CV

Latent Representations for Visual Proprioception in Inexpensive Robots

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Robotic manipulation requires explicit or implicit knowledge of the robot's joint positions …

Categories: cs.CV, cs.RO

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Subject-driven text-to-image (T2I) generation …

Categories: cs.CV

Mamba-Sea: A Mamba-based Framework with Global-to-Local Sequence Augmentation for Generalizable Medical Image Segmentation

Posted on: April 25, 2025 | Author: jarxiv

Abstract: To segment medical images under distribution shift, domain generalization (DG) …

Categories: cs.CV

Towards One-Stage End-to-End Table Structure Recognition with Parallel Regression for Diverse Scenarios

Posted on: April 25, 2025 | Author: jarxiv

Abstract: Table structure recognition, which renders tables in unstructured data in a machine-understandable form …

Categories: cs.CV
