"cs.CV" Category Archive

Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology

Posted on: May 15, 2025 | Author: jarxiv

Summary: In aerial visual object search (AVOS) tasks in urban environments, external guidance …

Categories: cs.AI, cs.CV

Enhancing Scene Coordinate Regression with Efficient Keypoint Detection and Sequential Information

Posted on: May 14, 2025 | Author: jarxiv

Summary: Scene coordinate regression (SCR) uses deep neural networks (DNNs) …

Categories: cs.CV, cs.RO

UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation

Posted on: May 14, 2025 | Author: jarxiv

Summary: The UAV-VLA (Visual-Language-Action) system …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression

Posted on: May 14, 2025 | Author: jarxiv

Summary: In this paper, an autoregressive model and a diffusion model for learning visuomotor policies …

Categories: cs.CV, cs.RO

DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control

Posted on: May 14, 2025 | Author: jarxiv

Summary: Enabling robots to perform diverse tasks across varied environments is, in robot …

Categories: cs.CV, cs.RO

TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation

Posted on: May 14, 2025 | Author: jarxiv

Summary: Vision-Language-Action (VLA) models, end-to- …

Categories: cs.CV, cs.RO

Towards Anytime Optical Flow Estimation with Event Cameras

Posted on: May 14, 2025 | Author: jarxiv

Summary: Event cameras respond to changes in log brightness at the millisecond level, and optical flow esti …

Categories: cs.CV, cs.RO, eess.IV

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Posted on: May 14, 2025 | Author: jarxiv

Summary: With the rapid development of multimodal large language models (MLLMs), these mod …

Categories: cs.AI, cs.CL, cs.CV

Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

Posted on: May 14, 2025 | Author: jarxiv

Summary: Charts are ubiquitous because they help people understand and reason about data. …

Categories: cs.CL, cs.CV

Dynamic Snake Upsampling Operater and Boundary-Skeleton Weighted Loss for Tubular Structure Segmentation

Posted on: May 14, 2025 | Author: jarxiv

Summary: Accurate segmentation of tubular topological structures (such as cracks and vasculature) is, across various …

Categories: cs.CV
