Category archive: cs.AI

Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models

Posted: June 2, 2025 | Author: jarxiv

Abstract: Text-to-image generative models often … complex scenes, distinct visual …

Categories: cs.AI, cs.CV

DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics

Posted: June 2, 2025 | Author: jarxiv

Abstract: Dynamic hand gestures, particularly for individuals with mobility constraints and for operating …

Categories: cs.AI, cs.CV, cs.RO

VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software

Posted: June 2, 2025 | Author: jarxiv

Abstract: Computer-aided design (CAD) is a time-consuming and complex process, with complex …

Categories: cs.AI, cs.CV

Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck

Posted: June 2, 2025 | Author: jarxiv

Abstract: In this paper, state-of-the-art large language models (LLMs) … our visual world …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Time Blindness: Why Video-Language Models Can’t See What Humans Can?

Posted: June 2, 2025 | Author: jarxiv

Abstract: Recent advances in vision-language models (VLMs) in understanding spatio-temporal relationships in video …

Categories: cs.AI, cs.CV

ProxyThinker: Test-Time Guidance through Small Visual Reasoners

Posted: June 2, 2025 | Author: jarxiv

Abstract: Owing to recent advances in reinforcement learning with verifiable rewards, large vision-language models …

Categories: cs.AI, cs.CL, cs.CV

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Posted: June 2, 2025 | Author: jarxiv

Abstract: CAPTCHAs, for deploying web agents in real-world applications, …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Keyed Chaotic Masking: A Functional Privacy Framework for Neural Inference

Posted: June 2, 2025 | Author: jarxiv

Abstract: In this work, derived from encrypted chaotic dynamical systems, a deterministic and …

Categories: 37N25, 68T05, 94A60, cs.AI, cs.CR, D.4.6

Semantic Exploration and Dense Mapping of Complex Environments using Ground Robots Equipped with LiDAR and Panoramic Camera

Posted: May 30, 2025 | Author: jarxiv

Abstract: In this paper, a LiDAR-panoramic camera suite …

Categories: cs.AI, cs.RO

CoordField: Coordination Field for Agentic UAV Task Allocation In Low-altitude Urban Scenarios

Posted: May 30, 2025 | Author: jarxiv

Abstract: To perform complex tasks in urban environments, swarms of heterogeneous unmanned aerial vehicles (UAVs) …

Categories: cs.AI, cs.RO
