Coastal Underwater Evidence Search System with Surface-Underwater Collaboration




The coastal underwater evidence search system with surface-underwater collaboration is designed to revolutionize the search for artificial objects in coastal underwater environments, overcoming limitations associated with traditional methods such as divers and tethered remotely operated vehicles. Our multi-robot collaborative system consists of three parts: an autonomous surface vehicle that serves as the mission control center, a towed underwater vehicle for wide-area search, and a biomimetic underwater robot inspired by marine organisms for detailed inspection of identified areas. We conduct extensive simulations and real-world experiments in pond environments and coastal fields to demonstrate the system's potential to surpass the limitations of conventional underwater search methods, offering a robust and efficient solution for law enforcement and recovery operations in marine settings.


Authors: Hin Wang Lin, Pengyu Wang, Zhaohua Yang, Ka Chun Leung, Fangming Bao, Ka Yu Kui, Jian Xiang Erik Xu, Ling Shi
Published: 2024-10-03 09:57:19+00:00
arXiv page: arxiv_id (pdf)

Source, service used, DeepL

Categories: cs.RO

Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks




Safe and successful deployment of robots requires not only the ability to generate complex plans but also the capacity to frequently replan and correct execution errors. This paper addresses the challenge of long-horizon trajectory planning under temporally extended objectives in a receding horizon manner. To this end, we propose DOPPLER, a data-driven hierarchical framework that generates and updates plans based on instructions specified in linear temporal logic (LTL). Our method decomposes temporal tasks into a chain of options with hierarchical reinforcement learning from offline non-expert datasets. It leverages diffusion models to generate options with low-level actions. We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion-generated options, leading to more efficient querying. Experiments on robot navigation and manipulation tasks demonstrate that DOPPLER can generate sequences of trajectories that progressively satisfy the specified formulae for obstacle avoidance and sequential visitation. Demonstration videos are available online at:


Authors: Zeyu Feng, Hao Luan, Kevin Yuchen Ma, Harold Soh
Published: 2024-10-03 11:10:37+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.AI, cs.LG, cs.RO

RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation




We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object. The scalable action space of RiEMann facilitates the addition of custom equivariant actions such as the direction of turning the faucet, which makes articulated object manipulation possible for RiEMann. In simulation and real-world 6-DOF robot manipulation experiments, we test RiEMann on 5 categories of manipulation tasks with a total of 25 variants and show that RiEMann outperforms baselines in both task success rates and SE(3) geodesic distance errors on predicted poses (reduced by 68.6%), and achieves a 5.4 frames per second (FPS) network inference speed. Code and video results are available at


Authors: Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu
Published: 2024-10-03 11:13:29+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.AI, cs.RO

PointNetPGAP-SLC: A 3D LiDAR-based Place Recognition Approach with Segment-level Consistency Training for Mobile Robots in Horticulture




3D LiDAR-based place recognition remains largely underexplored in horticultural environments, which present unique challenges due to their semi-permeable nature to laser beams. This characteristic often results in highly similar LiDAR scans from adjacent rows, leading to descriptor ambiguity and, consequently, compromised retrieval performance. In this work, we address the challenges of 3D LiDAR place recognition in horticultural environments, particularly focusing on inter-row ambiguity by introducing three key contributions: (i) a novel model, PointNetPGAP, which combines the outputs of two statistically-inspired aggregators into a single descriptor; (ii) a Segment-Level Consistency (SLC) model, used exclusively during training to enhance descriptor robustness; and (iii) the HORTO-3DLM dataset, comprising LiDAR sequences from orchards and strawberry fields. Experimental evaluations conducted on the HORTO-3DLM and KITTI Odometry datasets demonstrate that PointNetPGAP outperforms state-of-the-art models, including OverlapTransformer and PointNetVLAD, particularly when the SLC model is applied. These results underscore the model’s superiority, especially in horticultural environments, by significantly improving retrieval performance in segments with higher ambiguity.


Authors: T. Barros, L. Garrote, P. Conde, M. J. Coombes, C. Liu, C. Premebida, U. J. Nunes
Published: 2024-10-03 13:09:43+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.RO

Behavior Trees in Functional Safety Supervisors for Autonomous Vehicles




The rapid advancements in autonomous vehicle software present both opportunities and challenges, especially in enhancing road safety. The primary objective of autonomous vehicles is to reduce accident rates through improved safety measures. However, the integration of new algorithms into the autonomous vehicle, such as Artificial Intelligence methods, raises concerns about compliance with established safety regulations. This paper introduces a novel software architecture based on behavior trees, aligned with established standards and designed to supervise vehicle functional safety in real time. It specifically addresses the integration of algorithms into industrial road vehicles in adherence to ISO 26262. The proposed supervision methodology involves the detection of hazards and compliance with functional and technical safety requirements when a hazard arises. This methodology, implemented in this study in a Renault Mégane (currently at SAE level 3 of automation), not only guarantees compliance with safety standards, but also paves the way for safer and more reliable autonomous driving technologies.
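
To make the idea of a behavior-tree supervisor concrete, the following is a minimal sketch of a tree with Sequence and Fallback nodes. It is not the architecture from the paper: the node classes, the `build_supervisor` helper, the state keys (`sensor_fault`, `speed`, `speed_limit`) and the `safe_stop` reaction are all hypothetical names chosen for illustration, and the usual RUNNING status is omitted for brevity.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2


class Leaf:
    """Wraps a zero-argument callable that returns True (success) or False."""

    def __init__(self, name: str, fn: Callable[[], bool]):
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return Status.SUCCESS if self.fn() else Status.FAILURE


class Sequence:
    """Succeeds only if every child succeeds; stops at the first failure."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Fallback:
    """Succeeds as soon as one child succeeds; fails if all children fail."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


def build_supervisor(state: dict, log: list) -> Fallback:
    """Hypothetical supervisor: run nominal checks, else trigger a safe stop."""

    def safe_stop() -> bool:
        log.append("safe_stop")  # stand-in for a real safety reaction
        return True

    nominal = Sequence([
        Leaf("sensors_ok", lambda: not state["sensor_fault"]),
        Leaf("speed_ok", lambda: state["speed"] <= state["speed_limit"]),
    ])
    return Fallback([nominal, Leaf("safe_stop", safe_stop)])
```

Ticking the tree once per control cycle evaluates the nominal checks first; the fallback branch runs only when one of them fails, which mirrors the "detect a hazard, then enforce a safety requirement" pattern described in the abstract.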


Authors: Carlos Conejo, Vicenç Puig, Bernardo Morcego, Francisco Navas, Vicente Milanés
Published: 2024-10-03 13:19:38+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.RO, cs.SY, eess.SY

Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping




Universal dexterous grasping across diverse objects presents a fundamental yet formidable challenge in robot learning. Existing approaches using reinforcement learning (RL) to develop policies on extensive object datasets face critical limitations, including complex curriculum design for multi-task learning and limited generalization to unseen objects. To overcome these challenges, we introduce ResDex, a novel approach that integrates residual policy learning with a mixture-of-experts (MoE) framework. ResDex is distinguished by its use of geometry-unaware base policies that are efficiently acquired on individual objects and capable of generalizing across a wide range of unseen objects. Our MoE framework incorporates several base policies to facilitate diverse grasping styles suitable for various objects. By learning residual actions alongside weights that combine these base policies, ResDex enables efficient multi-task RL for universal dexterous grasping. ResDex achieves state-of-the-art performance on the DexGraspNet dataset comprising 3,200 objects with an 88.8% success rate. It exhibits no generalization gap with unseen objects and demonstrates superior training efficiency, mastering all tasks within only 12 hours on a single GPU.
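
The abstract describes learning residual actions together with weights that combine several base policies. One plausible reading can be sketched as follows; the softmax normalization of the weights and the purely additive residual are assumptions made here for illustration, and `combine_actions` is a hypothetical helper, not the ResDex implementation.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def combine_actions(base_actions, weight_logits, residual) -> np.ndarray:
    """Mix K base-policy actions and add a learned residual.

    base_actions:  (K, D) actions proposed by K frozen base policies
    weight_logits: (K,)   unnormalized mixing weights (assumed learned)
    residual:      (D,)   correction added to the mixture (assumed learned)
    """
    base_actions = np.asarray(base_actions, dtype=float)
    weights = softmax(np.asarray(weight_logits, dtype=float))
    return weights @ base_actions + np.asarray(residual, dtype=float)
```

With equal logits the mixture is a plain average of the base actions, so the residual alone determines how far the final action departs from that average.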


Authors: Ziye Huang, Haoqi Yuan, Yuhui Fu, Zongqing Lu
Published: 2024-10-03 13:33:02+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.LG, cs.RO

Learning Diverse Bimanual Dexterous Manipulation Skills from Human Demonstrations




Bimanual dexterous manipulation is a critical yet underexplored area in robotics. Its high-dimensional action space and inherent task complexity present significant challenges for policy learning, and the limited task diversity in existing benchmarks hinders general-purpose skill development. Existing approaches largely depend on reinforcement learning, often constrained by intricately designed reward functions tailored to a narrow set of tasks. In this work, we present a novel approach for efficiently learning diverse bimanual dexterous skills from abundant human demonstrations. Specifically, we introduce BiDexHD, a framework that unifies task construction from existing bimanual datasets and employs teacher-student policy learning to address all tasks. The teacher learns state-based policies using a general two-stage reward function across tasks with shared behaviors, while the student distills the learned multi-task policies into a vision-based policy. With BiDexHD, scalable learning of numerous bimanual dexterous skills from auto-constructed tasks becomes feasible, offering promising advances toward universal bimanual dexterous manipulation. Our empirical evaluation on the TACO dataset, spanning 141 tasks across six categories, demonstrates a task fulfillment rate of 74.59% on trained tasks and 51.07% on unseen tasks, showcasing the effectiveness and competitive zero-shot generalization capabilities of BiDexHD. For videos and more information, visit our project page


Authors: Bohan Zhou, Haoqi Yuan, Yuhui Fu, Zongqing Lu
Published: 2024-10-03 13:35:15+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.LG, cs.RO

Cross-Embodiment Dexterous Grasping with Reinforcement Learning




Dexterous hands exhibit significant potential for complex real-world grasping tasks. While recent studies have primarily focused on learning policies for specific robotic hands, the development of a universal policy that controls diverse dexterous hands remains largely unexplored. In this work, we study the learning of cross-embodiment dexterous grasping policies using reinforcement learning (RL). Inspired by the capability of human hands to control various dexterous hands through teleoperation, we propose a universal action space based on the human hand’s eigengrasps. The policy outputs eigengrasp actions that are then converted into specific joint actions for each robot hand through a retargeting mapping. We simplify the robot hand’s proprioception to include only the positions of fingertips and the palm, offering a unified observation space across different robot hands. Our approach demonstrates an 80% success rate in grasping objects from the YCB dataset across four distinct embodiments using a single vision-based policy. Additionally, our policy exhibits zero-shot generalization to two previously unseen embodiments and significant improvement in efficient finetuning. For further details and videos, visit our project page
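
Eigengrasps are commonly obtained as the principal components of recorded hand joint configurations. The sketch below assumes that construction and a linear retargeting matrix; both are illustrative assumptions rather than the method of the paper, and `fit_eigengrasps`, `decode_action` and `retarget` are hypothetical names.

```python
import numpy as np


def fit_eigengrasps(hand_poses: np.ndarray, k: int):
    """Return the mean pose and the top-k principal directions (k, J).

    hand_poses: (N, J) array of N recorded human-hand joint configurations.
    """
    mean = hand_poses.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(hand_poses - mean, full_matrices=False)
    return mean, vt[:k]


def decode_action(action: np.ndarray, mean: np.ndarray,
                  components: np.ndarray) -> np.ndarray:
    """Map a k-dimensional eigengrasp action to a J-dimensional human-hand pose."""
    return mean + action @ components


def retarget(human_pose: np.ndarray, mapping: np.ndarray) -> np.ndarray:
    """Hypothetical linear retargeting from human joints (J) to robot joints (R)."""
    return mapping @ human_pose
```

Under these assumptions, a single policy emits the low-dimensional action and each robot hand supplies only its own retargeting map, which is what allows one action space to be shared across embodiments.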


Authors: Haoqi Yuan, Bohan Zhou, Yuhui Fu, Zongqing Lu
Published: 2024-10-03 13:36:02+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.LG, cs.RO

A Causal Bayesian Network and Probabilistic Programming Based Reasoning Framework for Robot Manipulation Under Uncertainty




Robot object manipulation in real-world environments is challenging because robot operation must be robust to a range of sensing, estimation, and actuation uncertainties to avoid potentially unsafe and costly mistakes that are a barrier to their adoption. In this paper, we propose a flexible and generalisable physics-informed causal Bayesian network (CBN) based framework for a robot to probabilistically reason about candidate manipulation actions, to enable robot decision-making robust to arbitrary robot system uncertainties — the first of its kind to use a probabilistic programming language implementation. Using experiments in high-fidelity Gazebo simulation of an exemplar block stacking task, we demonstrate our framework’s ability to: (1) predict manipulation outcomes with high accuracy (Pred Acc: 88.6%); and, (2) perform greedy next-best action selection with 94.2% task success rate. We also demonstrate our framework’s suitability for real-world robot systems with a domestic robot. Thus, we show that by combining probabilistic causal modelling with physics simulations, we can make robot manipulation more robust to system uncertainties and hence more feasible for real-world applications. Further, our generalised reasoning framework can be used and extended for future robotics and causality research.


Authors: Ricardo Cannizzaro, Michael Groom, Jonathan Routley, Robert Osazuwa Ness, Lars Kunze
Published: 2024-10-03 14:16:47+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.AI, cs.LG, cs.RO, G.3, stat.AP

SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics




Swarm robotics, or very-large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere to a reference Gaussian mixture model (GMM) distribution observed at the macroscopic scale. As a result, optimizing at the macroscopic level yields an optimal overall outcome. However, all these methods require systematic and global generation of Gaussian components (GCs) within obstacle-free areas to construct the GMM trajectories. This work utilizes centroidal Voronoi tessellation to generate GCs methodically. Consequently, it demonstrates performance improvement while also ensuring consistency and reliability.
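
A centroidal Voronoi tessellation is often approximated with Lloyd's algorithm: assign sample points to their nearest generator, move each generator to the centroid of its cell, and repeat. The sketch below applies this to a point sample of the obstacle-free region. It illustrates the general technique only; how the paper turns cells into Gaussian components is not shown, and `lloyd_cvt` and its parameters are hypothetical.

```python
import numpy as np


def lloyd_cvt(free_points: np.ndarray, k: int, iters: int = 50, seed: int = 0):
    """Approximate a CVT of the free space with Lloyd's algorithm.

    free_points: (N, 2) samples drawn from the obstacle-free region.
    Returns (generators of shape (k, 2), cell label per sample of shape (N,)).
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(free_points), size=k, replace=False)
    gens = free_points[idx].copy()

    def assign(generators: np.ndarray) -> np.ndarray:
        # Distance from every sample to every generator, then nearest index.
        d = np.linalg.norm(free_points[:, None, :] - generators[None, :, :], axis=2)
        return d.argmin(axis=1)

    for _ in range(iters):
        labels = assign(gens)
        for j in range(k):
            members = free_points[labels == j]
            if len(members) > 0:  # keep the old generator if a cell is empty
                gens[j] = members.mean(axis=0)
    return gens, assign(gens)
```

Note that in a non-convex free region the centroid of a cell can fall inside an obstacle, so a practical planner would need an extra projection or rejection step that this sketch leaves out.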


Authors: James Gao, Jacob Lee, Yuting Zhou, Yunze Hu, Chang Liu, Pingping Zhu
Published: 2024-10-03 14:17:20+00:00
arXiv page: arxiv_id (pdf)

Categories: cs.MA, cs.RO, cs.SY, eess.SY