Cherry-Picking with Reinforcement Learning

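Abstract

Grasping small objects surrounded by unstable or non-rigid material plays an important role in applications such as surgery, harvesting, construction, disaster recovery, and assisted feeding.
The task is especially difficult when fine manipulation is required in the presence of sensor noise and perception errors.
Such conditions inevitably trigger dynamic motion, which is hard to model precisely.
Data-driven methods such as reinforcement learning (RL) sidestep the difficulty of building accurate models of contact and dynamics, and can optimize task performance through trial and error.
Applying these methods to real robots, however, has been hindered by factors such as prohibitively high sample complexity and the high cost of the training infrastructure needed to provide resets on hardware.
This work presents CherryBot, an RL system that uses chopsticks for fine manipulation and surpasses human reactiveness on some dynamic grasping tasks.
By carefully designing the training paradigm and algorithm, we study how to make a real-world robot learning system sample-efficient and general while reducing the human effort required for supervision.
Our system shows continual improvement over 30 minutes of real-world interaction: with reactive retry, it achieves an almost 100% success rate on the demanding task of using chopsticks to grasp small objects swinging in the air.
We demonstrate the reactiveness, robustness, and generalizability of CherryBot across varying object shapes and dynamics (e.g., external disturbances such as wind and human perturbations).
Videos are available at https://goodcherrybot.github.io/.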

Abstract (original)

Grasping small objects surrounded by unstable or non-rigid material plays a crucial role in applications such as surgery, harvesting, construction, disaster recovery, and assisted feeding. This task is especially difficult when fine manipulation is required in the presence of sensor noise and perception errors; this inevitably triggers dynamic motion, which is challenging to model precisely. Circumventing the difficulty to build accurate models for contacts and dynamics, data-driven methods like reinforcement learning (RL) can optimize task performance via trial and error. Applying these methods to real robots, however, has been hindered by factors such as prohibitively high sample complexity or the high training infrastructure cost for providing resets on hardware. This work presents CherryBot, an RL system that uses chopsticks for fine manipulation that surpasses human reactiveness for some dynamic grasping tasks. By carefully designing the training paradigm and algorithm, we study how to make a real-world robot learning system sample efficient and general while reducing the human effort required for supervision. Our system shows continual improvement through 30 minutes of real-world interaction: through reactive retry, it achieves an almost 100% success rate on the demanding task of using chopsticks to grasp small objects swinging in the air. We demonstrate the reactiveness, robustness and generalizability of CherryBot to varying object shapes and dynamics (e.g., external disturbances like wind and human perturbations). Videos are available at https://goodcherrybot.github.io/.

arXiv information

Authors	Yunchu Zhang, Liyiming Ke, Abhay Deshpande, Abhishek Gupta, Siddhartha Srinivasa
Published	2023-03-09 18:59:08+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
