Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots

Abstract

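Accurately and quickly delineating sharp boundaries and robust semantics is essential for many downstream robotic tasks, such as robot grasping and manipulation, real-time semantic mapping, and online sensor calibration running on edge computing units.
Although boundary detection and semantic segmentation are complementary tasks, most studies focus on lightweight models for semantic segmentation and overlook the important role of boundary detection.
In this work, we introduce Mobile-Seed, a lightweight dual-task framework tailored to perform semantic segmentation and boundary detection simultaneously.
Our framework features a two-stream encoder, an active fusion decoder (AFD), and a dual-task regularization approach.
The encoder is split into two pathways: one captures category-aware semantic information, and the other discerns boundaries from multi-scale features.
The AFD module dynamically adapts the fusion of semantic and boundary information by learning channel-wise relationships, which allows a precise weight to be assigned to each channel.
In addition, we introduce a regularization loss to mitigate the conflicts in dual-task learning and deep diversity supervision.
Compared with existing methods, the proposed Mobile-Seed offers a lightweight framework that simultaneously improves semantic segmentation performance and accurately locates object boundaries.
Experiments on the Cityscapes dataset show that Mobile-Seed improves on the state-of-the-art (SOTA) baseline by 2.2 percentage points (pp) in mIoU and 4.2 pp in mF-score while maintaining online inference speed.
It runs at 23.9 frames per second (FPS) on 1024×2048 resolution input on an RTX 2080 Ti GPU.
Additional experiments on the CamVid and PASCAL Context datasets confirm the generalizability of our method.
Code and additional results are publicly available at https://whu-usi3dv.github.io/Mobile-Seed/.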
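
The abstract says that the AFD fuses semantic and boundary information by learning channel-wise relationships and assigning a weight to each channel. The sketch below illustrates that general idea only: it is a generic channel gate, not the paper's AFD. The function names (`global_avg_pool`, `channel_gate`, `fuse`), the pooled descriptor, the single linear layer with a sigmoid, and the convex blend are all assumptions made for illustration. Features are plain nested lists shaped channels × height × width.

```python
import math

def global_avg_pool(feature):
    """Average each channel's H x W map into one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]

def channel_gate(sem_desc, bnd_desc, weights, bias):
    """Map pooled descriptors to one weight in (0, 1) per channel.

    `weights` (C rows of length 2C) and `bias` (length C) stand in for
    learned parameters; they are hypothetical, not taken from the paper.
    """
    desc = sem_desc + bnd_desc  # concatenated descriptor of length 2C
    gates = []
    for w_row, b in zip(weights, bias):
        z = sum(w * d for w, d in zip(w_row, desc)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid
    return gates

def fuse(semantic, boundary, weights, bias):
    """Blend the two streams channel by channel: g * semantic + (1 - g) * boundary."""
    gates = channel_gate(global_avg_pool(semantic), global_avg_pool(boundary),
                         weights, bias)
    fused = []
    for g, s_ch, b_ch in zip(gates, semantic, boundary):
        fused.append([[g * s + (1.0 - g) * b for s, b in zip(s_row, b_row)]
                      for s_row, b_row in zip(s_ch, b_ch)])
    return fused, gates
```

With all-zero `weights` and `bias`, every gate is 0.5 and the output is the plain average of the two streams. Training would move each gate toward whichever stream is more useful for that channel.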

Abstract (original)

Precise and rapid delineation of sharp boundaries and robust semantics is essential for numerous downstream robotic tasks, such as robot grasping and manipulation, real-time semantic mapping, and online sensor calibration performed on edge computing units. Although boundary detection and semantic segmentation are complementary tasks, most studies focus on lightweight models for semantic segmentation but overlook the critical role of boundary detection. In this work, we introduce Mobile-Seed, a lightweight, dual-task framework tailored for simultaneous semantic segmentation and boundary detection. Our framework features a two-stream encoder, an active fusion decoder (AFD) and a dual-task regularization approach. The encoder is divided into two pathways: one captures category-aware semantic information, while the other discerns boundaries from multi-scale features. The AFD module dynamically adapts the fusion of semantic and boundary information by learning channel-wise relationships, allowing for precise weight assignment of each channel. Furthermore, we introduce a regularization loss to mitigate the conflicts in dual-task learning and deep diversity supervision. Compared to existing methods, the proposed Mobile-Seed offers a lightweight framework to simultaneously improve semantic segmentation performance and accurately locate object boundaries. Experiments on the Cityscapes dataset have shown that Mobile-Seed achieves notable improvement over the state-of-the-art (SOTA) baseline by 2.2 percentage points (pp) in mIoU and 4.2 pp in mF-score, while maintaining an online inference speed of 23.9 frames-per-second (FPS) with 1024×2048 resolution input on an RTX 2080 Ti GPU. Additional experiments on CamVid and PASCAL Context datasets confirm our method’s generalizability. Code and additional results are publicly available at https://whu-usi3dv.github.io/Mobile-Seed/.
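
The reported gain of 2.2 pp is measured in mIoU, the per-class intersection over union averaged across classes. As a minimal sketch of how that metric is usually computed (not the authors' evaluation code), the hypothetical helper `mean_iou` below takes two label maps as nested lists of integer class IDs. It assumes that classes absent from both maps are skipped, and it does not handle ignore labels.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over the classes present in pred or target."""
    inter = [0] * num_classes  # pixels where prediction and ground truth agree on class c
    union = [0] * num_classes  # pixels labelled c in the prediction or the ground truth
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                union[p] += 1
                union[t] += 1
    # Skip classes that appear in neither map (assumed convention).
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)
```

For the 2×2 maps `pred = [[0, 0], [1, 1]]` and `target = [[0, 1], [1, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. A difference of 2.2 pp corresponds to 0.022 on this 0-to-1 scale.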

arXiv information

Authors	Youqi Liao, Shuhao Kang, Jianping Li, Yang Liu, Yun Liu, Zhen Dong, Bisheng Yang, Xieyuanli Chen
Published	2023-11-23 16:38:00+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
