jarxiv | Japanese arxiv

Tailless Flapping-Wing Robot With Bio-Inspired Elastic Passive Legs for Multi-Modal Locomotion

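Posted: June 19, 2025 | Author: jarxiv

Summary

Flapping-wing robots offer considerable versatility.
However, achieving efficient multi-modal locomotion remains difficult.
This paper presents the design, modeling, and experimental evaluation of a novel tailless flapping-wing robot with three independently actuated pairs of wings.
Inspired by the leg morphology of juvenile water striders, the robot incorporates bio-inspired elastic passive legs that convert flapping-induced vibrations into directional ground movement, enabling locomotion without additional actuators.
This vibration-driven mechanism enables lightweight, mechanically simple multi-modal mobility.
An SE(3)-based controller coordinates flight and mode transitions with minimal actuation.
To validate the robot's feasibility, a functional prototype was built and experiments were conducted to evaluate its flight, ground locomotion, and mode-switching capabilities.
The results show satisfactory performance under constrained actuation, highlighting the potential of multi-modal flapping-wing designs for future aerial-ground robotic applications.
These findings lay a foundation for future research on frequency-based terrestrial control and passive yaw stabilization in hybrid locomotion systems.

Abstract (original)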

Flapping-wing robots offer significant versatility; however, achieving efficient multi-modal locomotion remains challenging. This paper presents the design, modeling, and experimentation of a novel tailless flapping-wing robot with three independently actuated pairs of wings. Inspired by the leg morphology of juvenile water striders, the robot incorporates bio-inspired elastic passive legs that convert flapping-induced vibrations into directional ground movement, enabling locomotion without additional actuators. This vibration-driven mechanism facilitates lightweight, mechanically simplified multi-modal mobility. An SE(3)-based controller coordinates flight and mode transitions with minimal actuation. To validate the robot’s feasibility, a functional prototype was developed, and experiments were conducted to evaluate its flight, ground locomotion, and mode-switching capabilities. Results show satisfactory performance under constrained actuation, highlighting the potential of multi-modal flapping-wing designs for future aerial-ground robotic applications. These findings provide a foundation for future studies on frequency-based terrestrial control and passive yaw stabilization in hybrid locomotion systems.

arXiv information

Authors	Zhi Zheng, Xiangyu Xu, Jin Wang, Yikai Chen, Jingyang Huang, Ruixin Wu, Huan Yu, Guodong Lu
Published	2025-06-18 02:19:48+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.RO

EmojiVoice: Towards long-term controllable expressivity in robot speech

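Posted: June 19, 2025 | Author: jarxiv

Summary

Humans vary their expressivity when speaking for extended periods to keep their listeners engaged.
Social robots tend to be deployed with "expressive" joyful voices, but they lack the long-term variation found in human speech.
Foundation-model text-to-speech systems are beginning to mimic the expressivity of human speech, but they are difficult to deploy offline on robots.
We present EmojiVoice, a free, customizable text-to-speech (TTS) toolkit that lets social roboticists build temporally variable, expressive speech on social robots.
We introduce emoji prompting to allow fine-grained control of expressivity at the phase level, and use the lightweight Matcha-TTS backbone to generate speech in real time.
We explore three case studies: (1) a scripted conversation with a robot assistant, (2) a storytelling robot, and (3) an autonomous speech-to-speech interactive agent.
We found that varied emoji prompting improved the perception and expressivity of speech over a long period in the storytelling task, but an expressive voice was not preferred in the assistant use case.

Abstract (original)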

Humans vary their expressivity when speaking for extended periods to maintain engagement with their listener. Although social robots tend to be deployed with “expressive” joyful voices, they lack this long-term variation found in human speech. Foundation model text-to-speech systems are beginning to mimic the expressivity in human speech, but they are difficult to deploy offline on robots. We present EmojiVoice, a free, customizable text-to-speech (TTS) toolkit that allows social roboticists to build temporally variable, expressive speech on social robots. We introduce emoji-prompting to allow fine-grained control of expressivity on a phase level and use the lightweight Matcha-TTS backbone to generate speech in real-time. We explore three case studies: (1) a scripted conversation with a robot assistant, (2) a storytelling robot, and (3) an autonomous speech-to-speech interactive agent. We found that using varied emoji prompting improved the perception and expressivity of speech over a long period in a storytelling task, but expressive voice was not preferred in the assistant use case.

arXiv information

Authors	Paige Tuttösí, Shivam Mehta, Zachary Syvenky, Bermet Burkanova, Gustav Eje Henter, Angelica Lim
Published	2025-06-18 02:49:50+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.HC, cs.RO

3D Vision-tactile Reconstruction from Infrared and Visible Images for Robotic Fine-grained Tactile Perception

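Posted: June 19, 2025 | Author: jarxiv

Summary

To achieve human-like haptic perception in anthropomorphic grippers, the compliant sensing surfaces of vision tactile sensors (VTS) must evolve from conventional planar configurations to biomimetically curved topographies with continuous surface gradients.
However, planar VTSs face challenges when extended to curved surfaces, including insufficient surface lighting, blurred reconstruction, and complex spatial boundary conditions for surface structures.
With the end goal of constructing a human-like fingertip, our research (i) develops GelSplitter3D by expanding the imaging channels with a prism and a near-infrared (NIR) camera, (ii) proposes a photometric stereo neural network with a CAD-based method for generating ground-truth normals to calibrate tactile geometry, and (iii) devises a normal integration method that uses depth priors as boundary constraints to correct the cumulative error of surface integrals.
We demonstrate better tactile sensing performance, a 40% improvement in normal estimation accuracy, and the benefits of the sensor shape in grasping and manipulation tasks.

Abstract (original)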

To achieve human-like haptic perception in anthropomorphic grippers, the compliant sensing surfaces of vision tactile sensor (VTS) must evolve from conventional planar configurations to biomimetically curved topographies with continuous surface gradients. However, planar VTSs have challenges when extended to curved surfaces, including insufficient lighting of surfaces, blurring in reconstruction, and complex spatial boundary conditions for surface structures. With an end goal of constructing a human-like fingertip, our research (i) develops GelSplitter3D by expanding imaging channels with a prism and a near-infrared (NIR) camera, (ii) proposes a photometric stereo neural network with a CAD-based normal ground truth generation method to calibrate tactile geometry, and (iii) devises a normal integration method with boundary constraints of depth prior information to correcting the cumulative error of surface integrals. We demonstrate better tactile sensing performance, a 40$\%$ improvement in normal estimation accuracy, and the benefits of sensor shapes in grasping and manipulation tasks.

arXiv information

Authors	Yuankai Lin, Xiaofan Lu, Jiahui Chen, Hua Yang
Published	2025-06-18 02:53:08+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO

DyNaVLM: Zero-Shot Vision-Language Navigation System with Dynamic Viewpoints and Self-Refining Graph Memory

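Posted: June 19, 2025 | Author: jarxiv

Summary

We present DyNaVLM, an end-to-end vision-language navigation framework built on Vision-Language Models (VLM).
In contrast to prior methods constrained by fixed angular or distance intervals, the system lets agents freely select navigation targets through visual-language reasoning.
At its core is a self-refining graph memory that 1) stores object locations as executable topological relations, 2) enables cross-robot memory sharing through distributed graph updates, and 3) enhances the VLM's decision-making through retrieval augmentation.
Operating without task-specific training or fine-tuning, DyNaVLM achieves high performance on the GOAT and ObjectNav benchmarks.
Real-world tests further validate its robustness and generalization.
The system's three innovations (dynamic action space formulation, collaborative graph memory, and training-free deployment) establish a new paradigm for scalable embodied robots, bridging the gap between discrete VLN tasks and continuous real-world navigation.

Abstract (original)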

We present DyNaVLM, an end-to-end vision-language navigation framework using Vision-Language Models (VLM). In contrast to prior methods constrained by fixed angular or distance intervals, our system empowers agents to freely select navigation targets via visual-language reasoning. At its core lies a self-refining graph memory that 1) stores object locations as executable topological relations, 2) enables cross-robot memory sharing through distributed graph updates, and 3) enhances VLM’s decision-making via retrieval augmentation. Operating without task-specific training or fine-tuning, DyNaVLM demonstrates high performance on GOAT and ObjectNav benchmarks. Real-world tests further validate its robustness and generalization. The system’s three innovations: dynamic action space formulation, collaborative graph memory, and training-free deployment, establish a new paradigm for scalable embodied robot, bridging the gap between discrete VLN tasks and continuous real-world navigation.

arXiv information

Authors	Zihe Ji, Huangxuan Lin, Yue Gao
Published	2025-06-18 03:06:01+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO

I Know You’re Listening: Adaptive Voice for HRI

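Posted: June 19, 2025 | Author: jarxiv

Summary

While the use of social robots for language teaching has been explored, there is limited work on task-specific synthesized voices for language-teaching robots.
Given that language is a verbal task, this gap may have severe consequences for the effectiveness of robots in language-teaching tasks.
We address this lack of L2 teaching robot voices through three contributions.
1. We address the need for a lightweight and expressive robot voice.
Using a fine-tuned version of Matcha-TTS, we use emoji prompting to create an expressive voice that shows a range of expressivity over time.
The voice can run in real time with limited compute resources.
Through case studies, we found this voice to be more expressive, socially appropriate, and suitable for long periods of expressive speech such as storytelling.
2. We explore how to adapt a robot's voice to its physical and social ambient environment so that our voices can be deployed in various locations.
We found that increasing pitch and pitch rate in noisy and high-energy environments makes the robot's voice appear more appropriate and makes the robot seem more aware of its current environment.
3. We create an English TTS system with improved clarity for L2 listeners, using known linguistic properties of vowels that are difficult for these listeners.
We used a data-driven, perception-based approach to understand how L2 speakers use duration cues to interpret challenging words that differ minimally in tense (long) and lax (short) vowels in English.
We found that vowel duration strongly influences perception for L2 listeners, and created an "L2 clarity mode" for Matcha-TTS that lengthens tense vowels while leaving lax vowels unchanged.
Our clarity mode was rated as more respectful, intelligible, and encouraging than base Matcha-TTS, while reducing transcription errors on these challenging tense/lax minimal pairs.

Abstract (original)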

While the use of social robots for language teaching has been explored, there remains limited work on a task-specific synthesized voices for language teaching robots. Given that language is a verbal task, this gap may have severe consequences for the effectiveness of robots for language teaching tasks. We address this lack of L2 teaching robot voices through three contributions: 1. We address the need for a lightweight and expressive robot voice. Using a fine-tuned version of Matcha-TTS, we use emoji prompting to create an expressive voice that shows a range of expressivity over time. The voice can run in real time with limited compute resources. Through case studies, we found this voice more expressive, socially appropriate, and suitable for long periods of expressive speech, such as storytelling. 2. We explore how to adapt a robot’s voice to physical and social ambient environments to deploy our voices in various locations. We found that increasing pitch and pitch rate in noisy and high-energy environments makes the robot’s voice appear more appropriate and makes it seem more aware of its current environment. 3. We create an English TTS system with improved clarity for L2 listeners using known linguistic properties of vowels that are difficult for these listeners. We used a data-driven, perception-based approach to understand how L2 speakers use duration cues to interpret challenging words with minimal tense (long) and lax (short) vowels in English. We found that the duration of vowels strongly influences the perception for L2 listeners and created an ‘L2 clarity mode’ for Matcha-TTS that applies a lengthening to tense vowels while leaving lax vowels unchanged. Our clarity mode was found to be more respectful, intelligible, and encouraging than base Matcha-TTS while reducing transcription errors in these challenging tense/lax minimal pairs.
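The duration rule described for the "L2 clarity mode" can be pictured with a small sketch. This is only an illustration of lengthening tense vowels while leaving lax vowels untouched, not the Matcha-TTS implementation: the function name `apply_clarity_mode`, the ARPAbet subset in `TENSE_VOWELS`, and the default factor of 1.3 are all assumptions made for the example, and the abstract does not state the actual lengthening factor.

```python
# Illustrative subset of tense vowels in ARPAbet notation (an assumption):
# IY as in "sheep", UW as in "pool". Lax counterparts such as IH ("ship")
# and UH ("pull") are deliberately absent, so they stay unchanged.
TENSE_VOWELS = {"IY", "UW"}


def apply_clarity_mode(phonemes, durations, factor=1.3):
    """Return new durations with tense vowels lengthened by `factor`.

    `phonemes` and `durations` are parallel lists; durations are in seconds.
    The default factor is a placeholder, not a value from the paper.
    """
    if len(phonemes) != len(durations):
        raise ValueError("phonemes and durations must have the same length")
    return [
        d * factor if ph in TENSE_VOWELS else d
        for ph, d in zip(phonemes, durations)
    ]
```

With this rule, the vowel in "sheep" (SH IY P) is stretched while the vowel in "ship" (SH IH P) keeps its predicted duration, which widens the duration contrast within the minimal pair.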

arXiv information

Authors	Paige Tuttösí
Published	2025-06-18 03:23:41+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.HC, cs.RO, cs.SD, eess.AS

VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

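Posted: June 19, 2025 | Author: jarxiv

Summary

In this study, we present VIMS, a novel simultaneous localization and mapping (SLAM) system designed for underwater navigation.
Conventional visual-inertial state estimators face significant practical challenges in perceptually degraded underwater environments, particularly in scale estimation and loop closing.
To address these issues, we first propose using a low-cost single-beam sonar to improve scale estimation.
VIMS then integrates a high-sampling-rate magnetometer for place recognition, using magnetic signatures generated by an economical magnetic field coil.
Building on this, a hierarchical scheme is developed for visual-magnetic place recognition, enabling robust loop closure.
Furthermore, VIMS balances local feature tracking against descriptor-based loop closing, avoiding additional computational burden on the front end.
Experimental results highlight the efficacy of the proposed VIMS, showing significant improvements in both the robustness and the accuracy of state estimation in underwater environments.

Abstract (original)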

In this study, we present a novel simultaneous localization and mapping (SLAM) system, VIMS, designed for underwater navigation. Conventional visual-inertial state estimators encounter significant practical challenges in perceptually degraded underwater environments, particularly in scale estimation and loop closing. To address these issues, we first propose leveraging a low-cost single-beam sonar to improve scale estimation. Then, VIMS integrates a high-sampling-rate magnetometer for place recognition by utilizing magnetic signatures generated by an economical magnetic field coil. Building on this, a hierarchical scheme is developed for visual-magnetic place recognition, enabling robust loop closure. Furthermore, VIMS achieves a balance between local feature tracking and descriptor-based loop closing, avoiding additional computational burden on the front end. Experimental results highlight the efficacy of the proposed VIMS, demonstrating significant improvements in both the robustness and accuracy of state estimation within underwater environments.
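One generic way a range sensor can constrain the scale of an up-to-scale visual estimate is a least-squares fit between the metric ranges and the corresponding visual depths. The sketch below shows only that textbook idea; it is not the estimator used in VIMS, and the function name `estimate_scale` and its inputs are hypothetical.

```python
def estimate_scale(visual_depths, sonar_ranges):
    """Least-squares scale s that minimises sum((s * d_i - r_i) ** 2).

    `visual_depths` are up-to-scale depths from the visual front end and
    `sonar_ranges` are the metric ranges measured along the same direction.
    Both are hypothetical inputs for this illustration.
    """
    if len(visual_depths) != len(sonar_ranges):
        raise ValueError("inputs must have the same length")
    numerator = sum(d * r for d, r in zip(visual_depths, sonar_ranges))
    denominator = sum(d * d for d in visual_depths)
    if denominator == 0.0:
        raise ValueError("visual depths are all zero; scale is unobservable")
    return numerator / denominator
```

The closed form follows from setting the derivative of the squared error with respect to s to zero, which gives s = sum(d_i * r_i) / sum(d_i ** 2).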

arXiv information

Authors	Bingbing Zhang, Huan Yin, Shuo Liu, Fumin Zhang, Wen Xu
Published	2025-06-18 03:53:56+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO

Booster Gym: An End-to-End Reinforcement Learning Framework for Humanoid Robot Locomotion

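Posted: June 19, 2025 | Author: jarxiv

Summary

Recent advances in reinforcement learning (RL) have led to significant progress in humanoid robot locomotion, simplifying the design and training of motion policies in simulation.
However, numerous implementation details make transferring these policies to real robots a challenging task.
To address this, we developed a comprehensive code framework that covers the entire process from training to deployment, incorporating common RL training methods, domain randomization, reward function design, and solutions for handling parallel structures.
The library is made available as a community resource, with detailed descriptions of its design and experimental results.
We validate the framework on the Booster T1 robot, demonstrating that the trained policies transfer seamlessly to the physical platform and enable capabilities such as omnidirectional walking, disturbance resistance, and terrain adaptability.
We hope this work provides a convenient tool for the robotics community and accelerates the development of humanoid robots.
The code is available at https://github.com/BoosterRobotics/booster_gym.

Abstract (original)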

Recent advancements in reinforcement learning (RL) have led to significant progress in humanoid robot locomotion, simplifying the design and training of motion policies in simulation. However, the numerous implementation details make transferring these policies to real-world robots a challenging task. To address this, we have developed a comprehensive code framework that covers the entire process from training to deployment, incorporating common RL training methods, domain randomization, reward function design, and solutions for handling parallel structures. This library is made available as a community resource, with detailed descriptions of its design and experimental results. We validate the framework on the Booster T1 robot, demonstrating that the trained policies seamlessly transfer to the physical platform, enabling capabilities such as omnidirectional walking, disturbance resistance, and terrain adaptability. We hope this work provides a convenient tool for the robotics community, accelerating the development of humanoid robots. The code can be found in https://github.com/BoosterRobotics/booster_gym.
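Domain randomization, one of the ingredients listed above, usually means resampling simulator parameters at the start of each episode so that the policy does not overfit to one set of physics. The following is a minimal generic sketch, not code from the Booster Gym repository: the parameter names, the ranges in `RANDOMIZATION_RANGES`, and the helper `sample_episode_params` are all hypothetical.

```python
import random

# Hypothetical parameter ranges for illustration only; they are not taken
# from the Booster Gym repository.
RANDOMIZATION_RANGES = {
    "ground_friction": (0.4, 1.2),
    "base_mass_offset_kg": (-1.0, 1.0),
    "motor_strength_scale": (0.9, 1.1),
}


def sample_episode_params(rng, ranges=RANDOMIZATION_RANGES):
    """Draw one set of physics parameters, uniformly within each range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = random.Random(0)  # seeded so that runs are reproducible
params = sample_episode_params(rng)
```

Passing the random generator in explicitly keeps the sampling reproducible and makes it easy to give each parallel environment its own seed.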

arXiv information

Authors	Yushi Wang, Penghui Chen, Xinyu Han, Feng Wu, Mingguo Zhao
Published	2025-06-18 04:24:49+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO

SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation

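Posted: June 19, 2025 | Author: jarxiv

Summary

Surgical video generation can enhance medical education and research, but existing methods lack fine-grained motion control and realism.
We introduce SurgSora, a framework that generates high-fidelity, motion-controllable surgical videos from a single input frame and user-specified motion cues.
Unlike prior approaches that treat objects indiscriminately or rely on ground-truth segmentation masks, SurgSora uses self-predicted object features and depth information to refine RGB appearance and optical flow for precise video synthesis.
It consists of three key modules: (1) the Dual Semantic Injector, which extracts object-specific RGB-D features and segmentation cues to enhance spatial representations;
(2) the Decoupled Flow Mapper, which fuses multi-scale optical flow with semantic features for realistic motion dynamics;
and (3) the Trajectory Controller, which estimates sparse optical flow and enables user-guided object movement.
By conditioning Stable Video Diffusion on these enriched features, SurgSora achieves state-of-the-art visual authenticity and controllability in surgical video synthesis, as shown by extensive quantitative and qualitative comparisons.
A human evaluation conducted with expert surgeons further demonstrates the high realism of SurgSora-generated videos, highlighting the method's potential for surgical training and education.
The project is available at https://surgsora.github.io/surgsora.github.io.

Abstract (original)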

Surgical video generation can enhance medical education and research, but existing methods lack fine-grained motion control and realism. We introduce SurgSora, a framework that generates high-fidelity, motion-controllable surgical videos from a single input frame and user-specified motion cues. Unlike prior approaches that treat objects indiscriminately or rely on ground-truth segmentation masks, SurgSora leverages self-predicted object features and depth information to refine RGB appearance and optical flow for precise video synthesis. It consists of three key modules: (1) the Dual Semantic Injector, which extracts object-specific RGB-D features and segmentation cues to enhance spatial representations; (2) the Decoupled Flow Mapper, which fuses multi-scale optical flow with semantic features for realistic motion dynamics; and (3) the Trajectory Controller, which estimates sparse optical flow and enables user-guided object movement. By conditioning these enriched features within the Stable Video Diffusion, SurgSora achieves state-of-the-art visual authenticity and controllability in advancing surgical video synthesis, as demonstrated by extensive quantitative and qualitative comparisons. Our human evaluation in collaboration with expert surgeons further demonstrates the high realism of SurgSora-generated videos, highlighting the potential of our method for surgical training and education. Our project is available at https://surgsora.github.io/surgsora.github.io.

arXiv information

Authors	Tong Chen, Shuya Yang, Junyi Wang, Long Bai, Hongliang Ren, Luping Zhou
Published	2025-06-18 04:36:29+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CV, cs.MM, cs.RO

TACT: Humanoid Whole-body Contact Manipulation through Deep Imitation Learning with Tactile Modality

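Posted: June 19, 2025 | Author: jarxiv

Summary

Manipulation with whole-body contact by humanoid robots offers distinct advantages, including enhanced stability and reduced load.
On the other hand, challenges such as the increased computational cost of motion generation and the difficulty of measuring broad-area contact must be addressed.
We therefore developed a humanoid control system in which a humanoid robot equipped with tactile sensors on its upper body learns a policy for whole-body manipulation through imitation learning based on human teleoperation data.
This policy, named tactile-modality extended ACT (TACT), takes multiple sensor modalities as input, including joint position, vision, and tactile measurements.
Furthermore, by integrating this policy with retargeting and locomotion control based on a biped model, we demonstrate that the life-size humanoid robot RHP7 Kaleido can achieve whole-body contact manipulation while maintaining balance and walking.
Through detailed experimental verification, we show that feeding both vision and tactile modalities into the policy improves the robustness of manipulation involving broad and delicate contact.

Abstract (original)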

Manipulation with whole-body contact by humanoid robots offers distinct advantages, including enhanced stability and reduced load. On the other hand, we need to address challenges such as the increased computational cost of motion generation and the difficulty of measuring broad-area contact. We therefore have developed a humanoid control system that allows a humanoid robot equipped with tactile sensors on its upper body to learn a policy for whole-body manipulation through imitation learning based on human teleoperation data. This policy, named tactile-modality extended ACT (TACT), has a feature to take multiple sensor modalities as input, including joint position, vision, and tactile measurements. Furthermore, by integrating this policy with retargeting and locomotion control based on a biped model, we demonstrate that the life-size humanoid robot RHP7 Kaleido is capable of achieving whole-body contact manipulation while maintaining balance and walking. Through detailed experimental verification, we show that inputting both vision and tactile modalities into the policy contributes to improving the robustness of manipulation involving broad and delicate contact.

arXiv information

Authors	Masaki Murooka, Takahiro Hoshi, Kensuke Fukumitsu, Shimpei Masuda, Marwan Hamze, Tomoya Sasaki, Mitsuharu Morisawa, Eiichi Yoshida
Published	2025-06-18 05:04:56+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO

Probabilistic Trajectory GOSPA: A Metric for Uncertainty-Aware Multi-Object Tracking Performance Evaluation

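Posted: June 19, 2025 | Author: jarxiv

Summary

This paper presents a generalization of the trajectory general optimal sub-pattern assignment (GOSPA) metric for evaluating multi-object tracking algorithms that provide trajectory estimates with track-level uncertainties.
The metric builds on the recently introduced probabilistic GOSPA metric to account for both the existence uncertainty and the state estimation uncertainty of individual object states.
Like trajectory GOSPA (TGOSPA), it can be formulated as a multidimensional assignment problem, and its linear programming relaxation (also a valid metric) is computable in polynomial time.
In addition, the metric retains the interpretability of TGOSPA: we show that its decomposition yields intuitive cost terms associated with expected localization error and existence probability mismatch error for properly detected objects, expected missed and false detection error, and track switch error.
The effectiveness of the proposed metric is demonstrated through a simulation study.

Abstract (original)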

This paper presents a generalization of the trajectory general optimal sub-pattern assignment (GOSPA) metric for evaluating multi-object tracking algorithms that provide trajectory estimates with track-level uncertainties. This metric builds on the recently introduced probabilistic GOSPA metric to account for both the existence and state estimation uncertainties of individual object states. Similar to trajectory GOSPA (TGOSPA), it can be formulated as a multidimensional assignment problem, and its linear programming relaxation–also a valid metric–is computable in polynomial time. Additionally, this metric retains the interpretability of TGOSPA, and we show that its decomposition yields intuitive costs terms associated to expected localization error and existence probability mismatch error for properly detected objects, expected missed and false detection error, and track switch error. The effectiveness of the proposed metric is demonstrated through a simulation study.
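For readers unfamiliar with the base metric, the following is a minimal sketch of the standard set-based GOSPA distance with alpha = 2, which is the building block that the trajectory and probabilistic variants extend. It is not the metric proposed in this paper. The function name `gospa`, the restriction to 1-D point states, and the brute-force search over assignments are simplifying assumptions that only suit very small sets.

```python
from itertools import permutations


def gospa(xs, ys, c=2.0, p=1.0):
    """GOSPA distance (alpha = 2) between two small sets of 1-D points.

    `c` is the cutoff distance and `p` the exponent. The search is brute
    force over all assignments, so use it only for a handful of objects.
    """
    # Make xs the smaller set so every x is paired with a distinct y.
    if len(xs) > len(ys):
        xs, ys = ys, xs
    n, m = len(xs), len(ys)

    best = float("inf")
    for perm in permutations(range(m), n):
        # A pair farther apart than c costs c**p, the same as counting it
        # as one missed object plus one false object.
        cost = sum(min(abs(x - ys[j]), c) ** p for x, j in zip(xs, perm))
        best = min(best, cost)

    # Each unpaired object in the larger set costs c**p / 2.
    cardinality_penalty = (c ** p) / 2.0 * (m - n)
    return (best + cardinality_penalty) ** (1.0 / p)
```

With alpha = 2 the total splits cleanly into a localization part (the paired distances below the cutoff) and a missed/false part (c**p / 2 per unmatched object), which is the kind of decomposition the abstract refers to when it mentions interpretability.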

arXiv information

Authors	Yuxuan Xia, Ángel F. García-Fernández, Johan Karlsson, Yu Ge, Lennart Svensson, Ting Yuan
Published	2025-06-18 05:11:04+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.RO, eess.SP
