Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths

Summary

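This paper addresses the learning of end-to-end models for time series data that include a temporal alignment step based on dynamic time warping (DTW).
Existing approaches to differentiable DTW either differentiate through a fixed warping path or apply a differentiable relaxation to the min operator in the recursion used to solve the DTW problem.
Instead, the authors propose a DTW layer built on bi-level optimisation and deep declarative networks, named DecDTW.
By formulating DTW as a continuous, inequality-constrained optimisation problem, gradients of the optimal alignment solution (with respect to the underlying time series) can be computed using implicit differentiation.
An interesting byproduct of this formulation is that DecDTW outputs the optimal warping path between two time series, rather than a soft approximation of the kind recoverable from Soft-DTW.
The authors show that this property is particularly useful in applications where the downstream loss function is defined on the optimal alignment path itself.
This arises naturally, for example, when learning to improve the accuracy of predicted alignments against ground-truth alignments.
DecDTW is evaluated on two such applications, audio-to-score alignment in music information retrieval and visual place recognition in robotics, with state-of-the-art results reported on both.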
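
To make the two existing approaches mentioned in the summary concrete, here is a minimal sketch of the classic DTW recursion with a hard min (which yields a fixed optimal warping path by backtracking) and with a soft-min relaxation in the style of Soft-DTW. It is not the DecDTW method, which the summary describes as a continuous constrained optimisation solved with implicit differentiation. The function names (`softmin`, `dtw_table`, `optimal_path`), the squared-difference local cost, and the smoothing parameter `gamma` are assumptions chosen for illustration.

```python
import math


def softmin(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), shifted for stability."""
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))


def dtw_table(x, y, gamma=0.0):
    """Accumulated-cost table for 1-D sequences x and y (assumed squared-difference cost).

    gamma == 0 uses the hard min (classic DTW);
    gamma > 0 replaces it with the soft-min relaxation.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            prev = [D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]]
            best = softmin(prev, gamma) if gamma > 0 else min(prev)
            D[i][j] = cost + best
    return D


def optimal_path(D):
    """Backtrack through a hard-min table to recover one optimal warping path."""
    i, j = len(D) - 1, len(D[0]) - 1
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (D[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (D[i - 1][j], i - 1, j),          # advance in x only
            (D[i][j - 1], i, j - 1),          # advance in y only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1]


x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 1.0, 2.0]
hard = dtw_table(x, y)
soft = dtw_table(x, y, gamma=1.0)
print(hard[-1][-1], optimal_path(hard))
print(soft[-1][-1])
```

The hard-min table gives an exact path but its argmin is piecewise constant in the inputs, while the soft-min value is smooth but only yields a blended, approximate alignment. That trade-off is the gap the summary says DecDTW is designed to close.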

Summary (original)

This paper addresses learning end-to-end models for time series data that include a temporal alignment step via dynamic time warping (DTW). Existing approaches to differentiable DTW either differentiate through a fixed warping path or apply a differentiable relaxation to the min operator found in the recursive steps used to solve the DTW problem. We instead propose a DTW layer based around bi-level optimisation and deep declarative networks, which we name DecDTW. By formulating DTW as a continuous, inequality constrained optimisation problem, we can compute gradients for the solution of the optimal alignment (with respect to the underlying time series) using implicit differentiation. An interesting byproduct of this formulation is that DecDTW outputs the optimal warping path between two time series as opposed to a soft approximation, recoverable from Soft-DTW. We show that this property is particularly useful for applications where downstream loss functions are defined on the optimal alignment path itself. This naturally occurs, for instance, when learning to improve the accuracy of predicted alignments against ground truth alignments. We evaluate DecDTW on two such applications, namely the audio-to-score alignment task in music information retrieval and the visual place recognition task in robotics, demonstrating state-of-the-art results in both.
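
As a rough illustration of what "implicit differentiation" means here, the following sketches the simplest unconstrained case only. The symbols $f$, $x$, $y$, and $y^\star$ are assumptions introduced for this note; the abstract's formulation additionally involves inequality constraints, which this sketch does not cover.

```latex
% Lower-level problem: the layer output is a minimiser of f in y.
y^\star(x) = \operatorname*{arg\,min}_{y} f(x, y)

% Stationarity holds at the solution for every x:
\nabla_y f\bigl(x, y^\star(x)\bigr) = 0

% Differentiating this identity with respect to x gives
\frac{\partial^2 f}{\partial y\,\partial x}
  + \frac{\partial^2 f}{\partial y^2}\,\frac{\mathrm{d} y^\star}{\mathrm{d} x} = 0
\quad\Longrightarrow\quad
\frac{\mathrm{d} y^\star}{\mathrm{d} x}
  = -\left(\frac{\partial^2 f}{\partial y^2}\right)^{-1}
     \frac{\partial^2 f}{\partial y\,\partial x}
```

The gradient of the solution is obtained from second derivatives at the optimum, assuming the Hessian in $y$ is invertible there, without differentiating through the iterations of whichever solver found $y^\star$.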

arXiv information

Authors	Ming Xu, Sourav Garg, Michael Milford, Stephen Gould
Published	2023-03-19 21:58:37+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
