Streaming Anchor Loss: Augmenting Supervision with Temporal Significance

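Summary

Streaming neural network models that respond quickly, frame by frame, to speech and other sensory signals are widely deployed on resource-constrained platforms.
Increasing the learning capacity of such models by adding parameters to improve predictive power is therefore often not viable for real-world tasks.
This work proposes a new loss, Streaming Anchor Loss (SAL), which makes better use of a fixed learning capacity by encouraging the model to learn more from essential frames.
Specifically, SAL and its focal variations dynamically modulate the frame-wise cross-entropy loss according to the importance of each frame, so that frames temporally close to semantically critical events receive a higher loss penalty.
The loss thereby steers training toward predicting the relatively rare but task-relevant frames.
Experiments with standard lightweight convolutional and recurrent streaming networks on three speech-based detection tasks show that SAL lets the model learn the overall task more effectively, improving both accuracy and latency without additional data, model parameters, or architectural changes.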

Abstract (original)

Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms. Hence, increasing the learning capacity of such streaming models (i.e., by adding more parameters) to improve the predictive power may not be viable for real-world tasks. In this work, we propose a new loss, Streaming Anchor Loss (SAL), to better utilize the given learning capacity by encouraging the model to learn more from essential frames. More specifically, our SAL and its focal variations dynamically modulate the frame-wise cross entropy loss based on the importance of the corresponding frames so that a higher loss penalty is assigned for frames within the temporal proximity of semantically critical events. Therefore, our loss ensures that the model training focuses on predicting the relatively rare but task-relevant frames. Experimental results with standard lightweight convolutional and recurrent streaming networks on three different speech based detection tasks demonstrate that SAL enables the model to learn the overall task more effectively with improved accuracy and latency, without any additional data, model parameters, or architectural changes.
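The abstract describes the loss only in words: frame-wise cross entropy is scaled up for frames near semantically critical events. The following is a minimal sketch of that idea, not the formulation from the paper. The exponential-decay weighting, the names `anchor_weights` and `weighted_streaming_loss`, and the parameters `scale`, `width`, and `gamma` are all assumptions made for illustration; the focal-style factor is likewise a guess at what a "focal variation" could look like.

```python
import math


def anchor_weights(num_frames, anchors, scale=1.0, width=2.0):
    """Per-frame weights that peak at anchor frames and decay with distance.

    Assumption: weight = 1 + scale * exp(-distance / width). This form is
    illustrative only and is not taken from the paper.
    """
    weights = []
    for t in range(num_frames):
        if anchors:
            nearest = min(abs(t - a) for a in anchors)
            bonus = scale * math.exp(-nearest / width)
        else:
            bonus = 0.0  # no anchors: fall back to plain cross entropy
        weights.append(1.0 + bonus)
    return weights


def frame_cross_entropy(probs, label):
    """Cross entropy of one frame given class probabilities and the true label."""
    return -math.log(max(probs[label], 1e-12))


def weighted_streaming_loss(frame_probs, labels, anchors,
                            scale=1.0, width=2.0, gamma=0.0):
    """Mean frame-wise cross entropy, modulated by proximity to anchor frames.

    gamma > 0 adds a focal-style factor (1 - p_true) ** gamma (an assumption).
    """
    weights = anchor_weights(len(labels), anchors, scale, width)
    total = 0.0
    for probs, label, w in zip(frame_probs, labels, weights):
        focal = (1.0 - probs[label]) ** gamma
        total += w * focal * frame_cross_entropy(probs, label)
    return total / len(labels)


# Hypothetical four-frame utterance with a critical event at frame 2.
frame_probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.8, 0.2]]
labels = [0, 0, 1, 0]
loss = weighted_streaming_loss(frame_probs, labels, anchors=[2])
```

With this weighting, an error on a frame adjacent to the anchor costs more than the same error several frames away, which is the behavior the abstract attributes to SAL. Passing an empty anchor list reduces the function to ordinary mean cross entropy.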

arXiv information

Authors	Utkarsh Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
Published	2023-10-09 17:28:35+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
