Adaptive Endpointing with Deep Contextual Multi-armed Bandits

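Summary

Current endpointing (EP) solutions are trained in a supervised framework, so the model cannot incorporate feedback and improve in an online setting.
It is also common practice to run a costly grid search to find the best configuration for an endpointing model.
This paper aims to provide a solution for adaptive endpointing by proposing an efficient method that selects the best endpointing configuration from utterance-level audio features in an online setting, while avoiding hyperparameter grid search.
The method needs no ground-truth or annotated labels; it learns online from reward signals alone.
Specifically, the authors propose a deep contextual multi-armed bandit approach that combines the representational power of neural networks with the action exploration behavior of Thompson modeling algorithms.
They compare the approach to several baselines and show that the deep bandit models reduce early cutoff errors while maintaining low latency.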

Abstract (original)

Current endpointing (EP) solutions learn in a supervised framework, which does not allow the model to incorporate feedback and improve in an online setting. Also, it is a common practice to utilize costly grid-search to find the best configuration for an endpointing model. In this paper, we aim to provide a solution for adaptive endpointing by proposing an efficient method for choosing an optimal endpointing configuration given utterance-level audio features in an online setting, while avoiding hyperparameter grid-search. Our method does not require ground truth labels, and only uses online learning from reward signals without requiring annotated labels. Specifically, we propose a deep contextual multi-armed bandit-based approach, which combines the representational power of neural networks with the action exploration behavior of Thompson modeling algorithms. We compare our approach to several baselines, and show that our deep bandit models also succeed in reducing early cutoff errors while maintaining low latency.
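
The abstract describes choosing an endpointing configuration per utterance from reward signals alone, with exploration driven by Thompson-style sampling. The following is a minimal sketch of that idea under simplifying assumptions, not the authors' model: the neural network is replaced by a table over discretized context buckets, each (bucket, configuration) pair keeps a Beta posterior over a binary reward, and all names (`TabularThompsonBandit`, `simulate`) and the reward probabilities are hypothetical.

```python
import random


class TabularThompsonBandit:
    """Thompson sampling with one Beta posterior per (context bucket, arm)."""

    def __init__(self, n_buckets, n_arms, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior stored as [alpha, beta] for each bucket and arm.
        self.params = [[[1.0, 1.0] for _ in range(n_arms)]
                       for _ in range(n_buckets)]

    def select(self, bucket):
        # Draw one plausible success rate per arm, then pick the best draw.
        draws = [self.rng.betavariate(a, b) for a, b in self.params[bucket]]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, bucket, arm, reward):
        # reward: 1 for a good endpoint, 0 otherwise (assumed binary signal).
        if reward:
            self.params[bucket][arm][0] += 1.0
        else:
            self.params[bucket][arm][1] += 1.0

    def pull_counts(self, bucket):
        # Number of times each arm was chosen in this bucket.
        return [int(a + b - 2) for a, b in self.params[bucket]]


def simulate(steps=3000, seed=0):
    # Hypothetical success probabilities: rows are context buckets,
    # columns are candidate endpointing configurations (arms).
    true_p = [[0.9, 0.5, 0.2],
              [0.2, 0.5, 0.9]]
    env = random.Random(seed + 1)
    bandit = TabularThompsonBandit(n_buckets=2, n_arms=3, seed=seed)
    for _ in range(steps):
        bucket = env.randrange(2)            # observe a context
        arm = bandit.select(bucket)          # choose a configuration
        reward = 1 if env.random() < true_p[bucket][arm] else 0
        bandit.update(bucket, arm, reward)   # learn from the reward only
    return bandit


if __name__ == "__main__":
    trained = simulate()
    print(trained.pull_counts(0), trained.pull_counts(1))
```

No ground-truth label enters the loop: the learner sees only the reward for the configuration it actually chose, which is the property the abstract emphasizes. A deep variant would swap the table for a network that maps audio features to per-arm reward estimates.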

arXiv information

Authors	Do June Min, Andreas Stolcke, Anirudh Raju, Colin Vaz, Di He, Venkatesh Ravichandran, Viet Anh Trinh
Published	2023-03-23 16:28:26+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
