Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing

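Abstract

The Mixture of Experts (MoE) approach suits multilingual and code-switching (CS) tasks because of its multi-expert architecture.
This work introduces DLG-MoE, a Dynamic Language Group-based MoE optimized for bilingual and CS scenarios.
DLG-MoE is built on a hierarchical routing mechanism.
First, a language router explicitly models the language and dispatches each representation to the corresponding group of language experts.
Then, an unsupervised router within each language group implicitly models attributes beyond language and coordinates expert routing and collaboration.
The model achieves state-of-the-art (SOTA) performance while offering unmatched flexibility.
It supports inference with different top-k values, supports streaming, and can be pruned to obtain a monolingual sub-model.
The code will be released.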

Abstract (original)

The Mixture of Experts (MoE) approach is well-suited for multilingual and code-switching (CS) tasks due to its multi-expert architecture. This work introduces the DLG-MoE, a Dynamic Language Group-based MoE optimized for bilingual and CS scenarios. DLG-MoE operates based on a hierarchical routing mechanism. First, the language router explicitly models the language and dispatches the representations to the corresponding language expert groups. Subsequently, the unsupervised router within each language group implicitly models attributes beyond language, and coordinates expert routing and collaboration. The model achieves state-of-the-art (SOTA) performance while also having unparalleled flexibility. It supports different top-k inference and streaming capabilities, and can also prune the model parameters to obtain a monolingual sub-model. The Code will be released.
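
The two-level routing described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name `DLGMoESketch`, the method names, the random weights, and the use of plain linear maps as experts are all assumptions made for illustration. It shows a language router picking one expert group, a per-group router picking the top-k experts inside that group, and pruning to a single language group.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a matrix (list of rows) by a vector; one output per row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_matrix(rows, cols, rng):
    """Random matrix standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class DLGMoESketch:
    """Hypothetical toy model: language router -> group router -> top-k experts."""

    def __init__(self, dim, languages, experts_per_group, seed=0):
        rng = random.Random(seed)
        self.languages = list(languages)
        # Level 1: one logit per language.
        self.lang_router = make_matrix(len(self.languages), dim, rng)
        # Level 2: one router and one list of experts per language group.
        self.group_routers = {
            lang: make_matrix(experts_per_group, dim, rng) for lang in self.languages
        }
        self.experts = {
            lang: [make_matrix(dim, dim, rng) for _ in range(experts_per_group)]
            for lang in self.languages
        }

    def route(self, x, top_k=1):
        """Return the chosen language and [(expert_index, weight), ...]."""
        lang_probs = softmax(linear(self.lang_router, x))
        best = max(range(len(lang_probs)), key=lang_probs.__getitem__)
        lang = self.languages[best]
        gate = softmax(linear(self.group_routers[lang], x))
        chosen = sorted(range(len(gate)), key=gate.__getitem__, reverse=True)[:top_k]
        norm = sum(gate[i] for i in chosen)  # renormalize over the kept experts
        return lang, [(i, gate[i] / norm) for i in chosen]

    def forward(self, x, top_k=1):
        """Weighted sum of the selected experts' outputs for one frame."""
        lang, picks = self.route(x, top_k)
        out = [0.0] * len(x)
        for idx, weight in picks:
            y = linear(self.experts[lang][idx], x)
            out = [o + weight * yi for o, yi in zip(out, y)]
        return lang, out

    def prune_to(self, lang):
        """Keep a single language group, giving a monolingual sub-model."""
        idx = self.languages.index(lang)
        pruned = DLGMoESketch.__new__(DLGMoESketch)
        pruned.languages = [lang]
        pruned.lang_router = [self.lang_router[idx]]
        pruned.group_routers = {lang: self.group_routers[lang]}
        pruned.experts = {lang: self.experts[lang]}
        return pruned


if __name__ == "__main__":
    model = DLGMoESketch(dim=4, languages=["zh", "en"], experts_per_group=3)
    frame = [0.5, -1.0, 0.25, 2.0]  # made-up feature vector
    print(model.route(frame, top_k=2))
    print(model.prune_to("en").route(frame, top_k=1))
```

Because the top-k choice is made only at call time, the same parameters can be evaluated with different `top_k` values, which mirrors the flexibility the abstract claims. Pruning here simply discards the routers and experts of the other language groups.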

arXiv information

Authors	Hukai Huang, Shenghui Lu, Yahui Shan, He Qu, Wenhao Guan, Qingyang Hong, Lin Li
Published	2024-08-07 14:19:00+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
