LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices

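Abstract

LibriWASN is a data set whose design closely follows the LibriCSS meeting recognition data set, with one marked difference: the recording devices are placed at random positions on a meeting table, and their sampling clocks are not synchronized.
Nine different devices, five smartphones with a single recording channel each and four microphone arrays, record a total of 29 channels.
In all other respects the data set follows the LibriCSS design: the same LibriSpeech sentences are played back from eight loudspeakers arranged around a meeting table, and the data is organized into subsets with different percentages of speech overlap.
LibriWASN is intended as a test set for clock synchronization algorithms and for meeting separation, diarization, and transcription systems on ad-hoc wireless acoustic sensor networks.
Because of its similarity to LibriCSS, meeting transcription systems developed for LibriCSS can readily be tested on LibriWASN.
The data set was recorded in two different rooms and is complemented with ground-truth diarization information about who speaks when.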

Abstract (original)

We present LibriWASN, a data set whose design follows closely the LibriCSS meeting recognition data set, with the marked difference that the data is recorded with devices that are randomly positioned on a meeting table and whose sampling clocks are not synchronized. Nine different devices, five smartphones with a single recording channel and four microphone arrays, are used to record a total of 29 channels. Other than that, the data set follows closely the LibriCSS design: the same LibriSpeech sentences are played back from eight loudspeakers arranged around a meeting table and the data is organized in subsets with different percentages of speech overlap. LibriWASN is meant as a test set for clock synchronization algorithms, meeting separation, diarization and transcription systems on ad-hoc wireless acoustic sensor networks. Due to its similarity to LibriCSS, meeting transcription systems developed for the former can readily be tested on LibriWASN. The data set is recorded in two different rooms and is complemented with ground-truth diarization information of who speaks when.
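To illustrate why unsynchronized sampling clocks matter for a data set like this, the following minimal sketch estimates how far two recordings drift apart over time. It is not taken from the paper: the function name `drift_in_samples`, the 10 ppm sampling rate offset, the 16 kHz sampling rate, and the 10-minute duration are all assumptions chosen for illustration.

```python
def drift_in_samples(sro_ppm: float, sample_rate_hz: float, duration_s: float) -> float:
    """Return how many samples a device drifts from a reference clock.

    sro_ppm is the sampling rate offset in parts per million (hypothetical
    value here, not a measurement from LibriWASN).
    """
    return sro_ppm * 1e-6 * sample_rate_hz * duration_s


# Assumed illustrative values: 10 ppm offset, 16 kHz audio, 10-minute recording.
sample_rate_hz = 16_000
drift = drift_in_samples(sro_ppm=10.0, sample_rate_hz=sample_rate_hz, duration_s=600.0)
drift_ms = 1000.0 * drift / sample_rate_hz
print(f"drift: {drift:.1f} samples ({drift_ms:.1f} ms)")
```

Under these assumed values the drift comes to roughly 96 samples, or about 6 ms, which is large enough to disturb multi-channel processing that relies on sample-accurate alignment across devices.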

arXiv information

Authors	Joerg Schmalenstroeer, Tobias Gburrek, Reinhold Haeb-Umbach
Published	2023-08-21 12:33:35+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
