Recurrent Neural Language Models as Probabilistic Finite-state Automata

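Summary

Studying language models (LMs) through well-understood formalisms lets us characterize their abilities and limitations precisely.
Previous work has examined the representational capacity of recurrent neural network (RNN) LMs in terms of their ability to recognize unweighted formal languages.
However, LMs do not describe unweighted formal languages; they define probability distributions over strings.
This work studies which classes of such probability distributions RNN LMs can represent, which permits more direct statements about their capabilities.
We show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata and can therefore model only a strict subset of the probability distributions expressible by finite-state models.
We also study the space complexity of representing finite-state LMs with RNNs.
We show that representing an arbitrary deterministic finite-state LM with $N$ states over an alphabet $\Sigma$ requires an RNN with $\Omega\left(N |\Sigma|\right)$ neurons.
These results are a first step toward characterizing the classes of distributions that RNN LMs can represent, and they help clarify the capabilities and limitations of RNN LMs.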

Abstract (original)

Studying language models (LMs) in terms of well-understood formalisms allows us to precisely characterize their abilities and limitations. Previous work has investigated the representational capacity of recurrent neural network (RNN) LMs in terms of their capacity to recognize unweighted formal languages. However, LMs do not describe unweighted formal languages; rather, they define probability distributions over strings. In this work, we study what classes of such probability distributions RNN LMs can represent, which allows us to make more direct statements about their capabilities. We show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata, and can thus model a strict subset of probability distributions expressible by finite-state models. Furthermore, we study the space complexity of representing finite-state LMs with RNNs. We show that, to represent an arbitrary deterministic finite-state LM with $N$ states over an alphabet $\Sigma$, an RNN requires $\Omega\left(N |\Sigma|\right)$ neurons. These results present a first step towards characterizing the classes of distributions RNN LMs can represent and thus help us understand their capabilities and limitations.
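
To make the $N |\Sigma|$ figure concrete, the following is a minimal sketch of how a deterministic probabilistic finite-state automaton can be simulated by a threshold-activated recurrent network whose hidden state is one-hot over (state, symbol) pairs. It is an illustration under stated assumptions, not necessarily the authors' construction: the toy automaton (`DELTA`, `FINAL`), the helper names, and the plain probability-lookup readout (used here in place of a softmax) are all hypothetical choices made for this example.

```python
# Toy deterministic PFSA (hypothetical): 2 states over the alphabet {a, b}.
SYMBOLS = ["a", "b"]
SYM_ID = {y: i for i, y in enumerate(SYMBOLS)}
N_STATES = 2
INITIAL_STATE = 0
# DELTA[q][y] = (next state, probability of emitting y in state q)
DELTA = {
    0: {"a": (1, 0.5), "b": (0, 0.3)},
    1: {"a": (0, 0.2), "b": (1, 0.4)},
}
# Probability of ending the string in each state; per state, the
# outgoing probabilities plus FINAL[q] sum to 1.
FINAL = {0: 0.2, 1: 0.4}


def pfsa_prob(string):
    """Probability of `string` under the automaton itself."""
    q, p = INITIAL_STATE, 1.0
    for y in string:
        q, w = DELTA[q][y]
        p *= w
    return p * FINAL[q]


# RNN simulation: one hidden neuron per (state, symbol) pair.
HIDDEN = N_STATES * len(SYMBOLS)


def cell(q, y):
    """Index of the neuron meaning 'now in state q, having just read y'."""
    return q * len(SYMBOLS) + SYM_ID[y]


U = [[0] * HIDDEN for _ in range(HIDDEN)]               # recurrence matrix
V = [[0] * len(SYMBOLS) for _ in range(HIDDEN)]         # input matrix
E = [[0.0] * HIDDEN for _ in range(len(SYMBOLS) + 1)]   # readout; last row = end of string
for q in range(N_STATES):
    for y_prev in SYMBOLS:
        col = cell(q, y_prev)
        for y, (q_next, w) in DELTA[q].items():
            U[cell(q_next, y)][col] = 1   # cells reachable from state q
            E[SYM_ID[y]][col] = w         # next-symbol probability in state q
        E[-1][col] = FINAL[q]
    for y in SYMBOLS:
        V[cell(q, y)][SYM_ID[y]] = 1      # cells consistent with input y


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_step(h, y):
    """h_t = H(U h_{t-1} + V x_t - 1), with H the Heaviside step function."""
    x = [1 if i == SYM_ID[y] else 0 for i in range(len(SYMBOLS))]
    pre = [u + v - 1 for u, v in zip(matvec(U, h), matvec(V, x))]
    return [1 if z > 0 else 0 for z in pre]


def rnn_prob(string):
    """Probability of `string` computed from the RNN's hidden states."""
    h = [0] * HIDDEN
    h[cell(INITIAL_STATE, SYMBOLS[0])] = 1  # the symbol part is arbitrary at t = 0
    p = 1.0
    for y in string:
        p *= matvec(E, h)[SYM_ID[y]]
        h = rnn_step(h, y)
    return p * matvec(E, h)[-1]
```

In this sketch the pre-activation exceeds zero only for the single cell that is both reachable from the current state and consistent with the input symbol, so the hidden state stays one-hot and `rnn_prob` agrees with `pfsa_prob`. The network uses `N_STATES * len(SYMBOLS)` neurons, which is of the same order as the $\Omega\left(N |\Sigma|\right)$ bound stated in the abstract.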

arXiv information

Authors	Anej Svete, Ryan Cotterell
Published	2023-12-19 10:13:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
