Dependency Parsing is More Parameter-Efficient with Normalization

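Summary

Dependency parsing is the task of inferring the structure of natural language, and it is often approached by modeling word interactions with attention computed through biaffine scoring.
This mechanism works like self-attention in Transformers: a score is computed for every pair of words in a sentence.
Unlike Transformer attention, however, biaffine scoring applies no normalization before the softmax is taken over the scores.
The paper presents theoretical evidence and empirical results showing that this lack of normalization necessarily leads to overparameterized parser models, in which the extra parameters compensate for the sharp softmax outputs produced by high-variance inputs to the biaffine scoring function.
The authors argue that normalizing the scores makes biaffine scoring substantially more efficient.
They run experiments on six datasets for semantic and syntactic dependency parsing, using a one-hop parser.
They train N-layer stacked BiLSTMs and evaluate parser performance with and without normalization of the biaffine scores.
With normalization, the parser beats the state of the art on two datasets while using fewer samples and fewer trainable parameters.
Code: https://anonymous.4open.science/r/EfficientSDP-70C1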

Summary (original)

Dependency parsing is the task of inferring natural language structure, often approached by modeling word interactions via attention through biaffine scoring. This mechanism works like self-attention in Transformers, where scores are calculated for every pair of words in a sentence. However, unlike Transformer attention, biaffine scoring does not use normalization prior to taking the softmax of the scores. In this paper, we provide theoretical evidence and empirical results revealing that a lack of normalization necessarily results in overparameterized parser models, where the extra parameters compensate for the sharp softmax outputs produced by high variance inputs to the biaffine scoring function. We argue that biaffine scoring can be made substantially more efficient by performing score normalization. We conduct experiments on six datasets for semantic and syntactic dependency parsing using a one-hop parser. We train N-layer stacked BiLSTMs and evaluate the parser’s performance with and without normalizing biaffine scores. Normalizing allows us to beat the state of the art on two datasets, with fewer samples and trainable parameters. Code: https://anonymous.4open.science/r/EfficientSDP-70C1
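
The sharpening effect described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it keeps only the bilinear term of a biaffine scorer, uses random Gaussian vectors in place of BiLSTM outputs, and assumes the normalization is a division by the square root of the hidden size, as in Transformer scaled dot-product attention. The function names, dimensions, and the use of entropy as a sharpness measure are all illustrative choices.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(ps):
    """Shannon entropy in nats; low entropy means a sharp distribution."""
    return -sum(p * math.log(p) for p in ps if p > 0.0)


def bilinear_scores(deps, heads, weight):
    """scores[i][j] = deps[i]^T W heads[j], one score per word pair.

    Assumption: only the bilinear term is modeled; the linear and bias
    terms of a full biaffine scorer are left out for brevity.
    """
    dim = len(weight)
    scores = []
    for dep in deps:
        # Project the dependent vector through W, then dot with each head.
        projected = [sum(dep[a] * weight[a][b] for a in range(dim))
                     for b in range(dim)]
        scores.append([sum(p * h for p, h in zip(projected, head))
                       for head in heads])
    return scores


def mean_row_entropy(scores, scale=1.0):
    """Average entropy of the per-word softmax over candidate heads."""
    rows = [softmax([s / scale for s in row]) for row in scores]
    return sum(entropy(r) for r in rows) / len(rows)


def demo(n_words=10, dim=64, seed=0):
    """Compare softmax sharpness with and without score scaling."""
    rng = random.Random(seed)
    # Hypothetical stand-ins for encoder outputs: unit-variance Gaussians.
    deps = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_words)]
    heads = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_words)]
    # Weights with variance 1/dim, so raw score variance grows roughly with dim.
    weight = [[rng.gauss(0.0, 1.0) / math.sqrt(dim) for _ in range(dim)]
              for _ in range(dim)]
    scores = bilinear_scores(deps, heads, weight)
    raw = mean_row_entropy(scores)                          # no normalization
    scaled = mean_row_entropy(scores, scale=math.sqrt(dim))  # assumed 1/sqrt(d)
    return raw, scaled


if __name__ == "__main__":
    raw, scaled = demo()
    print(f"mean entropy without scaling: {raw:.4f}")
    print(f"mean entropy with scaling:    {scaled:.4f}")
```

Dividing the scores by a constant greater than one acts as a higher softmax temperature, so the scaled distribution always has entropy at least as high as the unscaled one. In this toy setting the unscaled scores have variance on the order of the hidden size, which is the high-variance input regime the abstract associates with sharp softmax outputs.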

arXiv information

Authors	Paolo Gajo, Domenic Rosati, Hassan Sajjad, Alberto Barrón-Cedeño
Published	2025-05-26 16:56:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
