Probing Internal Representations of Multi-Word Verbs in Large Language Models

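Summary

This study examines how transformer-based large language models (LLMs) internally represent verb-particle combinations, known as multi-word verbs, and specifically how the models capture lexical and syntactic properties at different network layers.
Using the BERT architecture, we analyze the layer-wise representations of two verb-particle constructions: phrasal verbs such as 'give up' and prepositional verbs such as 'look at'.
Our method trains probing classifiers on the internal representations to classify these categories at both the word level and the sentence level.
The results show that the model's middle layers reach the highest classification accuracy.
To analyze the nature of these distinctions further, we run a data separability test using the Generalized Discrimination Value (GDV).
The GDV results show only weak linear separability between the two verb types, yet the probing classifiers still reach high accuracy, which suggests that the representations of these linguistic categories may be non-linearly separable.
This agrees with earlier work showing that linguistic distinctions in neural networks are not always encoded in a linearly separable way.
These findings give computational support to usage-based claims about the representation of verb-particle constructions and highlight the complex interaction between neural network architectures and linguistic structure.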
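
To make the idea of a probing classifier concrete, here is a minimal sketch in pure Python. It is not the authors' setup: the abstract does not say which classifier or features were used, so a plain logistic-regression probe is assumed, and the vectors are synthetic stand-ins rather than BERT hidden states. The layer names `no_signal` and `with_signal`, the helper `fake_layer`, and every hyperparameter are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(X, y, epochs=50, lr=0.1):
    """Fit a logistic-regression probe with plain SGD; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def accuracy(w, b, X, y):
    """Fraction of examples whose predicted class matches the label."""
    preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def fake_layer(signal, labels, dim, rng):
    """Synthetic stand-in for one layer: unit Gaussian noise, plus a class
    offset of +/- signal on dimension 0 (assumption, not real model data)."""
    rows = []
    for t in labels:
        offset = signal if t == 1 else -signal
        rows.append([rng.gauss(offset if d == 0 else 0.0, 1.0) for d in range(dim)])
    return rows


rng = random.Random(0)
labels = [i % 2 for i in range(300)]  # 0 = phrasal verb, 1 = prepositional verb
layers = {"no_signal": 0.0, "with_signal": 2.0}  # hypothetical signal strengths

results = {}
for name, signal in layers.items():
    X = fake_layer(signal, labels, dim=8, rng=rng)
    # Train one probe per layer on the first 200 examples, test on the rest.
    w, b = train_probe(X[:200], labels[:200])
    results[name] = accuracy(w, b, X[200:], labels[200:])

print(results)
```

With real data, each entry in `layers` would be replaced by the vectors extracted from one layer of the model, and the per-layer test accuracies would be compared to see where the distinction is easiest to decode.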

Summary (original)

This study investigates the internal representations of verb-particle combinations, called multi-word verbs, within transformer-based large language models (LLMs), specifically examining how these models capture lexical and syntactic properties at different neural network layers. Using the BERT architecture, we analyze the representations of its layers for two different verb-particle constructions: phrasal verbs like ‘give up’ and prepositional verbs like ‘look at’. Our methodology includes training probing classifiers on the internal representations to classify these categories at both word and sentence levels. The results indicate that the model’s middle layers achieve the highest classification accuracies. To further analyze the nature of these distinctions, we conduct a data separability test using the Generalized Discrimination Value (GDV). While GDV results show weak linear separability between the two verb types, probing classifiers still achieve high accuracy, suggesting that representations of these linguistic categories may be non-linearly separable. This aligns with previous research indicating that linguistic distinctions in neural networks are not always encoded in a linearly separable manner. These findings computationally support usage-based claims on the representation of verb-particle constructions and highlight the complex interaction between neural network architectures and linguistic structures.
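
The abstract names the Generalized Discrimination Value but does not define it. The sketch below follows the definition as it is usually given in the GDV literature, which should be treated as an assumption here: each dimension is z-scored and scaled by 0.5, then the mean inter-class distance is subtracted from the mean intra-class distance and the result is divided by the square root of the dimensionality. Under that definition, values near 0 indicate little class separation and more negative values indicate better separated classes. The helper names and the synthetic Gaussian data are hypothetical.

```python
import math
import random
from itertools import combinations


def zscore_half(points):
    """Z-score every dimension, then scale by 0.5 (assumed GDV preprocessing)."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n) or 1.0
        for d in range(dims)
    ]
    return [[0.5 * (p[d] - means[d]) / stds[d] for d in range(dims)] for p in points]


def mean_intra(cluster):
    """Mean Euclidean distance over all point pairs inside one class."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def mean_inter(c1, c2):
    """Mean Euclidean distance over all point pairs across two classes."""
    return sum(math.dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def gdv(points, labels):
    """Mean intra-class minus mean inter-class distance, scaled by 1/sqrt(D)."""
    scaled = zscore_half(points)
    classes = sorted(set(labels))
    clusters = [[s for s, l in zip(scaled, labels) if l == c] for c in classes]
    n_cls = len(clusters)
    intra = sum(mean_intra(c) for c in clusters) / n_cls
    inter = sum(mean_inter(a, b) for a, b in combinations(clusters, 2))
    inter *= 2.0 / (n_cls * (n_cls - 1))
    return (intra - inter) / math.sqrt(len(points[0]))


def blob(center, n, dim, rng):
    """Synthetic Gaussian cluster (stand-in for one class of representations)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
labels = [0] * 40 + [1] * 40

# Two classes far apart versus two classes drawn from the same distribution.
separated = blob(0.0, 40, 8, rng) + blob(6.0, 40, 8, rng)
overlapping = blob(0.0, 40, 8, rng) + blob(0.0, 40, 8, rng)

gdv_separated = gdv(separated, labels)
gdv_overlapping = gdv(overlapping, labels)
print(gdv_separated, gdv_overlapping)
```

Because the measure depends only on pairwise distances, it can report weak separation for classes that a trained classifier still tells apart, which is the contrast the abstract draws between the GDV results and the probing accuracies.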

arXiv information

Authors	Hassane Kissane, Achim Schilling, Patrick Krauss
Published	2025-02-07 09:49:13+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
