Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution


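This study investigates stylistic representations of characters, built by encoding their quotes with off-the-shelf pretrained Authorship Verification models, in a large corpus of English novels (the Project Dialogism Novel Corpus).
However, because the results vary across novels, further investigation is needed into stylometric models tailored specifically to literary texts and to the study of characters.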


Recent approaches to automatically detecting the speaker of an utterance of direct speech often disregard general information about characters in favor of local information found in the context, such as surrounding mentions of entities. In this work, we explore stylistic representations of characters built by encoding their quotes with off-the-shelf pretrained Authorship Verification models in a large corpus of English novels (the Project Dialogism Novel Corpus). Results suggest that the combination of stylistic and topical information captured in some of these models accurately distinguishes characters from one another, but does not necessarily improve over semantic-only models when attributing quotes. However, these results vary across novels, and stylometric models tailored specifically to literary texts and the study of characters warrant further investigation.
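
The general idea described in the abstract (encode each quote, pool the quote vectors into one representation per character, then attribute a new quote to the closest character) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: `toy_encode` is a hypothetical stand-in (hashed character trigrams) for a pretrained Authorship Verification model, and mean pooling with cosine similarity is an assumed choice that the abstract does not specify.

```python
import math
import zlib
from typing import Dict, List

DIM = 256  # size of the toy embedding space (arbitrary assumption)


def toy_encode(quote: str) -> List[float]:
    """Hypothetical stand-in encoder: hashed character-trigram counts.

    A real experiment would replace this with a pretrained
    Authorship Verification model that maps text to a vector.
    """
    vec = [0.0] * DIM
    text = quote.lower()
    for i in range(len(text) - 2):
        trigram = text[i:i + 3].encode("utf-8")
        vec[zlib.crc32(trigram) % DIM] += 1.0  # deterministic hash bucket
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # L2-normalise


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_character_vectors(
    quotes_by_character: Dict[str, List[str]]
) -> Dict[str, List[float]]:
    """Mean-pool quote embeddings per character (assumes >= 1 quote each)."""
    vectors = {}
    for name, quotes in quotes_by_character.items():
        embs = [toy_encode(q) for q in quotes]
        vectors[name] = [sum(col) / len(embs) for col in zip(*embs)]
    return vectors


def attribute(quote: str, character_vectors: Dict[str, List[float]]) -> str:
    """Return the character whose pooled vector is most similar to the quote."""
    q = toy_encode(quote)
    return max(character_vectors, key=lambda n: cosine(q, character_vectors[n]))
```

For example, `build_character_vectors({"A": [...], "B": [...]})` followed by `attribute(some_quote, vectors)` returns the name of the best-matching character; the character names and quotes are placeholders to be supplied by the reader.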


Authors: Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara
Published: 2024-01-30 12:49:40+00:00
Category: cs.CL