StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

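Summary

Authorship obfuscation, the rewriting of a text to deliberately obscure its author's identity, is an important but difficult task.
Current methods that use large language models (LLMs) lack interpretability and controllability and often ignore author-specific stylistic features, which makes their overall performance less robust.
To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text.
StyleRemix uses pre-trained Low-Rank Adaptation (LoRA) modules to rewrite an input along specific stylistic axes (for example, formality and length) while keeping computational cost low.
In both automatic and human evaluation, StyleRemix outperforms state-of-the-art baselines and much larger LLMs across a variety of domains.
We also release AuthorMix, a large collection of 30K high-quality long-form texts from a diverse set of 14 authors and 4 domains, and DiSC, a parallel corpus of 1,500 texts spanning seven style axes in 16 unique directions.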

Abstract (original)

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite an input specifically along various stylistic axes (e.g., formality and length) while maintaining low computational cost. StyleRemix outperforms state-of-the-art baselines and much larger LLMs in a variety of domains as assessed by both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K high-quality, long-form texts from a diverse set of 14 authors and 4 domains, and DiSC, a parallel corpus of 1,500 texts spanning seven style axes in 16 unique directions.

arXiv information

Authors	Jillian Fisher, Skyler Hallinan, Ximing Lu, Mitchell Gordon, Zaid Harchaoui, Yejin Choi
Published	2024-08-28 09:35:15+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
