TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer

Abstract

Text style is highly abstract, as it encompasses various aspects of a speaker’s characteristics, habits, logical thinking, and the content they express. However, previous text-style transfer tasks have primarily focused on data-driven approaches and lack in-depth analysis from the perspectives of linguistics and cognitive science. In this paper, we introduce a novel task called Text Speech-Style Transfer (TSST). The main objective is to further explore topics related to human cognition, such as personality and emotion, based on the capabilities of existing large language models (LLMs). Considering the objective of our task and the distinctive characteristics of oral speech in real-life scenarios, we trained multi-dimensional (i.e., filler words, vividness, interactivity, emotionality) evaluation models for TSST and validated their correlation with human assessments. We thoroughly analyze the performance of several LLMs and identify areas where further improvement is needed. Moreover, driven by our evaluation models, we have released a new corpus that improves the capabilities of LLMs in generating text with speech-style characteristics. In summary, we present the TSST task, a new style-transfer benchmark that emphasizes human-oriented evaluation and that explores and advances the performance of current LLMs.

arXiv information

Authors	Huashan Sun, Yixiao Wu, Yinghao Li, Jiawei Li, Yizhe Yang, Yang Gao
Published	2023-11-14 18:50:51+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
