EmPO: Emotion Grounding for Empathetic Response Generation through Preference Optimization

Summary

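Generating empathetic responses is a desirable capability of conversational agents and is important for engaging, emotionally intelligent multi-turn conversations between humans and machines.
Large language models have shown promising results on this task, but two challenges remain: ensuring the empathetic quality of responses and preserving the models' generalization performance.
To address both, we propose a new approach that builds theory-driven preference datasets based on emotion grounding and uses them to align LLMs with preference optimization algorithms.
We evaluate empathetic response generation on the EmpatheticDialogues dataset, assessing empathy with the diff-Epitome and BERTscore metrics and with multi-dimensional human evaluation.
We also measure diversity and emotional valence with feature-based methods.
In addition, we evaluate the effect of training on generalization performance using the MMLU benchmark and tasks from the Open LLM Leaderboard.
The results show that LLMs can be aligned for empathetic response generation through preference optimization while retaining their general performance, and that emotion grounding can guide the creation of preference datasets.
All datasets, source code, and models are publicly available.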
https://github.com/justtherightsize/empo

Abstract (original)

Empathetic response generation is a desirable aspect of conversational agents, crucial for facilitating engaging and emotionally intelligent multi-turn conversations between humans and machines. Leveraging large language models for this task has shown promising results, yet challenges persist in ensuring both the empathetic quality of the responses and retention of the generalization performance of the models. We propose a novel approach where we construct theory-driven preference datasets based on emotion grounding and use them to align LLMs with preference optimization algorithms to address these challenges. To evaluate empathetic response generation, we employ the EmpatheticDialogues dataset, assessing empathy with the diff-Epitome and BERTscore metrics and with multi-dimensional human evaluation. Additionally, we measure diversity and emotional valence using feature-based methods. We also evaluate the impact of training on the generalization performance using the MMLU benchmark and tasks from the Open LLM Leaderboard. The results show that LLMs can be aligned for empathetic response generation by preference optimization while retaining their general performance and that emotion grounding can guide preference dataset creation. We make all datasets, source code, and models publicly available. https://github.com/justtherightsize/empo
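
The abstract does not say how a preference pair is built or which preference optimization algorithm is used. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes that a dialogue's reference response is the preferred answer and that a response from a dialogue with a contrasting emotion label is the rejected one. It scores a pair with the DPO loss, used here as one widely known preference optimization objective. The `OPPOSITE` map, the field names, and both function names are hypothetical.

```python
import math
import random

# Hypothetical map of contrasting emotion labels. This is an assumption made
# for illustration, not the mapping used by the authors.
OPPOSITE = {"joyful": "sad", "sad": "joyful", "angry": "afraid", "afraid": "angry"}


def build_preference_pairs(dialogues, seed=0):
    """Pair each reference response (chosen) with a response taken from a
    dialogue whose emotion label is the contrasting one (rejected)."""
    rng = random.Random(seed)
    by_emotion = {}
    for d in dialogues:
        by_emotion.setdefault(d["emotion"], []).append(d)

    pairs = []
    for d in dialogues:
        candidates = by_emotion.get(OPPOSITE.get(d["emotion"]), [])
        if not candidates:
            continue  # no contrasting example available, so skip this dialogue
        rejected = rng.choice(candidates)["response"]
        pairs.append(
            {"prompt": d["prompt"], "chosen": d["response"], "rejected": rejected}
        )
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single pair: -log(sigmoid(beta * margin)), where the
    margin compares policy and reference log-probabilities of both responses."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # -log(sigmoid(x)) equals softplus(-x); this form is numerically stable.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

In practice the log-probabilities would come from the model being trained and from a frozen reference copy. The loss shrinks as the trained model favors the chosen response more strongly than the reference model does.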

arXiv information

Authors	Ondrej Sotolar, Vojtech Formanek, Alok Debnath, Allison Lahnala, Charles Welch, Lucie Flek
Published	2024-09-17 14:24:47+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
