Do Multi-Document Summarization Models Synthesize?

Abstract

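Multi-document summarization requires producing a concise synopsis of a collection of inputs.
In some applications, the synopsis must accurately synthesize the inputs with respect to a key aspect. For example, a synopsis of the film reviews written about a particular movie should reflect the average critic consensus.
As a more consequential example, the narrative summaries that accompany biomedical systematic reviews of clinical trial results must accurately summarize potentially conflicting results from individual trials.
In this paper we ask to what extent modern multi-document summarization models implicitly perform this kind of synthesis.
We run experiments on opinion and evidence synthesis datasets using a suite of summarization models, ranging from fine-tuned transformers to GPT-4.
We find that existing models perform synthesis partially but imperfectly: even the best-performing models are over-sensitive to changes in input ordering and under-sensitive to changes in input composition (for example, the ratio of positive to negative reviews).
We propose a simple, general, and effective method for improving model synthesis capabilities: generate an explicitly diverse set of candidate outputs, then select from them the string best aligned with the expected aggregate measure for the inputs, or abstain when the model produces no good candidate.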

Abstract (original)

Multi-document summarization entails producing concise synopses of collections of inputs. For some applications, the synopsis should accurately synthesize inputs with respect to a key aspect, e.g., a synopsis of film reviews written about a particular movie should reflect the average critic consensus. As a more consequential example, narrative summaries that accompany biomedical systematic reviews of clinical trial results should accurately summarize the potentially conflicting results from individual trials. In this paper we ask: To what extent do modern multi-document summarization models implicitly perform this sort of synthesis? We run experiments over opinion and evidence synthesis datasets using a suite of summarization models, from fine-tuned transformers to GPT-4. We find that existing models partially perform synthesis, but imperfectly: even the best performing models are over-sensitive to changes in input ordering and under-sensitive to changes in input compositions (e.g., ratio of positive to negative reviews). We propose a simple, general, effective method for improving model synthesis capabilities by generating an explicitly diverse set of candidate outputs, and then selecting from these the string best aligned with the expected aggregate measure for the inputs, or abstaining when the model produces no good candidate.
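
The select-or-abstain step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_or_abstain`, the scoring callback `score_fn`, the use of a plain mean as the "expected aggregate measure", and the `tolerance` threshold are all assumptions made here for illustration. In practice the candidates would come from a summarization model and the scores from a sentiment or effect-direction classifier.

```python
from statistics import mean
from typing import Callable, Optional, Sequence


def select_or_abstain(
    candidates: Sequence[str],
    input_scores: Sequence[float],
    score_fn: Callable[[str], float],
    tolerance: float = 0.1,
) -> Optional[str]:
    """Return the candidate whose score is closest to the mean input score.

    Returns None (abstains) when there is nothing to choose from or when even
    the closest candidate is further than `tolerance` from the target.
    All names and the mean-based target are hypothetical, for illustration only.
    """
    if not candidates or not input_scores:
        return None
    # Assumed aggregate measure: the plain mean of the per-input scores.
    target = mean(input_scores)
    # Pick the candidate whose score has the smallest gap to the target.
    best = min(candidates, key=lambda c: abs(score_fn(c) - target))
    if abs(score_fn(best) - target) > tolerance:
        return None  # no candidate is good enough: abstain
    return best


# Toy usage: a lookup table stands in for a real sentiment scorer.
toy_scores = {"mostly positive": 0.7, "mixed": 0.5, "mostly negative": 0.2}
chosen = select_or_abstain(list(toy_scores), [0.9, 0.8, 0.6, 0.4], toy_scores.__getitem__)
abstained = select_or_abstain(list(toy_scores), [0.0, 0.0], toy_scores.__getitem__)
```

In the toy usage, the first call picks the candidate scored nearest the mean review score, and the second abstains because no candidate is within the tolerance of the all-negative inputs. Returning `None` keeps the abstention explicit so a caller cannot mistake it for a summary.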

arXiv information

Authors	Jay DeYoung, Stephanie C. Martinez, Iain J. Marshall, Byron C. Wallace
Published	2024-07-12 14:24:46+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
