Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning

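Summary

A vast amount of scholarly work is published every day, but much of it is inaccessible to the general public because of dense jargon and complex language.
To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions.
Guided by a carefully balanced combination of word-level and sentence-level accessibility rewards, our language model effectively replaces technical terms with more accessible alternatives.
Our best model lowers the readability level of scholarly abstracts by about six U.S. grade levels, that is, from a postgraduate level to a high school level.
This amounts to a relative improvement of roughly 90% over the supervised fine-tuning baseline, while maintaining factual accuracy and high-quality language.
A detailed analysis of our approach shows that balanced rewards lead to systematic changes in the base model, which likely contribute to smoother optimization and better performance.
We see this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and people without a college degree.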

Summary (original)

A vast amount of scholarly work is published daily, yet much of it remains inaccessible to the general public due to dense jargon and complex language. To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions. Guided by a carefully balanced combination of word- and sentence-level accessibility rewards, our language model effectively substitutes technical terms with more accessible alternatives, a task which models supervised fine-tuned or guided by conventional readability measures struggle to accomplish. Our best model adjusts the readability level of scholarly abstracts by approximately six U.S. grade levels — in other words, from a postgraduate to a high school level. This translates to roughly a 90% relative boost over the supervised fine-tuning baseline, all while maintaining factual accuracy and high-quality language. An in-depth analysis of our approach shows that balanced rewards lead to systematic modifications in the base model, likely contributing to smoother optimization and superior performance. We envision this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and those without a college degree.
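The abstract reports readability in U.S. grade levels but does not say which measure produces them. As a rough illustration only, the sketch below uses the well-known Flesch-Kincaid grade level formula, which is an assumption here and not necessarily the measure the authors used. The helper names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic, assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate a U.S. grade level with the Flesch-Kincaid formula.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters (and apostrophes) as words.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    dense = ("Reinforcement learning facilitates optimization of "
             "accessibility-oriented reward functions.")
    plain = "The cat sat on the mat. The dog ran to the park."
    # The jargon-heavy sentence should score a higher grade level.
    print(flesch_kincaid_grade(dense) > flesch_kincaid_grade(plain))
```

A score like this works at the sentence level (length and syllable density). The abstract notes that models guided only by conventional readability measures struggle to swap out technical terms, which is the stated motivation for pairing such a signal with a word-level reward.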

arXiv information

Authors	Haining Wang, Jason Clark, Hannah McKelvey, Leila Sterman, Zheng Gao, Zuoyu Tian, Sandra Kübler, Xiaozhong Liu
Published	2025-04-16 16:00:53+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
