Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students’ (Mis)Understanding Is Hinted

Summary

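This study develops and evaluates AnaQuest, a novel prompting technique for generating multiple-choice questions (MCQs) with a pre-trained large language model.
In AnaQuest, the choice items are sentence-level assertions about complex concepts.
The technique integrates formative and summative assessment.
In the formative phase, students answer open-ended questions about the target concepts in free text.
For summative assessment, AnaQuest analyzes these responses to generate both correct and incorrect assertions.
To evaluate the validity of the generated MCQs, Item Response Theory (IRT) was applied to compare item characteristics among MCQs generated by AnaQuest, MCQs generated by a baseline ChatGPT prompt, and human-crafted items.
An empirical study found that expert instructors rated the MCQs generated by both AI models as being as valid as those created by human instructors.
However, the IRT-based analysis showed that AnaQuest-generated questions, particularly those with incorrect assertions (foils), resembled human-crafted items more closely in difficulty and discrimination than those produced by ChatGPT did.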

Summary (original)

The primary goal of this study is to develop and evaluate an innovative prompting technique, AnaQuest, for generating multiple-choice questions (MCQs) using a pre-trained large language model. In AnaQuest, the choice items are sentence-level assertions about complex concepts. The technique integrates formative and summative assessments. In the formative phase, students answer open-ended questions for target concepts in free text. For summative assessment, AnaQuest analyzes these responses to generate both correct and incorrect assertions. To evaluate the validity of the generated MCQs, Item Response Theory (IRT) was applied to compare item characteristics between MCQs generated by AnaQuest, a baseline ChatGPT prompt, and human-crafted items. An empirical study found that expert instructors rated MCQs generated by both AI models to be as valid as those created by human instructors. However, IRT-based analysis revealed that AnaQuest-generated questions – particularly those with incorrect assertions (foils) – more closely resembled human-crafted items in terms of difficulty and discrimination than those produced by ChatGPT.
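The abstract compares items by their IRT difficulty and discrimination but does not say which IRT model was fitted. As a rough illustration only, the sketch below assumes the common two-parameter logistic (2PL) model, in which a student with ability theta answers an item correctly with probability 1 / (1 + exp(-a * (theta - b))), where a is the item's discrimination and b its difficulty. The function name and the two example items are hypothetical and are not taken from the paper.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL item response function (assumed model, not confirmed by the paper).

    theta: student ability
    a:     item discrimination (how sharply the item separates abilities)
    b:     item difficulty (ability at which P(correct) = 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items, chosen only to contrast the two parameters.
items = {
    "easy, weakly discriminating": {"a": 0.5, "b": -1.0},
    "hard, strongly discriminating": {"a": 2.0, "b": 1.0},
}

if __name__ == "__main__":
    for label, params in items.items():
        # Evaluate each item for a low-, mid-, and high-ability student.
        probs = [p_correct(theta, **params) for theta in (-2.0, 0.0, 2.0)]
        print(label, [round(p, 3) for p in probs])
```

Under this assumed model, two sets of items resemble each other when their fitted (a, b) values fall in similar ranges, which is the kind of comparison the abstract describes between AnaQuest, baseline ChatGPT, and human-crafted items.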

arXiv information

Authors	Machi Shimmei, Masaki Uto, Yuichiroh Matsubayashi, Kentaro Inui, Aditi Mallavarapu, Noboru Matsuda
Published	2025-05-09 06:33:55+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
