Does Multiple Choice Have a Future in the Age of Generative AI? A Posttest-only RCT

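Summary

The role of multiple-choice questions (MCQs) as effective learning tools has been debated in past research.
MCQs are widely used because they are easy to grade, but advances in large language models (LLMs) for automated grading have led to open-response questions being used increasingly for instruction.
This study evaluates the effectiveness of MCQs for learning relative to open-response questions, both individually and in combination.
These activities are embedded in six tutor lessons on advocacy.
Using a posttest-only randomized control design, we compare the performance of 234 tutors (790 lesson completions) across three conditions: MCQ only, open response only, and a combination of both.
At posttest, we found no significant differences in learning across conditions, but tutors in the MCQ condition took significantly less time to complete instruction.
These findings suggest that, when practice time is limited, MCQs are as effective for learning as open-response tasks and more efficient.
To further improve efficiency, we autograded open responses using GPT-4o and GPT-4-turbo.
The GPT models show proficiency for low-stakes assessment, though further research is needed before broader use.
This study contributes a dataset of lesson log data, human annotation rubrics, and LLM prompts to promote transparency and reproducibility.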

Summary (original)

The role of multiple-choice questions (MCQs) as effective learning tools has been debated in past research. While MCQs are widely used due to their ease in grading, open response questions are increasingly used for instruction, given advances in large language models (LLMs) for automated grading. This study evaluates the effectiveness of MCQs relative to open-response questions, both individually and in combination, on learning. These activities are embedded within six tutor lessons on advocacy. Using a posttest-only randomized control design, we compare the performance of 234 tutors (790 lesson completions) across three conditions: MCQ only, open response only, and a combination of both. We find no significant learning differences across conditions at posttest, but tutors in the MCQ condition took significantly less time to complete instruction. These findings suggest that MCQs are as effective as, and more efficient than, open response tasks for learning when practice time is limited. To further enhance efficiency, we autograded open responses using GPT-4o and GPT-4-turbo. GPT models demonstrate proficiency for purposes of low-stakes assessment, though further research is needed for broader use. This study contributes a dataset of lesson log data, human annotation rubrics, and LLM prompts to promote transparency and reproducibility.

arXiv information

Authors	Danielle R. Thomas, Conrad Borchers, Sanjit Kakarla, Jionghao Lin, Shambhavi Bhushan, Boyuan Guo, Erin Gatz, Kenneth R. Koedinger
Published	2024-12-13 16:37:20+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
