MinorBench: A hand-built benchmark for content-based risks for children

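Summary

Large language models (LLMs) are rapidly entering children's lives through parent-driven adoption, schools, and peer networks, yet current AI ethics and safety research does not adequately address the content-related risks specific to minors.
In this paper, we highlight these gaps with a real-world case study of an LLM-based chatbot deployed in a middle school setting, showing how students used, and sometimes misused, the system.
Building on these findings, we propose a new taxonomy of content-based risks for minors and introduce MinorBench, an open-source benchmark designed to evaluate LLMs on their ability to refuse unsafe or inappropriate queries from children.
We evaluate six prominent LLMs under different system prompts and find substantial variability in their child-safety compliance.
Our results inform practical steps toward more robust, child-focused safety mechanisms and underscore the urgency of tailoring AI systems to safeguard young users.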

Summary (original)

Large Language Models (LLMs) are rapidly entering children’s lives – through parent-driven adoption, schools, and peer networks – yet current AI ethics and safety research do not adequately address content-related risks specific to minors. In this paper, we highlight these gaps with a real-world case study of an LLM-based chatbot deployed in a middle school setting, revealing how students used and sometimes misused the system. Building on these findings, we propose a new taxonomy of content-based risks for minors and introduce MinorBench, an open-source benchmark designed to evaluate LLMs on their ability to refuse unsafe or inappropriate queries from children. We evaluate six prominent LLMs under different system prompts, demonstrating substantial variability in their child-safety compliance. Our results inform practical steps for more robust, child-focused safety mechanisms and underscore the urgency of tailoring AI systems to safeguard young users.

arXiv information

Authors	Shaun Khoo, Gabriel Chua, Rachel Shong
Published	2025-03-13 10:34:43+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
