AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context

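Abstract

As our understanding of autism and ableism continues to grow, so does our understanding of ableist language directed at autistic people.
Because such language is subtle and context-dependent, it poses a significant challenge for NLP research.
Even so, the detection of anti-autistic ableist language remains underexplored, since existing NLP tools often fail to capture its nuanced expressions.
We present AUTALIC, the first benchmark dataset dedicated to detecting anti-autistic ableist language in context, addressing a significant gap in the field.
The dataset consists of 2,400 autism-related sentences collected from Reddit, each accompanied by its surrounding context, and is annotated by trained experts with backgrounds in neurodiversity.
Our comprehensive evaluation shows that current language models, including state-of-the-art LLMs, struggle to reliably identify anti-autistic ableism and to align with human judgments, which underscores their limitations in this domain.
We publicly release AUTALIC together with the individual annotations, which serve as a valuable resource for researchers working on ableism and neurodiversity and for those studying disagreement in annotation tasks.
This dataset is a crucial step toward developing more inclusive, context-aware NLP systems that better reflect diverse perspectives.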

Abstract (original)

As our understanding of autism and ableism continues to increase, so does our understanding of ableist language towards autistic people. Such language poses a significant challenge in NLP research due to its subtle and context-dependent nature. Yet, detecting anti-autistic ableist language remains underexplored, with existing NLP tools often failing to capture its nuanced expressions. We present AUTALIC, the first benchmark dataset dedicated to the detection of anti-autistic ableist language in context, addressing a significant gap in the field. The dataset comprises 2,400 autism-related sentences collected from Reddit, accompanied by surrounding context, and is annotated by trained experts with backgrounds in neurodiversity. Our comprehensive evaluation reveals that current language models, including state-of-the-art LLMs, struggle to reliably identify anti-autistic ableism and align with human judgments, underscoring their limitations in this domain. We publicly release AUTALIC along with the individual annotations which serve as a valuable resource to researchers working on ableism, neurodiversity, and also studying disagreements in annotation tasks. This dataset serves as a crucial step towards developing more inclusive and context-aware NLP systems that better reflect diverse perspectives.

arXiv information

Authors	Naba Rizvi, Harper Strickland, Daniel Gitelman, Tristan Cooper, Alexis Morales-Flores, Michael Golden, Aekta Kallepalli, Akshat Alurkar, Haaset Owens, Saleha Ahmedi, Isha Khirwadkar, Imani Munyaka, Nedjma Ousidhoum
Published	2025-04-08 17:08:26+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
