Language Varieties of Italy: Technology Challenges and Opportunities


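The NLP community has recently begun to engage with endangered languages, including those of Italy.
This paper introduces the linguistic context of Italy and challenges NLP's default machine-centric assumptions about Italy's language varieties.
We advocate a paradigm shift from machine-centric to speaker-centric NLP, and offer recommendations and opportunities for work that prioritizes languages and their speakers over technological advances.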


Italy is characterized by a one-of-a-kind linguistic diversity landscape in Europe, which implicitly encodes the local knowledge, cultural traditions, artistic expressions and history of its speakers. However, most local languages and dialects in Italy are at risk of disappearing within a few generations. The NLP community has recently begun to engage with endangered languages, including those of Italy. Yet, most efforts assume that these varieties are under-resourced language monoliths with an established written form and homogeneous functions and needs, and thus highly interchangeable with each other and with high-resource, standardized languages. In this paper, we introduce the linguistic context of Italy and challenge the default machine-centric assumptions of NLP for Italy’s language varieties. We advocate for a paradigm shift from machine-centric to speaker-centric NLP, and provide recommendations and opportunities for work that prioritizes languages and their speakers over technological advances. To facilitate the process, we finally propose building a local community towards responsible, participatory efforts aimed at supporting the vitality of the languages and dialects of Italy.


Author: Alan Ramponi
Published: 2023-11-20 16:54:55+00:00


Category: cs.CL