Análise de ambiguidade linguística em modelos de linguagem de grande escala (LLMs)

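Summary

Despite advances in architectures such as Transformers and BERT, linguistic ambiguity remains a major challenge for natural language processing (NLP) systems.
Inspired by the recent success of instructional models such as ChatGPT and Gemini (called Bard in 2023), this study aims to analyze and discuss linguistic ambiguity in these models, focusing on three types common in Brazilian Portuguese: semantic, syntactic, and lexical ambiguity.
A corpus of 120 sentences, both ambiguous and unambiguous, was created for classification, explanation, and disambiguation.
The models' ability to generate ambiguous sentences was also investigated by requesting sets of sentences for each type of ambiguity.
The results were subjected to qualitative analysis drawing on recognized linguistic references and to quantitative evaluation based on the accuracy of the responses obtained.
Even the most sophisticated models, such as ChatGPT and Gemini, were shown to make errors and exhibit deficiencies in their responses, often giving inconsistent explanations.
Furthermore, accuracy peaked at 49.58 percent, indicating the need for descriptive studies for supervised learning.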

Abstract (original)

Linguistic ambiguity continues to represent a significant challenge for natural language processing (NLP) systems, notwithstanding the advancements in architectures such as Transformers and BERT. Inspired by the recent success of instructional models like ChatGPT and Gemini (called Bard in 2023), this study aims to analyze and discuss linguistic ambiguity within these models, focusing on three types prevalent in Brazilian Portuguese: semantic, syntactic, and lexical ambiguity. We create a corpus comprising 120 sentences, both ambiguous and unambiguous, for classification, explanation, and disambiguation. The models' capability to generate ambiguous sentences was also explored by soliciting sets of sentences for each type of ambiguity. The results underwent qualitative analysis, drawing on recognized linguistic references, and quantitative assessment based on the accuracy of the responses obtained. It was evidenced that even the most sophisticated models, such as ChatGPT and Gemini, exhibit errors and deficiencies in their responses, with explanations often proving inconsistent. Furthermore, the accuracy peaked at 49.58 percent, indicating the need for descriptive studies for supervised learning.

arXiv information

Authors	Lavínia de Carvalho Moraes, Irene Cristina Silvério, Rafael Alexandre Sousa Marques, Bianca de Castro Anaia, Dandara Freitas de Paula, Maria Carolina Schincariol de Faria, Iury Cleveston, Alana de Santana Correia, Raquel Meister Ko Freitag
Published	2024-04-25 14:45:07+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
