Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models

Abstract

Sentiment analysis is one of the most crucial tasks in Natural Language Processing (NLP), involving the training of machine learning models to classify text based on the polarity of opinions. Pre-trained Language Models (PLMs) can be applied to downstream tasks through fine-tuning, eliminating the need to train a model from scratch. Specifically, PLMs have been employed for sentiment analysis, a process that involves detecting, analyzing, and extracting the polarity of sentiment expressed in text. Numerous models have been proposed to address this task, with the pre-trained PhoBERT-V2 models standing out as the state-of-the-art language models for Vietnamese. The PhoBERT-V2 pre-training approach is based on RoBERTa, which optimizes the BERT pre-training method for more robust performance. In this paper, we introduce a novel approach that combines PhoBERT-V2 and SentiWordNet for sentiment analysis of Vietnamese reviews. Our proposed model uses PhoBERT-V2 as its Vietnamese language model and leverages SentiWordNet, a lexical resource explicitly designed to support sentiment classification applications. Experimental results on the VLSP 2016 and AIVIVN 2019 datasets demonstrate that our sentiment analysis system achieves excellent performance in comparison to other models.
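
The abstract does not say how the language model and the lexicon are combined. As a rough illustration only, one common way to pair a sentence encoder with a sentiment lexicon is to append lexicon-derived polarity scores to the encoder's output vector before classification. The sketch below assumes that setup; `TOY_LEXICON`, `lexicon_features`, `combine`, and `dummy_encode` are hypothetical names, the scores are made up, and `dummy_encode` is a stand-in for a real PhoBERT-V2 embedding. It is not the authors' method.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon: word -> (positive score, negative score).
# The values are invented for illustration and are not real SentiWordNet entries.
TOY_LEXICON: Dict[str, Tuple[float, float]] = {
    "tốt": (0.75, 0.0),  # "good"
    "tệ": (0.0, 0.75),   # "bad"
    "đẹp": (0.5, 0.0),   # "beautiful"
}


def lexicon_features(
    tokens: List[str], lexicon: Dict[str, Tuple[float, float]]
) -> List[float]:
    """Return [mean positive score, mean negative score, lexicon coverage]."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        # No token is in the lexicon (or the input is empty).
        return [0.0, 0.0, 0.0]
    mean_pos = sum(p for p, _ in hits) / len(hits)
    mean_neg = sum(n for _, n in hits) / len(hits)
    coverage = len(hits) / len(tokens)
    return [mean_pos, mean_neg, coverage]


def combine(
    tokens: List[str],
    encode: Callable[[List[str]], List[float]],
    lexicon: Dict[str, Tuple[float, float]],
) -> List[float]:
    """Concatenate an encoder vector with lexicon features (assumed fusion scheme)."""
    return list(encode(tokens)) + lexicon_features(tokens, lexicon)


def dummy_encode(tokens: List[str]) -> List[float]:
    """Stand-in for a pre-trained encoder; returns a fixed-size placeholder vector."""
    return [float(len(tokens)), 0.0]


if __name__ == "__main__":
    # "sản_phẩm rất tốt" is roughly "the product is very good".
    features = combine(["sản_phẩm", "rất", "tốt"], dummy_encode, TOY_LEXICON)
    print(features)
```

In a real system the combined vector would feed a classifier head, and the placeholder encoder would be replaced by the pooled output of the fine-tuned language model.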

arXiv information

Authors	Hong-Viet Tran, Van-Tan Bui, Lam-Quan Tran
Published	2025-01-15 12:22:37+00:00
arXiv page	arxiv_id(pdf)
