SemEval-2017 Task 4: Sentiment Analysis in Twitter using BERT

Abstract

This paper uses BERT, a transformer-based architecture, to solve Task 4A (English), Sentiment Analysis in Twitter, of SemEval-2017.
BERT is a powerful large language model for classification tasks when the amount of training data is small.
For this experiment, we used the BERT-Base model, which has 12 hidden layers.
This model achieves better accuracy, precision, recall, and F1 score than the Naive Bayes baseline model.
It performs better on the binary classification subtasks than on the multi-class classification subtasks.
Because Twitter data contains personal and sensitive information, we also considered the ethical issues involved in this experiment.
The dataset and code used in our experiment are available in our GitHub repository.
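The abstract compares models on accuracy, precision, recall, and F1 score. As a rough illustration of how these four numbers can be computed for a multi-class sentiment task, here is a minimal sketch in plain Python. The three-label set, the macro averaging, the function name `macro_report`, and the toy labels are all assumptions made for this example; none of them is taken from the paper or its repository.

```python
from collections import Counter

# Assumed label set for illustration; not taken from the paper.
LABELS = ("negative", "neutral", "positive")


def macro_report(gold, predicted, labels=LABELS):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # label p was predicted, but it was wrong
            fn[g] += 1  # gold label g was missed

    def safe_div(num, den):
        # Avoid ZeroDivisionError for labels that never occur.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, invented for illustration only.
gold = ["positive", "negative", "neutral", "positive", "neutral", "negative"]
pred = ["positive", "neutral", "neutral", "positive", "negative", "negative"]
report = macro_report(gold, pred)
print(report)
```

Macro averaging weights every class equally, which matters when one sentiment class dominates the data. A micro or weighted average would be an equally valid choice under different assumptions.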

arXiv information

Authors	Rupak Kumar Das, Dr. Ted Pedersen
Published	2024-01-15 20:17:31+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
