Dependency Structure Augmented Contextual Scoping Framework for Multimodal Aspect-Based Sentiment Analysis

Summary

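Multimodal Aspect-Based Sentiment Analysis (MABSA) extracts fine-grained information from image-text pairs to identify aspect terms and determine their sentiment polarity.
However, existing approaches often fail to address three core challenges at the same time: Sentiment Cue Perception (SCP), Multimodal Information Misalignment (MIM), and Semantic Noise Elimination (SNE).
To overcome these limitations, we propose DASCO (Dependency Structure Augmented Scoping Framework).
First, we design a multi-task pretraining strategy for MABSA on the base model, combining aspect-oriented enhancement, image-text matching, and aspect-level sentiment-sensitive cognition.
This improves the model's perception of aspect terms and sentiment cues while achieving effective image-text alignment, addressing the SCP and MIM challenges.
Furthermore, we incorporate dependency trees as a syntactic branch alongside a semantic branch, guiding the model to attend selectively to critical contextual elements within a target-specific scope while filtering out irrelevant noise, which addresses the SNE problem.
Extensive experiments on two benchmark datasets across three subtasks show that DASCO achieves state-of-the-art performance in MABSA, with notable gains in JMASA (+3.1% F1 and +5.4% precision on Twitter2015).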
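
As a rough illustration of what a "target-specific scope" over a dependency tree might look like, the sketch below keeps only the tokens that lie within a fixed number of dependency hops of the aspect term. The function name `scope_mask`, the `max_hops` threshold, and the hand-written head indices are all assumptions made for this example; the abstract does not say how DASCO actually computes its scope.

```python
from collections import deque


def scope_mask(heads, aspect_indices, max_hops=2):
    """Return a 0/1 mask of tokens within max_hops dependency edges of the aspect.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    Hypothetical helper for illustration; not the DASCO implementation.
    """
    n = len(heads)
    # Build an undirected adjacency list from the head indices.
    neighbors = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].append(head)
            neighbors[head].append(child)

    # Multi-source breadth-first search starting from every aspect token.
    dist = [None] * n
    queue = deque()
    for i in aspect_indices:
        dist[i] = 0
        queue.append(i)
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if dist[nxt] is None:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    return [1 if d is not None and d <= max_hops else 0 for d in dist]


# Toy sentence with a hand-written (assumed) dependency parse.
tokens = ["The", "pizza", "was", "great", "but", "the", "service", "was", "slow"]
heads = [1, 3, 3, -1, 8, 6, 8, 8, 3]
mask = scope_mask(heads, aspect_indices=[1], max_hops=1)
in_scope = [t for t, m in zip(tokens, mask) if m]
```

With `max_hops=1`, the aspect "pizza" keeps "The", "pizza", and "great", while the clause about the service falls outside its scope, so the unrelated opinion word "slow" is not attributed to it.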

Abstract (original)

Multimodal Aspect-Based Sentiment Analysis (MABSA) seeks to extract fine-grained information from image-text pairs to identify aspect terms and determine their sentiment polarity. However, existing approaches often fall short in simultaneously addressing three core challenges: Sentiment Cue Perception (SCP), Multimodal Information Misalignment (MIM), and Semantic Noise Elimination (SNE). To overcome these limitations, we propose DASCO (Dependency Structure Augmented Scoping Framework), a fine-grained scope-oriented framework that enhances aspect-level sentiment reasoning by leveraging dependency parsing trees. First, we designed a multi-task pretraining strategy for MABSA on our base model, combining aspect-oriented enhancement, image-text matching, and aspect-level sentiment-sensitive cognition. This improved the model’s perception of aspect terms and sentiment cues while achieving effective image-text alignment, addressing key challenges like SCP and MIM. Furthermore, we incorporate dependency trees as syntactic branch combining with semantic branch, guiding the model to selectively attend to critical contextual elements within a target-specific scope while effectively filtering out irrelevant noise for addressing SNE problem. Extensive experiments on two benchmark datasets across three subtasks demonstrate that DASCO achieves state-of-the-art performance in MABSA, with notable gains in JMASA (+3.1% F1 and +5.4% precision on Twitter2015).

arXiv information

Authors	Hao Liu, Lijun He, Jiaxi Liang, Zhihan Ren, Fan Li
Published	2025-04-15 16:05:09+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
