Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022

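Summary

This report presents our approach to the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. We first parse each sentence into semantic roles corresponding to its verbs and nouns.
We then use self-attention to exploit semantic-role-contextualized video features together with textual features, trained with triplet losses in multiple embedding spaces.
Our method outperforms the strong baseline on normalized Discounted Cumulative Gain (nDCG), the metric that better reflects semantic similarity.
Our submission ranked 3rd for nDCG and 4th for mAP.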
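
As a rough illustration of what a triplet loss summed over multiple embedding spaces can look like, here is a minimal sketch. It is not the authors' implementation: the cosine similarity, the margin of 0.2, the unweighted sum, and the space names ("verb", "noun") are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: the positive should be closer to the anchor than the negative by `margin`.

    The margin value is an assumed hyperparameter, not taken from the report.
    """
    return max(0.0, margin + cosine(anchor, negative) - cosine(anchor, positive))

def multi_space_loss(spaces, margin=0.2):
    """Sum the triplet loss over several embedding spaces.

    `spaces` maps a hypothetical space name (e.g. "verb", "noun") to an
    (anchor, positive, negative) tuple of embeddings in that space.
    """
    return sum(triplet_loss(a, p, n, margin) for a, p, n in spaces.values())
```

In this sketch each space contributes equally; a real system might weight the spaces differently or mine hard negatives, neither of which this page specifies.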

Abstract (original)

In this report, we present our approach for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. We first parse sentences into semantic roles corresponding to verbs and nouns; then utilize self-attentions to exploit semantic role contextualized video features along with textual features via triplet losses in multiple embedding spaces. Our method overpasses the strong baseline in normalized Discounted Cumulative Gain (nDCG), which is more valuable for semantic similarity. Our submission is ranked 3rd for nDCG and ranked 4th for mAP.
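
For readers unfamiliar with nDCG, the following is a minimal sketch of the generic textbook form of the metric with a log2 rank discount. How the challenge assigns graded relevance to each caption-video pair is not stated on this page, so the relevance values passed in here are purely hypothetical inputs.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of graded relevances listed in ranked order."""
    # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```

Because the relevance is graded rather than binary, a ranking that places partially relevant items near the top still earns credit, which is why the metric is often described as reflecting semantic similarity more closely than mAP.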

arXiv information

Authors	Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
Published	2023-09-26 14:27:30+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
