Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues

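Abstract

Vision-Language Tracking (VLT) aims to localize a target in a video sequence using a visual template and a language description.
Textual cues can improve tracking, but current datasets typically contain far more image data than text, which limits how well VLT methods can align the two modalities.
To address this imbalance, the authors propose CTVLT, a new plug-and-play method that leverages the strong text-image alignment capabilities of foundation grounding models.
CTVLT converts textual cues into interpretable visual heatmaps, which are easier for trackers to process.
Specifically, a textual cue mapping module transforms the textual cues into target distribution heatmaps that visually represent the location described by the text.
In addition, a heatmap guidance module fuses these heatmaps with the search image to guide tracking more effectively.
Extensive experiments on mainstream benchmarks demonstrate the effectiveness of the approach, which achieves state-of-the-art performance and validates the method's usefulness for enhanced VLT.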
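
The abstract does not say how the heatmap is computed or how it is fused with the search image, so the following is only a minimal sketch of the general idea, not CTVLT's actual modules. It assumes, hypothetically, that a grounding model has already returned a bounding box for the text description; the box is rendered as a Gaussian heatmap and then used to reweight a single-channel search feature map. The function names `box_to_heatmap` and `fuse`, the parameters `sigma_scale` and `alpha`, and the multiplicative fusion rule are all illustrative assumptions.

```python
import math


def box_to_heatmap(box, width, height, sigma_scale=0.5):
    """Render a box (x0, y0, x1, y1) as a 2D Gaussian heatmap.

    Assumption: the box comes from some grounding model; the peak value
    is 1.0 at the box centre and the spread follows the box size.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    sx = max((x1 - x0) * sigma_scale, 1e-6)  # avoid division by zero
    sy = max((y1 - y0) * sigma_scale, 1e-6)
    return [
        [
            math.exp(-0.5 * (((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def fuse(search_features, heatmap, alpha=1.0):
    """Reweight a single-channel feature map as f * (1 + alpha * h).

    Assumption: a simple multiplicative boost stands in for whatever
    fusion the heatmap guidance module really performs.
    """
    return [
        [f * (1.0 + alpha * h) for f, h in zip(f_row, h_row)]
        for f_row, h_row in zip(search_features, heatmap)
    ]


# Hypothetical 8x6 search region with a text-grounded box centred at (4, 3).
heatmap = box_to_heatmap((2, 2, 6, 4), width=8, height=6)
features = [[1.0] * 8 for _ in range(6)]
fused = fuse(features, heatmap)
```

In this toy setup the response at the box centre is doubled while locations far from the box stay close to their original value, which is the sense in which a heatmap can "guide" a tracker toward the region the text describes.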

Abstract (original)

Vision-Language Tracking (VLT) aims to localize a target in video sequences using a visual template and language description. While textual cues enhance tracking potential, current datasets typically contain much more image data than text, limiting the ability of VLT methods to align the two modalities effectively. To address this imbalance, we propose a novel plug-and-play method named CTVLT that leverages the strong text-image alignment capabilities of foundation grounding models. CTVLT converts textual cues into interpretable visual heatmaps, which are easier for trackers to process. Specifically, we design a textual cue mapping module that transforms textual cues into target distribution heatmaps, visually representing the location described by the text. Additionally, the heatmap guidance module fuses these heatmaps with the search image to guide tracking more effectively. Extensive experiments on mainstream benchmarks demonstrate the effectiveness of our approach, achieving state-of-the-art performance and validating the utility of our method for enhanced VLT.

arXiv information

Authors	X. Feng, D. Zhang, S. Hu, X. Li, M. Wu, J. Zhang, X. Chen, K. Huang
Published	2024-12-27 13:54:32+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
