G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios

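Summary

Modern information querying systems are gradually incorporating multimodal inputs such as vision and audio.
However, the integration of gaze, a modality closely tied to user intent and increasingly accessible through gaze-tracking wearables, has not yet been studied in depth.
This paper introduces G-VOILA, a new gaze-facilitated information querying paradigm that combines the user's gaze, visual field, and voice-based natural language queries to support a more intuitive querying process.
A user-enactment study with 21 participants in 3 daily scenarios (p = 21, scene = 3) revealed ambiguity in users' query language and a gaze-voice coordination pattern in their natural query behavior with G-VOILA.
Based on the quantitative and qualitative findings, the authors developed a design framework for the G-VOILA paradigm that effectively integrates gaze data with the in-situ querying context.
They then implemented a G-VOILA proof of concept using state-of-the-art deep learning techniques.
A follow-up user study (p = 16, scene = 2) demonstrated its effectiveness: it achieved higher objective and subjective scores than a baseline without gaze data.
The authors also conducted interviews and offer insights for future gaze-facilitated information querying systems.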

Abstract (original)

Modern information querying systems are progressively incorporating multimodal inputs like vision and audio. However, the integration of gaze — a modality deeply linked to user intent and increasingly accessible via gaze-tracking wearables — remains underexplored. This paper introduces a novel gaze-facilitated information querying paradigm, named G-VOILA, which synergizes users’ gaze, visual field, and voice-based natural language queries to facilitate a more intuitive querying process. In a user-enactment study involving 21 participants in 3 daily scenarios (p = 21, scene = 3), we revealed the ambiguity in users’ query language and a gaze-voice coordination pattern in users’ natural query behaviors with G-VOILA. Based on the quantitative and qualitative findings, we developed a design framework for the G-VOILA paradigm, which effectively integrates the gaze data with the in-situ querying context. Then we implemented a G-VOILA proof-of-concept using cutting-edge deep learning techniques. A follow-up user study (p = 16, scene = 2) demonstrates its effectiveness by achieving both higher objective score and subjective score, compared to a baseline without gaze data. We further conducted interviews and provided insights for future gaze-facilitated information querying systems.

arXiv information

Authors	Zeyu Wang, Yuanchun Shi, Yuntao Wang, Yuchen Yao, Kun Yan, Yuhan Wang, Lei Ji, Xuhai Xu, Chun Yu
Published	2024-05-13 11:24:53+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
