Multimodal Auto Validation For Self-Refinement in Web Agents

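Summary

As the world becomes increasingly digital, web agents that can automate complex and monotonous tasks are becoming essential for streamlining workflows.
This paper introduces an approach to improving web agent performance through multimodal validation and self-refinement.
Building on the state-of-the-art Agent-E web automation framework, it presents a comprehensive study of how different modalities (text, vision) and hierarchy affect the automatic validation of web agents.
It also introduces a self-refinement mechanism for web automation that uses the developed auto-validator, enabling web agents to detect workflow failures and correct them on their own.
The results show a significant improvement over the previous state-of-the-art performance of Agent-E (a SOTA web agent), raising the task-completion rate from 76.2% to 81.24% on a subset of the WebVoyager benchmark.
The approach presented in the paper paves the way for more reliable digital assistants in complex, real-world scenarios.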

Summary (original)

As our world digitizes, web agents that can automate complex and monotonous tasks are becoming essential in streamlining workflows. This paper introduces an approach to improving web agent performance through multi-modal validation and self-refinement. We present a comprehensive study of different modalities (text, vision) and the effect of hierarchy for the automatic validation of web agents, building upon the state-of-the-art Agent-E web automation framework. We also introduce a self-refinement mechanism for web automation, using the developed auto-validator, that enables web agents to detect and self-correct workflow failures. Our results show significant gains on Agent-E’s (a SOTA web agent) prior state-of-the-art performance, boosting task-completion rates from 76.2% to 81.24% on a subset of the WebVoyager benchmark. The approach presented in this paper paves the way for more reliable digital assistants in complex, real-world scenarios.
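
The validate-then-retry idea in the abstract can be illustrated with a minimal sketch. This is not the paper's or Agent-E's implementation: `Verdict`, `run_with_self_refinement`, `toy_agent`, and `toy_validator` are hypothetical names, and the toy validator inspects a plain string where a real auto-validator would examine text and/or screenshots of the page.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    """Outcome of one validation pass (hypothetical structure)."""
    success: bool
    feedback: str = ""


# Hypothetical signatures: an agent maps (task, feedback) to a result string,
# and a validator maps (task, result) to a Verdict.
Agent = Callable[[str, Optional[str]], str]
Validator = Callable[[str, str], Verdict]


def run_with_self_refinement(task: str, agent: Agent, validator: Validator,
                             max_attempts: int = 3) -> List[Verdict]:
    """Run the agent, validate the result, and retry with feedback on failure."""
    verdicts: List[Verdict] = []
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        result = agent(task, feedback)
        verdict = validator(task, result)
        verdicts.append(verdict)
        if verdict.success:
            break
        # Pass the validator's explanation into the next attempt.
        feedback = verdict.feedback
    return verdicts


def toy_agent(task: str, feedback: Optional[str]) -> str:
    # Stand-in agent: fails on the first try, succeeds once it gets feedback.
    return "done" if feedback else "incomplete"


def toy_validator(task: str, result: str) -> Verdict:
    # Stand-in validator: a real one would inspect page text or screenshots.
    if result == "done":
        return Verdict(True)
    return Verdict(False, "the workflow did not reach the final step")


history = run_with_self_refinement("book a flight", toy_agent, toy_validator)
print([v.success for v in history])
```

Here the first attempt is rejected and the second, informed by the validator's feedback, is accepted. The `max_attempts` cap is an assumption added so the loop always terminates.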

arXiv information

Authors	Ruhana Azam, Tamer Abuelsaad, Aditya Vempaty, Ashish Jagmohan
Published	2024-10-11 15:42:52+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
