Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts

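Summary

Title: Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts

Summary:
– We propose Musketeer, a sequence-to-sequence vision-language model whose parameters are jointly trained on all tasks and fully shared among them.
– A novel feature, the Task Explanation Prompt (TEP), enables knowledge to be integrated across heterogeneous tasks; by reducing interference among tasks, it lets the model focus on their shared structure.
– With a single model, Musketeer achieves results comparable to or better than strong single-task baselines, almost uniformly across multiple tasks.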

Summary (original)

We present a sequence-to-sequence vision-language model whose parameters are jointly trained on all tasks (all for one) and fully shared among multiple tasks (one for all), resulting in a single model which we named Musketeer. The integration of knowledge across heterogeneous tasks is enabled by a novel feature called Task Explanation Prompt (TEP). TEP reduces interference among tasks, allowing the model to focus on their shared structure. With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.

arXiv information

Authors	Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto
Published	2023-05-11 17:57:49+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
