Open-Vocabulary Federated Learning with Multimodal Prototyping

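Summary

Existing federated learning (FL) studies usually assume that the training and test label spaces are identical.
In real-world applications, however, this assumption rarely holds.
A new user may issue queries involving data from unseen classes, and such open-vocabulary queries cause FL systems built on that assumption to fail.
In this work, we therefore focus explicitly on the under-explored open-vocabulary challenge in FL.
That is, for a new user, the global server must understand queries that involve arbitrary unknown classes.
To address this problem, we leverage pre-trained vision-language models (VLMs).
Specifically, we present Federated Multimodal Prototyping (Fed-MP), a novel adaptation framework tailored to VLMs in the FL setting.
Fed-MP adaptively aggregates local model weights based on lightweight client residuals and makes predictions with a novel multimodal prototyping mechanism.
Fed-MP exploits the knowledge learned from seen classes and makes the adapted VLM more robust on unseen categories.
Our empirical evaluation on various datasets validates the effectiveness of Fed-MP.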
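
The abstract names two ingredients, adaptive aggregation of client weights and prototype-based prediction, without giving their exact form. The following is only a minimal sketch of what such steps could look like, not the authors' algorithm. Every name here (`aggregate`, `predict`, `client_scores`, `mix`) is hypothetical, and the softmax weighting and the cosine-similarity blend are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(client_weights, client_scores):
    """Weighted average of client weight vectors.

    ASSUMPTION: client_scores are hypothetical relevance scores (higher means
    the client is more relevant to the new user's query). The paper derives
    its weighting from client residuals; that rule is not reproduced here.
    """
    alphas = softmax(client_scores)
    dim = len(client_weights[0])
    return [sum(a * w[i] for a, w in zip(alphas, client_weights))
            for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def predict(query, text_protos, visual_protos, mix=0.5):
    """Pick the class whose prototypes are closest to the query embedding.

    ASSUMPTION: each class has a text prototype (e.g. an embedded class name);
    only some classes also have a visual prototype. When both exist, the two
    similarities are blended with the hypothetical factor `mix`.
    """
    scores = {}
    for name, text_vec in text_protos.items():
        score = cosine(query, text_vec)
        visual_vec = visual_protos.get(name)
        if visual_vec is not None:
            score = (1 - mix) * score + mix * cosine(query, visual_vec)
        scores[name] = score
    return max(scores, key=scores.get)


# Two clients with equal scores contribute equally to the aggregate.
global_weights = aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# "dog" has no visual prototype here, so it is scored from text alone.
text_protos = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
visual_protos = {"cat": [0.9, 0.1]}
label = predict([0.1, 1.0], text_protos, visual_protos)
```

In this toy setup a class without a visual prototype falls back to its text prototype, which is one simple way an open-vocabulary query could still be scored. The embeddings are hand-written two-dimensional vectors; in practice they would come from a VLM's image and text encoders.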

Abstract (original)

Existing federated learning (FL) studies usually assume the training label space and test label space are identical. However, in real-world applications, this assumption is too ideal to be true. A new user could come up with queries that involve data from unseen classes, and such open-vocabulary queries would directly defect such FL systems. Therefore, in this work, we explicitly focus on the under-explored open-vocabulary challenge in FL. That is, for a new user, the global server shall understand her/his query that involves arbitrary unknown classes. To address this problem, we leverage the pre-trained vision-language models (VLMs). In particular, we present a novel adaptation framework tailored for VLMs in the context of FL, named as Federated Multimodal Prototyping (Fed-MP). Fed-MP adaptively aggregates the local model weights based on light-weight client residuals, and makes predictions based on a novel multimodal prototyping mechanism. Fed-MP exploits the knowledge learned from the seen classes, and robustifies the adapted VLM to unseen categories. Our empirical evaluation on various datasets validates the effectiveness of Fed-MP.

arXiv information

Authors	Huimin Zeng, Zhenrui Yue, Dong Wang
Published	2024-04-02 15:03:33+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
