Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

Abstract

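Controlling the behavior of a strong model pre-trained on internet-scale data can be difficult because competent supervisors are scarce.
Recent studies show that, despite supervisory noise, a strong student model can surpass its weak teacher when fine-tuned on specific objectives.
However, the effectiveness of such weak-to-strong generalization remains limited, especially when the capability gap is large.
In this paper, we propose to address this challenge by using a diverse set of specialized teachers, rather than a single generalist teacher, to collectively supervise the strong student.
Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision: (i) we progressively alternate student training and teacher assignment, leveraging the growth of the strong student to identify plausible supervision;
(ii) we conservatively enforce teacher-student and local-global consistency, leveraging their dependencies to reject potential annotation noise.
We validate the proposed method on visual recognition tasks from the OpenAI weak-to-strong benchmark and additional multi-domain datasets.
The code is available at \url{https://github.com/yuejiangliu/csl}.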

Abstract (original)

Steering the behavior of a strong model pre-trained on internet-scale data can be difficult due to the scarcity of competent supervisors. Recent studies reveal that, despite supervisory noises, a strong student model may surpass its weak teacher when fine-tuned on specific objectives. Yet, the effectiveness of such weak-to-strong generalization remains limited, especially in the presence of large capability gaps. In this paper, we propose to address this challenge by harnessing a diverse set of specialized teachers, instead of a single generalist one, that collectively supervises the strong student. Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision: (i) we progressively alternate student training and teacher assignment, leveraging the growth of the strong student to identify plausible supervisions; (ii) we conservatively enforce teacher-student and local-global consistency, leveraging their dependencies to reject potential annotation noises. We validate the proposed method through visual recognition tasks on the OpenAI weak-to-strong benchmark and additional multi-domain datasets. Our code is available at \url{https://github.com/yuejiangliu/csl}.
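The two components named in the abstract can be illustrated with a toy sketch of a single round. This is not the authors' implementation (see the linked repository for that). The function names, the probability-based assignment rule, and the `min_prob` threshold are all assumptions made for illustration. Only teacher-student consistency is shown; local-global consistency and the actual student retraining step are omitted.

```python
from typing import List, Sequence, Tuple


def assign_teachers(
    student_probs: Sequence[Sequence[float]],
    teacher_labels: Sequence[Sequence[int]],
) -> List[Tuple[int, int]]:
    """Hypothetical assignment rule: for each sample, pick the teacher whose
    proposed label the current student considers most plausible.

    student_probs[i][c] is the student's probability of class c for sample i.
    teacher_labels[t][i] is the label proposed by teacher t for sample i.
    Returns one (teacher_index, label) pair per sample.
    """
    assignments = []
    for i, probs in enumerate(student_probs):
        best_teacher = max(
            range(len(teacher_labels)),
            key=lambda t: probs[teacher_labels[t][i]],
        )
        assignments.append((best_teacher, teacher_labels[best_teacher][i]))
    return assignments


def filter_consistent(
    student_probs: Sequence[Sequence[float]],
    assignments: Sequence[Tuple[int, int]],
    min_prob: float = 0.5,
) -> List[Tuple[int, int]]:
    """Hypothetical conservative filter: keep a sample only if the student
    assigns at least `min_prob` to the label of its assigned teacher.

    Returns (sample_index, label) pairs that survive the filter.
    """
    kept = []
    for i, (_, label) in enumerate(assignments):
        if student_probs[i][label] >= min_prob:
            kept.append((i, label))
    return kept


# Toy data: 3 samples, 3 classes, 2 specialized teachers (assumed values).
student_probs = [
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.40, 0.35, 0.25],
]
teacher_labels = [
    [0, 1, 1],  # labels proposed by teacher 0
    [2, 2, 0],  # labels proposed by teacher 1
]

assignments = assign_teachers(student_probs, teacher_labels)
training_set = filter_consistent(student_probs, assignments)
# In a full loop, the student would be retrained on `training_set`, and the
# assignment and filtering steps would be repeated with the updated student.
```

In this toy round, the third sample is discarded because the student is not confident in either teacher's label, which is the sense in which the filter is conservative.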

arXiv information

Authors	Yuejiang Liu, Alexandre Alahi
Published	2024-02-23 18:56:11+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
