Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning

Summary

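Certifying the IID generalisation ability of deep networks is the first of many requirements for trusting AI in high-stakes applications ranging from medicine to security.
However, when instantiating generalisation bounds for deep networks, it remains difficult to obtain non-vacuous guarantees, especially when applying modern large models to the small-scale data common in such high-stakes fields.
This paper draws a novel connection between a family of learning methods based on model fusion and generalisation certificates, and shows that, surprisingly, with minor adjustments several existing learning strategies already provide non-trivial generalisation guarantees.
Essentially, by focusing on data-driven learning of downstream tasks through fusion rather than fine-tuning, the certified generalisation gap becomes tiny and independent of the base network size, which facilitates certification.
Our results show, for the first time, non-trivial generalisation guarantees for learning from as few as 100 examples while using vision models such as ViT-B and language models such as Mistral-7B.
This observation is significant because it has immediate implications for facilitating the certification of existing systems as trustworthy, and it opens new research directions at the intersection of practice and theory.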

Abstract (original)

Certifying the IID generalisation ability of deep networks is the first of many requirements for trusting AI in high-stakes applications from medicine to security. However, when instantiating generalisation bounds for deep networks, it remains challenging to obtain non-vacuous guarantees, especially when applying contemporary large models to the small-scale data prevalent in such high-stakes fields. In this paper, we draw a novel connection between a family of learning methods based on model fusion and generalisation certificates, and surprisingly show that with minor adjustment several existing learning strategies already provide non-trivial generalisation guarantees. Essentially, by focusing on data-driven learning of downstream tasks by fusion rather than fine-tuning, the certified generalisation gap becomes tiny and independent of the base network size, facilitating its certification. Our results show for the first time non-trivial generalisation guarantees for learning with as few as 100 examples, while using vision models such as ViT-B and language models such as Mistral-7B. This observation is significant as it has immediate implications for facilitating the certification of existing systems as trustworthy, and opens up new directions for research at the intersection of practice and theory.
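
To give a feel for why learning only a handful of fusion coefficients can make a bound independent of the base network size, here is a minimal sketch. It assumes a classic finite-hypothesis (Occam) bound, which is not necessarily the bound used in the paper: with probability at least 1 - delta, the gap between true and empirical error is at most sqrt((ln|H| + ln(1/delta)) / (2n)). The function name `occam_gap`, the choice of 4 merging coefficients at 8 bits each, and the rough 86M parameter count for a ViT-B-sized network are illustrative assumptions, not values taken from the paper.

```python
import math


def occam_gap(num_params: int, bits_per_param: int, n: int, delta: float = 0.05) -> float:
    """Upper bound on the generalisation gap for a finite hypothesis class.

    Assumes each learned parameter is quantised to `bits_per_param` bits,
    so |H| = 2 ** (num_params * bits_per_param). Hypothetical helper for
    illustration only.
    """
    log_h = num_params * bits_per_param * math.log(2)  # ln|H|
    return math.sqrt((log_h + math.log(1.0 / delta)) / (2.0 * n))


# Assumed setting: only 4 merging coefficients are learned from 100 examples.
merge_gap = occam_gap(num_params=4, bits_per_param=8, n=100)

# Assumed setting: all ~86M weights of a ViT-B-sized model are fine-tuned.
finetune_gap = occam_gap(num_params=86_000_000, bits_per_param=32, n=100)

print(f"merging gap bound:     {merge_gap:.3f}")    # about 0.35, below 1
print(f"fine-tuning gap bound: {finetune_gap:.1f}")  # far above 1, so vacuous
```

Under these assumptions the bound depends only on the number of learned coefficients and the sample size, so swapping in a larger base model leaves the merging bound unchanged, while the fine-tuning bound grows with the parameter count.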

arXiv information

Authors	Taehoon Kim, Henry Gouk, Minyoung Kim, Timothy Hospedales
Published	2025-05-21 17:51:05+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
