Double-Stage Feature-Level Clustering-Based Mixture of Experts Framework

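Summary

The Mixture-of-Experts (MoE) model has been successful in deep learning (DL).
However, its architecture is complex, and its advantages over dense models in image classification remain unclear.
In previous studies, MoE performance has often been degraded by noise and outliers in the input space.
Some approaches incorporate input clustering when training MoE models, but most clustering algorithms have no access to labeled data, which limits their effectiveness.
This paper introduces the Double-stage Feature-level Clustering and Pseudo-labeling-based Mixture of Experts (DFCP-MoE) framework, which consists of input feature extraction, feature-level clustering, and a computationally efficient pseudo-labeling strategy.
The approach reduces the impact of noise and outliers while using a small subset of labeled data to label a large portion of the unlabeled inputs.
We propose a conditional end-to-end joint training method that improves expert specialization by training the MoE model on well-labeled, clustered inputs.
Unlike traditional MoE and dense models, the DFCP-MoE framework effectively captures the diversity of the input space, leading to competitive inference results.
We validate the approach on three benchmark datasets for multi-class classification tasks.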

Abstract (original)

The Mixture-of-Experts (MoE) model has succeeded in deep learning (DL). However, its complex architecture and advantages over dense models in image classification remain unclear. In previous studies, MoE performance has often been affected by noise and outliers in the input space. Some approaches incorporate input clustering for training MoE models, but most clustering algorithms lack access to labeled data, limiting their effectiveness. This paper introduces the Double-stage Feature-level Clustering and Pseudo-labeling-based Mixture of Experts (DFCP-MoE) framework, which consists of input feature extraction, feature-level clustering, and a computationally efficient pseudo-labeling strategy. This approach reduces the impact of noise and outliers while leveraging a small subset of labeled data to label a large portion of unlabeled inputs. We propose a conditional end-to-end joint training method that improves expert specialization by training the MoE model on well-labeled, clustered inputs. Unlike traditional MoE and dense models, the DFCP-MoE framework effectively captures input space diversity, leading to competitive inference results. We validate our approach on three benchmark datasets for multi-class classification tasks.
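
The abstract names feature-level clustering and pseudo-labeling from a small labeled subset but does not say which clustering algorithm or labeling rule is used. The following is a minimal sketch of the general idea only, not the authors' method: it assumes plain k-means on already-extracted feature vectors and a majority vote of the labeled points in each cluster. The function names (`kmeans`, `pseudo_label`), the toy 2-D features, and the class names are all hypothetical.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def kmeans(features, init_centroids, n_iter=10):
    """Plain Lloyd's k-means (an assumption; the abstract names no algorithm)."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(n_iter):
        assign = [nearest(f, centroids) for f in features]
        for k in range(len(centroids)):
            members = [f for f, a in zip(features, assign) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    # Final assignment against the final centroids.
    return [nearest(f, centroids) for f in features], centroids


def pseudo_label(assign, labeled):
    """Give each point the majority label of the labeled points in its cluster.

    `labeled` maps point index -> known class label. Points in clusters
    without any labeled member stay unlabeled (None).
    """
    votes = {}
    for idx, label in labeled.items():
        votes.setdefault(assign[idx], Counter())[label] += 1
    majority = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    return [majority.get(a) for a in assign]


# Toy stand-in for extracted features: two well-separated groups.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labeled = {0: "cat", 3: "dog"}  # only two of six points carry a label

assign, centroids = kmeans(features, init_centroids=[features[0], features[3]])
labels = pseudo_label(assign, labeled)
print(labels)
```

In this toy setup, two labeled points are enough to label all six inputs, because each cluster contains one labeled member. In a full pipeline the pseudo-labeled clusters would then be used to train the experts; that stage is not sketched here.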

arXiv information

Authors	Bakary Badjie, José Cecílio, António Casimiro
Published	2025-03-12 16:13:50+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
