Loss-Curvature Matching for Dataset Selection and Condensation

Summary

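Training neural networks on large datasets incurs substantial computational cost.
Dataset reduction selects or synthesizes data instances from a large dataset while minimizing the loss in generalization performance relative to training on the full dataset.
Existing methods use a neural network during the reduction procedure, so the model parameters become an important factor in preserving performance after reduction.
Building on this importance of the parameters, this paper introduces a new reduction objective, coined LCMat, which matches the loss curvatures of the original and reduced datasets over a region of the model parameter space rather than at a single parameter point.
This objective lets the reduced dataset adapt better across the perturbed parameter region than exact point matching does.
In particular, we identify the worst case of the loss-curvature gap within the local parameter region and derive an implementable upper bound on that worst case through theoretical analysis.
Our experiments on both coreset selection and condensation benchmarks show that LCMat generalizes better than existing baselines.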
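
To make the idea of a worst-case loss-curvature gap over a local parameter region more concrete, here is a minimal toy sketch. It is not the paper's algorithm. It assumes a 1-D linear model, reads "curvature" as sharpness (the largest loss increase within a perturbation radius `rho`), approximates the worst case by grid search instead of the paper's upper bound, and picks a subset by brute force. All function names, the data, and the values of `w` and `rho` are hypothetical.

```python
from itertools import combinations


def mse_loss(w, data):
    """Mean squared error of the toy 1-D linear model y ~ w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def sharpness(w, data, rho, steps=21):
    """Largest loss increase over perturbations |eps| <= rho.

    Assumption: a grid search stands in for the true worst case.
    """
    base = mse_loss(w, data)
    eps_grid = [-rho + 2 * rho * i / (steps - 1) for i in range(steps)]
    return max(mse_loss(w + eps, data) for eps in eps_grid) - base


def curvature_gap(w, full, subset, rho):
    """Absolute difference in sharpness between full and reduced data."""
    return abs(sharpness(w, full, rho) - sharpness(w, subset, rho))


def select_subset(w, full, k, rho):
    """Brute-force search for the size-k subset with the smallest gap."""
    return min(
        combinations(full, k),
        key=lambda s: curvature_gap(w, full, list(s), rho),
    )


# Hypothetical toy data: (x, y) pairs roughly following y = 2x.
full = [(0.5, 1.0), (1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
w, rho = 2.0, 0.1

best = select_subset(w, full, k=2, rho=rho)
print("selected subset:", best)
print("sharpness gap:", curvature_gap(w, full, list(best), rho))
```

Brute-force enumeration is only feasible for tiny datasets. It is used here so that the matching criterion itself stays visible, without committing to any particular optimization scheme.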

Summary (original)

Training neural networks on a large dataset requires substantial computational costs. Dataset reduction selects or synthesizes data instances based on the large dataset, while minimizing the degradation in generalization performance from the full dataset. Existing methods utilize the neural network during the dataset reduction procedure, so the model parameter becomes an important factor in preserving the performance after reduction. By depending upon the importance of parameters, this paper introduces a new reduction objective, coined LCMat, which Matches the Loss Curvatures of the original dataset and reduced dataset over the model parameter space, more than the parameter point. This new objective induces a better adaptation of the reduced dataset on the perturbed parameter region than the exact point matching. Particularly, we identify the worst case of the loss curvature gap from the local parameter region, and we derive the implementable upper bound of such worst-case with theoretical analyses. Our experiments on both coreset selection and condensation benchmarks illustrate that LCMat shows better generalization performances than existing baselines.

arXiv information

Authors	Seungjae Shin, Heesun Bae, Donghyeok Shin, Weonyoung Joo, Il-Chul Moon
Published	2023-03-08 08:59:04+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
