Bayesian post-hoc regularization of random forests

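Summary

Random forests are powerful ensemble learning algorithms that are widely used across machine learning tasks.
However, they tend to overfit noisy or irrelevant features, which can degrade generalization performance.
Post-hoc regularization techniques aim to mitigate this problem by modifying the structure of the learned ensemble after training.
Here, the author proposes Bayesian post-hoc regularization, which leverages the reliable patterns captured by leaf nodes close to the root while potentially reducing the impact of the more specific, possibly noisy leaf nodes deeper in the tree.
This approach amounts to a form of pruning that leaves the general structure of the trees unchanged and instead adjusts the influence of each leaf node according to its proximity to the root node.
The method was evaluated on a variety of machine learning data sets.
It performs competitively with state-of-the-art methods and, in certain cases, surpasses them in predictive accuracy and generalization.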

Summary (original)

Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks. However, they have a tendency to overfit noisy or irrelevant features, which can result in decreased generalization performance. Post-hoc regularization techniques aim to mitigate this issue by modifying the structure of the learned ensemble after its training. Here, we propose Bayesian post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root, while potentially reducing the impact of more specific and potentially noisy leaf nodes deeper in the tree. This approach allows for a form of pruning that does not alter the general structure of the trees but rather adjusts the influence of leaf nodes based on their proximity to the root node. We have evaluated the performance of our method on various machine learning data sets. Our approach demonstrates competitive performance with the state-of-the-art methods and, in certain cases, surpasses them in terms of predictive accuracy and generalization.
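The abstract does not give the update rule, so the following is only a minimal sketch of one way the idea could look for binary classification, not the paper's actual procedure. It assumes a Beta prior on the positive-class probability that is updated with the class counts of every node on the root-to-leaf path, so that shallow nodes (which hold more samples) temper the estimate of a small, deep leaf. The function name `shrunk_leaf_probability`, the `path_counts` layout, and the default `alpha` and `beta` values are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def shrunk_leaf_probability(
    path_counts: List[Tuple[int, int]],
    alpha: float = 1.0,
    beta: float = 1.0,
) -> float:
    """Posterior mean of the positive-class probability for one leaf.

    path_counts holds (n_positive, n_negative) training counts for each
    node on the path from the root (first entry) to the leaf (last entry).
    alpha and beta are the parameters of an assumed Beta prior.
    """
    if not path_counts:
        raise ValueError("path_counts must contain at least the root node")
    for n_pos, n_neg in path_counts:
        if n_pos < 0 or n_neg < 0:
            raise ValueError("counts must be non-negative")
        # Conjugate Beta-Binomial update with this node's class counts.
        alpha += n_pos
        beta += n_neg
    # Mean of the resulting Beta(alpha, beta) posterior.
    return alpha / (alpha + beta)


# Hypothetical path: balanced root, a purer child, and a tiny pure leaf.
path = [(50, 50), (20, 5), (3, 0)]
raw_leaf = path[-1][0] / sum(path[-1])
regularized = shrunk_leaf_probability(path)
print(f"raw leaf estimate:    {raw_leaf:.3f}")
print(f"regularized estimate: {regularized:.3f}")
```

In this toy path the leaf contains only three samples, so its raw estimate is extreme. Under the assumed update, the much larger counts near the root pull the estimate back toward the root's class balance, while the tree structure itself is left untouched.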

arXiv information

Author	Bastian Pfeifer
Published	2023-06-06 14:15:29+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
