Imputation using training labels and classification via label imputation

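Summary

Missing data is a common problem in practical settings.
Various imputation methods have been developed to deal with it.
However, although labels are usually available in the training data, common imputation practice relies only on the input and ignores the labels.
This work shows how stacking the label onto the input can significantly improve imputation of the input.
It also proposes a classification strategy that initializes the predicted test labels as missing values and stacks those labels with the input for imputation.
This allows the label and the input to be imputed at the same time.
The technique can also handle training data with missing labels without any prior imputation, and it applies to continuous, categorical, or mixed-type data.
Experiments show promising results in terms of accuracy.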

Abstract (original)

Missing data is a common problem in practical settings. Various imputation methods have been developed to deal with missing data. However, even though the label is usually available in the training data, the common practice of imputation usually only relies on the input and ignores the label. In this work, we illustrate how stacking the label into the input can significantly improve the imputation of the input. In addition, we propose a classification strategy that initializes the predicted test label with missing values and stacks the label with the input for imputation. This allows imputing the label and the input at the same time. Also, the technique is capable of handling data training with missing labels without any prior imputation and is applicable to continuous, categorical, or mixed-type data. Experiments show promising results in terms of accuracy.
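
The stacking idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not name a specific imputer, so a simple nearest-neighbour imputer stands in for it here, and the helper names `stack_labels` and `nn_impute`, the toy data, and the numeric 0/1 label encoding are all assumptions made for the example.

```python
import math

def stack_labels(X, y):
    """Append the label as an extra column; None marks a missing value."""
    return [list(row) + [label] for row, label in zip(X, y)]

def nn_impute(rows):
    """Stand-in imputer (an assumption, not the paper's method).

    Each None is filled from the nearest other row that has that column
    observed. Distance is the root-mean-square difference over the
    columns observed in both rows, so the stacked label column also
    guides the imputation of missing features in training rows.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            best_dist, best_val = math.inf, None
            for k, other in enumerate(rows):
                if k == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                if dist < best_dist:
                    best_dist, best_val = dist, other[j]
            filled[i][j] = best_val
    return filled

# Toy data: two features, binary labels, some missing feature values.
X_train = [[1.0, 1.2], [0.9, None], [5.0, 5.1], [None, 4.8]]
y_train = [0, 0, 1, 1]
X_test = [[1.1, None], [4.9, 5.0]]

# Test labels start out as missing values and are stacked like any column.
stacked = (stack_labels(X_train, y_train)
           + stack_labels(X_test, [None] * len(X_test)))

# One imputation pass fills missing features and missing labels together.
imputed = nn_impute(stacked)

# The imputed label column of the test rows is the classifier's prediction.
y_pred = [row[-1] for row in imputed[len(X_train):]]
print(y_pred)
```

Because the test labels are just another column of missing values, the same pass that fills in missing features also produces the predictions. Training rows with missing labels could be passed in the same way, with `None` in the label position.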

arXiv information

Authors	Thu Nguyen, Pål Halvorsen, Michael A. Riegler
Published	2023-11-28 15:26:09+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
