Class imbalance is a critical problem in classification tasks arising in many real-world applications. Many solutions have been proposed in the literature, at both the algorithmic and the data level. In this paper we analyze the second, data-level kind of approach and, in particular, focus on the use of Multiple Classification Systems in which each classifier is trained on a dataset containing the minority class and a subset of the majority class samples. The aim of this approach is to avoid the drawbacks of other methods commonly used in this context, which force a balanced distribution by oversampling the minority class. We compare the results obtained by applying different realizations of the method to datasets from the UCI Repository.
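The training scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the majority subsets are drawn, which base classifier is used, or how the outputs are combined. The sketch assumes a random disjoint partition of the majority class, a nearest-centroid base learner, and majority voting. All function names are hypothetical.

```python
import random
from collections import Counter


def split_majority(majority, n_subsets, rng):
    """Shuffle the majority samples and deal them into disjoint subsets.
    (Assumption: the abstract does not specify the subset selection rule.)"""
    shuffled = list(majority)
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def train_base(minority, majority_subset):
    """Placeholder base learner (an assumption): one centroid per class.
    Label 1 is the minority class, label 0 the majority class."""
    return {1: centroid(minority), 0: centroid(majority_subset)}


def predict_base(model, x):
    """Return the label of the closest class centroid."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sqdist(model[label]))


def train_ensemble(minority, majority, n_subsets=None, seed=0):
    """Train one base classifier per majority subset; each one sees all
    minority samples plus its own subset of the majority samples."""
    rng = random.Random(seed)
    if n_subsets is None:
        # Roughly balanced training sets: subset size close to minority size.
        n_subsets = max(1, len(majority) // len(minority))
    subsets = split_majority(majority, n_subsets, rng)
    return [train_base(minority, subset) for subset in subsets]


def predict_ensemble(models, x):
    """Combine the base decisions by majority vote (an assumed rule)."""
    votes = Counter(predict_base(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data: 10 minority points near (3, 3), 90 majority points near (0, 0).
data_rng = random.Random(42)
minority = [(3 + data_rng.gauss(0, 0.5), 3 + data_rng.gauss(0, 0.5)) for _ in range(10)]
majority = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(90)]
models = train_ensemble(minority, majority)
```

With a 9:1 imbalance, the default setting yields nine classifiers, each trained on ten minority and ten majority samples, so no minority sample is duplicated and no majority sample is discarded from the ensemble as a whole.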
MCS-based Balancing Techniques for Skewed Classes: an Empirical Comparison
RICAMATO, Maria Teresa; MARROCCO, Claudio; TORTORELLA, Francesco
2008-01-01
File | Size | Format |
---|---|---|
ICPR 2008.unbalanced.pdf (open access; type: post-print document; license: DRM not defined) | 254.53 kB | Adobe PDF |