A ROC-based Reject Rule for Dichotomizers
TORTORELLA, Francesco
2005-01-01
Abstract
Many complex classification tasks involve discriminating between two classes. Because a classification error in such cases can often have serious consequences, the classifiers employed should be highly reliable so as to avoid erroneous decisions. Unfortunately, this is difficult to achieve in real situations, where the classifier can encounter samples very different from those examined in the training phase. Moreover, the cost of a wrong classification can be so high that it is preferable to reject a sample that gives rise to an unreliable result. However, despite its relevance, a reject option specifically devised for dichotomizers (i.e. two-class classifiers) has not yet been proposed. This paper presents a novel reject rule for dichotomizers, based on the Receiver Operating Characteristic (ROC) curve. The rule minimizes the expected classification cost, defined on the basis of the classification and error costs specific to the application at hand. Experiments performed with different classifier architectures on several publicly available data sets confirmed the effectiveness of the proposed reject rule.
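To make the idea of a cost-minimizing reject rule concrete, the following is a minimal sketch and not the procedure from the paper. It assumes the dichotomizer outputs a real-valued score, that the rule takes the form of two thresholds on that score (reject when the score falls between them), and that the thresholds are found by exhaustive search on labeled validation scores rather than from the ROC curve. The function names, the cost dictionary keys and the sample data are all hypothetical.

```python
from itertools import combinations_with_replacement


def expected_cost(scores, labels, t_low, t_high, costs):
    """Average cost per sample for a two-threshold rule (an assumed rule form).

    score >  t_high          : classify as positive
    score <= t_low           : classify as negative
    t_low < score <= t_high  : reject
    """
    total = 0.0
    for s, y in zip(scores, labels):
        if s > t_high:
            total += costs["TP"] if y == 1 else costs["FP"]
        elif s <= t_low:
            total += costs["TN"] if y == 0 else costs["FN"]
        else:
            total += costs["R"]
    return total / len(scores)


def best_reject_thresholds(scores, labels, costs):
    """Brute-force search over threshold pairs with t_low <= t_high.

    Candidates are -inf plus every distinct score, which covers every
    distinct partition of the samples into negative / reject / positive.
    Returns (cost, t_low, t_high) for the first pair with minimal cost.
    """
    candidates = [float("-inf")] + sorted(set(scores))
    best = None
    for t_low, t_high in combinations_with_replacement(candidates, 2):
        c = expected_cost(scores, labels, t_low, t_high, costs)
        if best is None or c < best[0]:
            best = (c, t_low, t_high)
    return best


# Hypothetical validation data: classifier scores and true labels (1 = positive).
scores = [0.1, 0.2, 0.45, 0.55, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

# Hypothetical application costs: errors cost 1, a rejection costs 0.2.
costs = {"TP": 0.0, "TN": 0.0, "FP": 1.0, "FN": 1.0, "R": 0.2}

cost, t_low, t_high = best_reject_thresholds(scores, labels, costs)
print(cost, t_low, t_high)
```

With these illustrative costs, rejecting the two overlapping samples is cheaper than misclassifying either one, so the search selects a reject band around them. If the rejection cost is raised enough (for example to 0.6), the two thresholds coincide and the rule degenerates to an ordinary single-threshold classifier.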