Abstract:
Learning from examples is a frequently arising challenge, with a
large number of algorithms proposed in the classification, data mining
and machine learning literature. The evaluation of the quality of such
algorithms is frequently carried out ex post, on an experimental basis:
their performance is measured either by cross validation on benchmark
data sets, or by clinical trials. Few of these approaches evaluate the
learning process ex ante, on its own merits. In this paper, we discuss
a property of rule-based classifiers which we call "justifiability",
and which focuses on the type of information extracted from the given
training set in order to classify new observations. We investigate some
interesting mathematical properties of justifiable classifiers. In particular,
we establish the existence of justifiable classifiers, and we show
that several well-known learning approaches, such as decision trees or
nearest-neighbor-based methods, automatically provide justifiable
classifiers. We also identify maximal subsets of observations which must
be classified in the same way by every justifiable classifier. Finally, we
illustrate by a numerical example that using classifiers based on "most
justifiable" rules does not seem to lead to overfitting, even though it
involves an element of optimization.