Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. Journal of Machine Learning Research 3, 1157-1182 (2003)
Saeys, Y., Inza, I., Larranaga, P.: A review of feature selection techniques in bioin-formatics. Bioinformatics 23(19), 2507-2517 (2007)
Alter, O., Brown, P., Botstein, D.: Singular value decomposition for genome-wide expression data processing and modeling. Proceedings of the National Academy of Sciences 97, 10101-10106 (2000)
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273-324 (1997)
Ghosh, D., Chinnaiyan, A.M.: Classification and selection of biomarkers in genomic data using lasso. J. Biomed. Biotechnol. 2, 147-154 (2005)
Bell, D.A., Wang, H.: A formalism for relevance and its application in feature subset selection. Machine Learning 41(2), 175-195 (2000)
Yu, L., Liu, H.: Efficient feature selection via analysis of relevance and redundancy. Journal of Machine Learning Research 5, 1205-1224 (2004)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)
Meyer, P.E., Bontempi, G.: On the use of variable complementarity for feature selection in cancer classification. In: Rothlauf, F., Branke, J., Cagnoni, S., Costa, E., Cotta, C., Drechsler, R., Lutton, E., Machado, P., Moore, J.H., Romero, J., Smith, G.D., Squillero, G., Takagi, H. (eds.) EvoWorkshops 2006. LNCS, vol. 3907, pp. 91-102. Springer, Heidelberg (2006)
Wolpert, D.H., Kohavi, R.: Bias plus variance decomposition for zero-one loss functions. In: Prooceedings of the 13th International Conference on Machine Learning, pp. 275-283 (1996)
Devroye, L., Györfi, L., Lugosi, G.: A Probabilistic Theory of Pattern Recognition. Springer, Heidelberg (1996)
Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning. Springer, Heidelberg (2001)
Cover., T.M.: Learning in Pattern Recognition. In: Methodologies of Pattern Recognition. Academic Press, London (1969)
Fukunaga, K., Kessel, D.L.: Nonparametric bayes error estimation using unclassified samples. IEEE Transactions on Information Theory 19(4), 434-440 (1973)
Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, Chich-ester (1976)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley, New York (1990)
Li, T., Zhang, C., Ogihara, M.: A comparative study of feature selection and mul-ticlass classification methods for tissue classification based on gene expression. Bioinformatics 20(15), 2429-2437 (2004)
Dudoit, S., van der Laan, M.J.: Asymptotics of cross-validated risk estimation in estimator selection and performance assessment. Statistical Methodology 2(2), 131-154 (2005)
Tresp, V., Taniguchi, M.: Combining estimators using non-constant weighting functions. In: NIPS. MIT Press, Cambridge (1995)
Birattari, M., Bontempi, G., Bersini, H.: Lazy learning meets the recursive least-squares algorithm. In: Kearns, M.S., Solla, S.A., Cohn, D.A. (eds.) NIPS 11, pp. 375-381. MIT Press, Cambridge (1999)
Ambroise, C., McLachlan, G.: Selection bias in gene extraction on the basis of microarray gene-expression data. PNAS 99, 6562-6566 (2002)