In this article, we propose a method for evaluating feature ranking algorithms. A feature ranking algorithm estimates the importance of descriptive features for predicting the target variable, and the proposed method assesses the correctness of these importance values by computing the error measures of two chains of predictive models. The models in the first chain are built on nested subsets of top-ranked features, while the models in the second chain are built on nested subsets of bottom-ranked features. We investigate which predictive models are appropriate for building these chains and show empirically that the proposed method gives meaningful results and can detect differences in feature ranking quality. This is demonstrated first on synthetic data and then on several real-world classification benchmark problems.
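The sketch below illustrates the idea of the two error curves on a small scale. It is a minimal illustration, not the authors' reference implementation: it assumes scikit-learn is available, and the mutual-information ranking and the k-nearest-neighbours base learner are illustrative choices rather than the exact setup evaluated in the paper.

```python
# Minimal sketch of the error-curve idea: given a feature ranking, train one chain
# of models on the top-k features and another on the bottom-k features, and record
# a cross-validated error for each k. Assumes scikit-learn; the ranking method and
# base learner are illustrative choices, not necessarily those used in the paper.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def error_curve(X, y, feature_order, base_model):
    """Error (1 - accuracy) of models built on nested prefixes of feature_order."""
    errors = []
    for k in range(1, len(feature_order) + 1):
        subset = feature_order[:k]
        acc = cross_val_score(base_model, X[:, subset], y,
                              cv=5, scoring="accuracy").mean()
        errors.append(1.0 - acc)
    return np.array(errors)

X, y = load_breast_cancer(return_X_y=True)

# Example ranking: features sorted by decreasing mutual information with the target.
scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]  # best-ranked feature first

model = KNeighborsClassifier(n_neighbors=5)
top_curve = error_curve(X, y, ranking, model)           # chain over top-ranked features
bottom_curve = error_curve(X, y, ranking[::-1], model)  # chain over bottom-ranked features

# For a good ranking the top curve should lie below the bottom curve, especially
# for small k; the gap between the two curves reflects ranking quality.
print("top-k errors   :", np.round(top_curve[:5], 3))
print("bottom-k errors:", np.round(bottom_curve[:5], 3))
```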
Disciplines:
Computer science
Author, co-author:
Slavkov, Ivica; Jozef Stefan Institute, Ljubljana, Slovenia
Petkovic, Matej; Jozef Stefan Institute, Ljubljana, Slovenia
Geurts, Pierre; Université de Liège - ULiège > Dept. of Electrical Engineering and Computer Science (Institut Montefiore) > Algorithmics of systems interacting with the physical world
Kocev, Dragi; Jozef Stefan Institute, Ljubljana, Slovenia
Dzeroski, Saso; Jozef Stefan Institute, Ljubljana, Slovenia
Language:
English
Title:
Error curves for evaluating the quality of feature rankings