Improving specific class mapping from remotely sensed data by cost-sensitive learning

In many remote-sensing projects, one is usually interested in a small number of land-cover classes present in a study area, not in all the land-cover classes that make up the landscape. Previous studies in supervised classification of satellite images have tackled the specific class mapping problem by isolating the classes of interest, combining all other classes into one large class, usually called others, and developing a binary classifier to discriminate the class of interest from the others. Here, this approach is called the focused approach. Its strength is that it decomposes the original multi-class supervised classification problem into a binary classification problem, focusing the process on the discrimination of the class of interest. Previous studies have shown that this method discriminates the classes of interest more accurately than the standard multi-class supervised approach. However, it may be susceptible to data imbalance in the training data set, since the classes of interest often make up only a small part of the training set. As a result, the classification may be biased towards the largest classes and thus be sub-optimal for discriminating the classes of interest. This study presents a way to minimize the effects of data imbalance in specific class mapping using cost-sensitive learning. In this approach, errors committed in the minority class are treated as costlier than errors committed in the majority class. Cost-sensitive approaches are typically implemented by weighting training data points according to their importance to the analysis. By changing the weights of individual data points, it is possible to shift weight from the larger classes to the smaller ones, balancing the data set.
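The relabelling and weighting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the weights were chosen, so the inverse-frequency rule used here (n_samples / (n_classes * class_count)) is an assumption, and the function names and label counts are hypothetical.

```python
from collections import Counter


def to_focused_labels(labels, classes_of_interest, other_label="others"):
    """Keep the classes of interest and merge every other class into one."""
    keep = set(classes_of_interest)
    return [lab if lab in keep else other_label for lab in labels]


def inverse_frequency_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count).

    Smaller classes receive larger weights, so every class contributes
    the same total weight to the training objective.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


# Hypothetical, deliberately imbalanced training labels (not study data).
labels = ["high-mangrove"] * 10 + ["water"] * 50 + ["bare soil"] * 40

# Focused approach: one class of interest versus everything else.
focused = to_focused_labels(labels, ["high-mangrove"])

# Cost-sensitive step: one weight per class, expanded to one per sample.
weights = inverse_frequency_weights(focused)
sample_weights = [weights[lab] for lab in focused]

# Total weight carried by each class after reweighting.
totals = Counter()
for lab, w in zip(focused, sample_weights):
    totals[lab] += w
```

With these weights the minority class and the merged others class carry equal total weight, which is one way of "balancing the data set". Per-sample or per-class weights of this kind would then be supplied to a weighted support vector machine during training.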
To illustrate the use of the cost-sensitive approach to map specific classes of interest, a series of experiments with a weighted support vector machine classifier and Landsat Thematic Mapper data was conducted to discriminate two types of mangrove forest (high-mangrove and low-mangrove) in the Saloum estuary, Senegal, a United Nations Educational, Scientific and Cultural Organisation World Heritage site. Results suggest an increase in overall classification accuracy with the cost-sensitive method (97.3%) over the standard multi-class (94.3%) and focused (91.0%) approaches. In particular, the cost-sensitive method yielded higher sensitivity and specificity values in the discrimination of the classes of interest than the standard multi-class and focused approaches.
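For reference, the three measures reported above can be computed from reference and predicted labels as in the following sketch. The labels are hypothetical and do not reproduce the study's results; the function name `binary_metrics` is introduced here for illustration only.

```python
def binary_metrics(y_true, y_pred, positive):
    """Sensitivity, specificity and overall accuracy for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)

    def ratio(num, den):
        # Undefined when the reference data hold no samples of that kind.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of the class of interest found
        "specificity": ratio(tn, tn + fp),  # share of the others correctly rejected
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Hypothetical reference and predicted labels (not study data).
y_true = ["mangrove"] * 4 + ["others"] * 6
y_pred = ["mangrove"] * 3 + ["others"] * 1 + ["others"] * 5 + ["mangrove"] * 1

metrics = binary_metrics(y_true, y_pred, positive="mangrove")
```

Reporting sensitivity and specificity alongside overall accuracy matters under imbalance, because a classifier biased towards the largest classes can score a high overall accuracy while missing much of the class of interest.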

Disciplines :

Engineering, computing & technology: Multidisciplinary, general & others

Author, co-author :

Silva, Joel; NOVA Information Management School, Universidade Nova de Lisboa, Lisboa, Portugal

Bacao, Fernando; NOVA Information Management School, Universidade Nova de Lisboa, Lisboa, Portugal

Dieng, Ndeye Maguette ; Université de Liège - ULiège Université de Dakar - UCAD > Geologie > Doct. sc. ingé. (architecture, génie civ. & géol.)

Foody, Giles; School of Geography, University of Nottingham, Nottingham, UK

Caetano, Mario; Direção Geral do Território, Lisboa, Portugal

Language :

English

Title :

Improving specific class mapping from remotely sensed data by cost-sensitive learning

Publication date :

January 2017

Journal title :

International Journal of Remote Sensing

ISSN :

0143-1161

eISSN :

1366-5901

Publisher :

Taylor & Francis, Abingdon, United Kingdom

Volume :

Issue :

Pages :

3294–3316

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBi :

since 07 September 2017
