Abstract:
[en] Researchers have made extensive use of machine learning and data mining techniques to build prediction models and classify data in domains such as aviation, computer science, education, finance, and marketing, and particularly in medicine, where these methods serve as decision-support systems for diagnosis and analysis. This paper assesses the effectiveness and efficiency of individual and ensemble machine learning techniques in terms of accuracy, specificity, sensitivity, and precision. The main objective is to identify the most effective machine learning approach for breast cancer diagnosis and prediction. To this end, we applied individual base-level algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naïve Bayes (NB), Decision Tree (C4.5), and Simple Logistic, together with well-known ensemble methods such as Majority Voting and Random Forest, using 10-fold cross-validation on the Breast Cancer Diagnosis Dataset obtained from the UCI Repository. The experimental results show that the Majority Voting ensemble built on the three top classifiers (SVM, KNN, and Simple Logistic) gives the highest accuracy, 98.1%, with the lowest reported error rate, 0.01%, and outperforms every individual classifier. The study thus indicates that the proposed approach based on the Majority Voting ensemble was the best-performing classification model, with the highest accuracy, for breast cancer prediction and diagnosis among those compared. All experiments were carried out in a simulation environment using the Weka data mining tool.
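As a rough illustration of the two ideas the abstract relies on, the sketch below shows hard majority voting over three classifiers and the four reported metrics computed from a confusion matrix. It is a minimal pure-Python sketch, not the Weka setup used in the study: the label values `"M"`/`"B"`, the helper names `majority_vote` and `binary_metrics`, and the six toy predictions are all assumptions made up for illustration and carry no experimental meaning.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: for each sample, return the label most classifiers chose.

    `predictions` is a list of per-classifier label lists of equal length.
    With an odd number of voters and two classes, ties cannot occur.
    """
    # zip(*predictions) groups together the votes cast for each sample.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def _ratio(numerator, denominator):
    # Guard against empty denominators (e.g. no positive predictions).
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive="M"):
    """Accuracy, sensitivity, specificity and precision for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),
    }


# Hypothetical toy labels: "M" = malignant (positive), "B" = benign.
y_true = ["M", "M", "B", "B", "M", "B"]
svm_pred = ["M", "M", "B", "B", "B", "B"]
knn_pred = ["M", "B", "B", "M", "M", "M"]
logistic_pred = ["M", "M", "B", "B", "M", "M"]

ensemble = majority_vote([svm_pred, knn_pred, logistic_pred])
metrics = binary_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

In this toy case each classifier makes at least one mistake on the first five samples, yet the vote recovers the correct label wherever two of the three agree on it; the sixth sample shows the opposite case, where two classifiers share the same error and the ensemble inherits it. In a 10-fold cross-validation setting, the same metrics would be computed on each held-out fold and then averaged.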