PREDICTION OF DIABETES SCREENING BY USING DATA MINING ALGORITHMS

  • Aberham Tadesse Zemedkun, Rift Valley University, School of Post Graduate Studies, Department of Computer Science, P.O. Box 80734, Addis Ababa, Ethiopia. https://orcid.org/0000-0001-7706-748X
Keywords: Diabetes, Data Mining, ML, J48, PART, JRIP, Naïve Bayes

Abstract

Diabetes is one of the most common non-communicable diseases in the world. It affects the body's ability to produce the hormone insulin, and complications may occur when it remains unidentified and untreated. These complications contribute significantly to increased morbidity, mortality, and patient admission rates in both developed and developing countries, which makes early detection important. In this study, medical records of cases were reviewed retrospectively, and anthropometric and biochemical information was collected. Using these data, four machine learning (ML) classification algorithms, namely Decision Tree (J48), Naïve Bayes, PART rule induction, and JRIP, were used to predict diabetes. Precision, recall, F-measure, Receiver Operating Characteristic (ROC) scores, and the confusion matrix were calculated to assess the performance of each algorithm; sensitivity and specificity were also measured. All four algorithms achieved high classification accuracy and were broadly comparable in distinguishing diabetic from non-diabetic patients. Among the algorithms tested, the Decision Tree classifier (J48) was the best predictor, with the highest classification accuracy of 92.74%.
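
The evaluation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the study's own code: the class name ConfusionMatrix and the counts used (80, 10, 20, 90) are hypothetical values chosen for illustration and are not results from the paper.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with 'diabetic' as the positive class (hypothetical helper)."""
    tp: int  # diabetic patients predicted diabetic
    fp: int  # non-diabetic patients predicted diabetic
    fn: int  # diabetic patients predicted non-diabetic
    tn: int  # non-diabetic patients predicted non-diabetic

    def accuracy(self) -> float:
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    def recall(self) -> float:
        # Recall for the positive class is the same quantity as sensitivity.
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        return _ratio(self.tn, self.tn + self.fp)

    def f_measure(self) -> float:
        # Harmonic mean of precision and recall, written in count form.
        return _ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn)


# Illustrative counts only (assumed, not taken from the study).
cm = ConfusionMatrix(tp=80, fp=10, fn=20, tn=90)
for name in ("accuracy", "precision", "recall", "specificity", "f_measure"):
    print(f"{name}: {getattr(cm, name)():.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters for screening, because a classifier can reach high accuracy on imbalanced data while still missing many positive cases.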

Published
2021-12-24
How to Cite
Zemedkun, A. T. (2021). PREDICTION OF DIABETES SCREENING BY USING DATA MINING ALGORITHMS. International Journal of Engineering Science Technologies, 5(6), 87-101. https://doi.org/10.29121/ijoest.v5.i6.2021.253