FELICITATION OF MEDICAL EXPERTISE IN CANCER THROUGH MACHINE LEARNING MODELS WITH KNOWLEDGE DATA DISCOVERY (KDD)
Keywords:
Breast Cancer, Supervised Learning, Classification, ROC-AUC, Interpretability, Machine Learning
Abstract [English]
Cancer remains one of the leading causes of mortality worldwide and poses a significant challenge to modern healthcare systems. Breast cancer is the most commonly diagnosed malignancy among women worldwide, and early detection is critical to improving patient outcomes. This paper presents an evaluation of supervised machine learning for breast cancer diagnosis using a clinical features dataset (569 samples, 30 numeric features). After preprocessing the dataset, five supervised classifiers were compared: Logistic Regression, Decision Tree, Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (K-NN). The models were evaluated on accuracy, precision, recall, F1 score, and ROC-AUC using a stratified test split. Logistic Regression achieved the highest ROC-AUC (99.6%) with an overall accuracy of 97% on the test set, closely followed by SVM and Random Forest. The paper also discusses model interpretability, robustness, clinical implications, and the scope for future improvement.
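The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical. ROC-AUC is computed here as the fraction of positive/negative pairs in which the positive sample receives the higher score (ties count as half), which is one standard way to define it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from hard 0/1 predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, y_score):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; these values do not come from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank the samples, so a model can lead on ROC-AUC without leading on every threshold-based metric.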
License
Copyright (c) 2025 Rashid Hussain, Aminu Abdullahi, Baffa Sani Mahmoud

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY licence, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.















