GRADIENT-BOOSTED CAUSAL INFERENCE FRAMEWORK FOR POLICY RECOMMENDATION IN SMART GOVERNANCE SYSTEMS

Prerana Nilesh Khairnar

doi:10.29121/shodhkosh.v4.i2.2023.5543

Authors

Dr. Prerana Nilesh Khairnar, Assistant Professor, Department of Computer Engineering, Sir Visvesvaraya Institute of Technology, Chincholi, Nashik, Maharashtra

Keywords:

Causal Inference, X-Learner, XGBoost, Policy Recommendation, Smart Governance, EconML, Treatment Effect Estimation

Abstract [English]

This study presents a Gradient-Boosted Causal Inference Framework to support effective data-driven policy decision making in smart governance systems. The framework combines the X-Learner algorithm with XGBoost to allow accurate estimation of individual treatment effects (ITE) in mixed and complex data scenarios. Built on the EconML library, the model integrates causal inference with advanced machine learning methods, improving both predictive power and the interpretability of the causal estimates. On simulated governance datasets, the framework showed major improvements in policy value and treatment effect estimation over conventional models. SHAP-based analysis further increases transparency by letting policymakers inspect feature influence and decision pathways. The proposed system is robust to imbalanced treatment groups and non-linear policy effects, and it offers a scalable approach to governance support across a number of domains. The findings suggest that the approach can facilitate individualized, effective, and evidence-based interventions in smart city settings and encourage more responsive and responsible societal decision making.
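The three X-Learner stages named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: it swaps the XGBoost base learners and the EconML estimator for plain least-squares fits in NumPy so that it stays self-contained. The simulated data, the helper names (fit_linear, x_learner), and the constant propensity of 0.2 are all assumptions made for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a predict function.

    Stand-in base learner (assumption): the framework described above
    uses XGBoost here instead.
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef


def x_learner(X, T, Y, propensity):
    """Return a function that estimates the conditional treatment effect."""
    X0, Y0 = X[T == 0], Y[T == 0]
    X1, Y1 = X[T == 1], Y[T == 1]

    # Stage 1: one outcome model per treatment arm.
    mu0 = fit_linear(X0, Y0)
    mu1 = fit_linear(X1, Y1)

    # Stage 2: impute individual effects using the other arm's model,
    # then fit one effect model per arm on those imputed effects.
    tau1 = fit_linear(X1, Y1 - mu0(X1))
    tau0 = fit_linear(X0, mu1(X0) - Y0)

    # Stage 3: blend the two effect models with the propensity score g(x).
    def cate(Xn):
        g = propensity(Xn)
        return g * tau0(Xn) + (1.0 - g) * tau1(Xn)

    return cate


# Hypothetical simulated data with an imbalanced treatment group (~20% treated).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
T = (rng.random(n) < 0.2).astype(int)
true_tau = 1.0 + 2.0 * X[:, 0]            # heterogeneous policy effect
Y = X[:, 0] + X[:, 1] + T * true_tau + rng.normal(scale=0.5, size=n)

# The propensity is known by design in this simulation (assumption).
cate = x_learner(X, T, Y, propensity=lambda Xn: np.full(len(Xn), 0.2))
est = cate(X)

# A simple policy rule: recommend the intervention where the estimated effect is positive.
recommend = est > 0
print("mean absolute error of the effect estimate:", np.mean(np.abs(est - true_tau)))
print("share of units recommended for treatment:", recommend.mean())
```

With a propensity of 0.2, the blend in stage 3 puts most of the weight on the effect model fitted on the treated units, whose imputed effects rely on the outcome model trained on the larger control arm. This is the usual motivation for preferring the X-Learner when treatment groups are imbalanced.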
