Effects of environment and globalization on the double and triple burdens of infection symptoms among under-five children across low-middle income countries using machine learning algorithms
Fenta, Haile Mekonnen; Amegah, A Kofi; Rantala, Aino K; Paciência, Inês; Jaakkola, Jouni J K (2025-11-20)
Fenta, Haile Mekonnen
Amegah, A Kofi
Rantala, Aino K
Paciência, Inês
Jaakkola, Jouni J K
BioMed Central
20.11.2025
Fenta, H. M., Amegah, A. K., Rantala, A. K., Paciência, I., & Jaakkola, J. J. K. (2025). Effects of environment and globalization on the double and triple burdens of infection symptoms among under-five children across low-middle income countries using machine learning algorithms. Infectious Diseases of Poverty, 14(1), 117. https://doi.org/10.1186/s40249-025-01387-5
https://creativecommons.org/licenses/by/4.0/
© The Author(s) 2025. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202511216853
Abstract
Background:
Childhood infectious diseases and related symptoms, such as fever, cough, and diarrhea, are the leading cause of death among children in low- and middle-income countries (LMICs). We examined the environmental predictors of the double and triple burden (D/TB) of infection symptoms among under-five children using multilevel models and machine learning (ML) methods.
Methods:
We used Demographic and Health Surveys (DHS) data collected in 58 LMICs between 2000 and 2023. These data were merged with cluster-level particulate matter and nitrogen dioxide data from the National Aeronautics and Space Administration (NASA) and with country-level data on political, social, and economic globalization from the World Bank report. We applied multilevel models to identify the most important predictors of D/TB symptoms, then applied ML algorithms to predict these symptoms among children across LMICs. We trained and validated the ML algorithms on 80%, 70%, and 60% of the data and tested them on the remaining 20%, 30%, and 40%, respectively, using 2-, 5-, and 10-fold cross-validation.
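The evaluation protocol described above (three train:test splits crossed with three cross-validation settings) can be sketched as follows. This is a minimal, standard-library illustration, not the authors' code: the function names `holdout_split`, `kfold_indices`, and `protocol_grid` are hypothetical, the row count is arbitrary, and it assumes that cross-validation is run inside the training portion, which the abstract does not state explicitly. Model fitting (for example, a Random Forest on each fold) is deliberately left out.

```python
import random

# Train:test fractions and fold counts as listed in the abstract.
SPLITS = [(0.8, 0.2), (0.7, 0.3), (0.6, 0.4)]
FOLDS = [2, 5, 10]


def holdout_split(n_rows, train_fraction, seed=0):
    """Shuffle row indices and cut them into a training part and a test part."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = round(n_rows * train_fraction)
    return indices[:cut], indices[cut:]


def kfold_indices(train_indices, k):
    """Yield (fit, validate) index lists for k-fold cross-validation.

    Assumption: folds are drawn from the training portion only.
    """
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


def protocol_grid(n_rows, seed=0):
    """Enumerate every (train fraction, k) combination with its index sets."""
    grid = {}
    for train_fraction, _test_fraction in SPLITS:
        train, test = holdout_split(n_rows, train_fraction, seed)
        for k in FOLDS:
            grid[(train_fraction, k)] = (train, test, list(kfold_indices(train, k)))
    return grid


# Hypothetical example with 1,000 rows: the 80:20 split with 10 folds.
grid = protocol_grid(n_rows=1000)
train, test, folds = grid[(0.8, 10)]
print(len(train), len(test), len(folds))
```

Each of the nine grid entries holds a held-out test set that is never used for fitting, so a model chosen by cross-validation can still be scored on unseen rows.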
Results:
Of 1,546,243 children, 19.2%, 20.5%, and 12.6% had fever, cough, and diarrhea, respectively; the overall prevalence of double and triple burden was 11.9% and 3.7%, respectively. D/TB was associated with the child's location, survey year, wealth index, family size, air pollutants, and environmental covariates. The estimated prevalence of D/TB symptoms varied substantially across districts [intraclass correlation (ICC) = 13.3%] and countries (ICC = 8.8%). Random Forest achieved the highest area under the curve (AUC), 94% and 99% for double and triple burden, respectively, under the 10-fold cross-validation (K10) protocol with the 80:20 training-testing split.
Conclusions:
The study found substantial variation in the prevalence of D/TB of infection symptoms among children under five and identified several environmental and sociodemographic predictors of these outcomes. The Random Forest algorithm performed best in predicting these burdens. The findings show how integrating environmental and sociodemographic data with machine learning can support targeted interventions to reduce childhood infectious disease burdens in low- and middle-income countries.

