Machine-Learning Prognostic Models From The 2018–2020 Ebola Outbreak in Democratic Republic of Congo
Abstract
The 2018–2020 Ebola virus disease epidemic in the Democratic Republic of the Congo (DRC) resulted in 3481 cases (suspected and confirmed) and 2299 deaths, and the WHO declared the outbreak a global health emergency. Most patients are reported to have died before their antibody response could take effect, which highlights the need for better diagnostic and prognostic tools. This paper assesses and seeks to improve the accuracy of Ebola prediction algorithms using a variety of inputs, all based on the patient's symptoms in the early stages of the disease. The classification techniques evaluated are Decision Trees, k-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest, and Gradient Boosting. The experiments report the accuracy of each technique; the Support Vector Machine classifier gave the best predictive model for both diagnosis and prognosis, with an accuracy of 0.88. We plan to integrate these models into an Ebola prediction web app with an API built in Flask (Python), to help medical practitioners and the public with early diagnosis.
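To make the evaluation idea concrete, the following is a minimal sketch of one of the listed techniques, KNN, together with the accuracy measure used to compare classifiers. It is not the authors' implementation. The symptom-severity features, the toy records, and the names `TRAIN`, `TEST`, `knn_predict`, and `accuracy` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical toy records: severity scores (0-3) for three early symptoms,
# paired with an outcome label. Invented for illustration; NOT the study's data.
TRAIN = [
    ((3, 3, 2), "died"),
    ((3, 2, 3), "died"),
    ((2, 3, 3), "died"),
    ((1, 0, 0), "survived"),
    ((0, 1, 0), "survived"),
    ((0, 0, 1), "survived"),
]
TEST = [
    ((3, 3, 3), "died"),
    ((0, 0, 0), "survived"),
    ((2, 2, 2), "survived"),  # deliberately hard case: lies close to the "died" cluster
]


def knn_predict(train, x, k=3):
    """Return the majority label among the k training records closest to x."""
    nearest = sorted(train, key=lambda rec: dist(rec[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label equals the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


if __name__ == "__main__":
    for features, truth in TEST:
        print(features, "predicted:", knn_predict(TRAIN, features), "actual:", truth)
    print("accuracy:", accuracy(TRAIN, TEST))
```

Applying the same accuracy measure to each classifier on a held-out test set is what allows a comparison of the kind reported in the abstract. A real evaluation would use the clinical dataset and a proper train/test split or cross-validation rather than hand-written records.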
Keywords:
Decision Tree, Ebola Virus, Hybrid Neural Network, KNN, Random, Support Vector Machine, WHO
Article information
Journal
Scientific Journal of Engineering, and Technology
Volume (Issue)
2(1), (2025)
Pages
101-111
Published
Copyright
Copyright (c) 2025 Kalema Josue Djamba, Mugisha Sebakunzi Prince, Vincent Havyarimana, Lumande Kingutse Josue, Ciza Murhula Blaise, Kalema Daniel Jonathan (Author)
Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.