Research Article

Machine-Learning Prognostic Models From The 2018–2020 Ebola Outbreak in Democratic Republic of Congo

Authors

  • Kalema Josue Djamba, University of Burundi, Bujumbura, Burundi

    josuekalema@gmail.com

  • Mugisha Sebakunzi Prince, Department of Computer Engineering, Institut Superieur de Commerce, Goma Town, DRC
  • Vincent Havyarimana, Ecole Normale Superieur, University of Burundi, Bujumbura, Burundi
  • Lumande Kingutse Josue, Department of Computer Engineering, Institut Superieur de Commerce, Kiwanja City, DRC
  • Ciza Murhula Blaise, Department of Computer Science, Institut Superieur d’Informatique et de Gestion, Goma Town, DRC
  • Kalema Daniel Jonathan, Independent Research, ULB Cooperation, Goma Town, DRC

Abstract

The 2018–2020 Ebola virus disease epidemic in the Democratic Republic of the Congo (DRC) resulted in 3481 cases (suspected and confirmed) and 2299 deaths. The WHO declared the outbreak a global health emergency. Most patients are reported to have died before they could mount an antibody response, which highlights the need for better diagnostic and prognostic tools. This paper assesses and seeks to improve the accuracy of Ebola prediction algorithms using a variety of inputs, all derived from the patient’s symptoms in the early stages of the disease. The data mining techniques used are Decision Trees, k-Nearest Neighbors (KNN), Support Vector Machine, Random Forest, and Gradient Boosting classifiers. The experimental results report the accuracy of each classification technique; Support Vector Classification gives the best predictive model for both diagnosis and prognosis, with an accuracy of 0.88. We plan to integrate these models into an Ebola prediction web app with an API built in Flask (Python), to help medical practitioners and the public with early diagnosis.
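The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline: the data below are synthetic, and the feature count, label rule, train/test split, and default hyperparameters are all assumptions made only so the example runs. It shows the general shape of fitting the five named classifiers on binary symptom indicators and ranking them by held-out accuracy.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for clinical data (assumption): 400 hypothetical
# patients, 8 binary symptom indicators, one binary outcome label.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(400, 8))
# Invented label rule, for illustration only: outcome depends on the
# number of symptoms present plus Gaussian noise.
y = (X.sum(axis=1) + rng.normal(0, 1, size=400) > 4).astype(int)

# Hold out 25% of the rows for evaluation, preserving class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The five classifier families named in the abstract, with library defaults.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Support Vector Machine": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its accuracy on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from highest to lowest held-out accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

Because the data are synthetic, the printed accuracies say nothing about the 0.88 reported in the paper. With real patient records, cross-validation and metrics beyond accuracy (for example sensitivity) would likely be more informative for a diagnostic setting.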

Keywords:

Decision Tree, Ebola Virus, Hybrid Neural Network, KNN, Random, Support Vector Machine, WHO

Article information

Journal

Scientific Journal of Engineering, and Technology

Volume (Issue)

2(1), (2025)

Pages

101-111

Published

20-05-2025

How to Cite

Djamba, K. J., Prince, M. S., Havyarimana, V., Josue, L. K., Blaise, C. M., & Jonathan, K. D. (2025). Machine-Learning Prognostic Models From The 2018–2020 Ebola Outbreak in Democratic Republic of Congo. Scientific Journal of Engineering, and Technology, 2(1), 101-111. https://doi.org/10.69739/sjet.v2i1.470
