MODELING MICROBIOLOGICAL COUNTS IN PURIFIED WATER AT A HEALTHCARE FACILITY USING ARIMA
DOI:
https://doi.org/10.55197/qjmhs.v4i3.158
Keywords:
ARIMA, autocorrelation, bioburden, healthcare facility, Akaike information criterion corrected, purified water
Abstract
The microbiological quality of purified water is critical in the healthcare industry, where it must be safe for a range of applications. Understanding trends and forecasting microbial counts allow proactive control and protective measures to be taken before serious excursions occur and cause financial and health losses. This study analyzes microbial density, a key metric for monitoring the efficacy of water purification systems in healthcare facilities. The objective was to transform irregular, cumulative data into a regular time series and to identify the optimal ARIMA model for forecasting in support of predictive maintenance and regulatory compliance. Preliminary modeling with simpler approaches (linear, exponential, and Holt-Winters methods) did not give promising results. Descriptive statistics and distribution analysis, including the Johnson transformation for normality, were performed. ARIMA models with differencing orders d=0, d=1, and d=2 were fitted to the aggregated, cumulative, logarithmically transformed data series, and the best model at each order was selected by minimum AICc. Model adequacy was assessed through parameter significance and residual diagnostics (Ljung-Box test). Descriptive statistics showed that the aggregated series was non-normal (p<0.005). ARIMA(5, 0, 4) (d=0) performed poorly (AICc=319.39) and left residual autocorrelation. ARIMA(0, 2, 1) (d=2) showed improved fit (AICc=258.98) and white noise residuals (p>0.5). The ARIMA(2, 1, 2) model (d=1) was optimal (AICc=256.91), with all parameters significant and white noise residuals (p>0.3), effectively addressing non-stationarity. Forecasts from ARIMA(2, 1, 2) predict stable future growth. Of the models evaluated, ARIMA(2, 1, 2) with first-order differencing is therefore the most appropriate and robust for forecasting trends in these data. Its strong statistical fit and well-behaved residuals make it a valuable tool for predictive maintenance, resource optimization, and patient safety in healthcare water systems, provided model performance is continuously monitored. Future work should address limitations in the data and its processing and explore alternative models.
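The selection procedure summarized in the abstract (differencing, ranking candidates by AICc, and checking residuals with the Ljung-Box statistic) can be sketched as follows. This is a minimal illustration in plain Python and not the author's code: the function names are hypothetical, the AICc values are the ones quoted in the abstract, and the model fitting itself, which would normally be done with a statistics package, is omitted.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing (the 'I' in ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def aicc(log_likelihood, k, n):
    """Corrected AIC for k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag (statistic only, no p-value)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for lag in range(1, max_lag + 1):
        num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        rho = num / denom  # sample autocorrelation at this lag
        total += rho * rho / (n - lag)
    return n * (n + 2) * total


# AICc values reported in the abstract, keyed by (p, d, q).
reported_aicc = {(5, 0, 4): 319.39, (0, 2, 1): 258.98, (2, 1, 2): 256.91}

# The candidate with the smallest AICc is preferred.
best_order = min(reported_aicc, key=reported_aicc.get)
print(best_order)
```

In practice the Q statistic would be compared against a chi-square distribution to obtain the p-values quoted in the abstract; that step is left out here to keep the sketch dependency-free.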
License
Copyright (c) 2025 MOSTAFA ESSAM EISSA

This work is licensed under a Creative Commons Attribution 4.0 International License.