Application of time series analysis and machine learning to predict length of stay in pediatric intensive care units (ICUs)

Sep 23, 2020

(This article is based on a recent paper published in Respiratory Care and authored by David Castiñeira, Katherine R Schlosser, Alon Geva, Amir R Rahmani, Gaston Fiore, Brian K Walsh, Craig D Smallwood, John H Arnold, and Mauricio Santillana.)

What: We created a machine learning-based approach that extracts meaningful information from continuous-in-time vital sign data, recorded by bedside monitors during the first 24 h of a subject’s ICU stay while on mechanical ventilation, to predict prolonged length of stay (LOS). Our findings showed that combining subjects’ static clinical data with continuous-in-time vital sign data led to improved predictions. The framework introduced in this work was efficient and scalable, and it has the potential to be implemented in real-time settings as a decision-support tool. We also described a comprehensive evaluation framework for assessing the accuracy and robustness of our results.

Figure: Proposed methodology for feature engineering. GBT = Gradient Boosting Trees; LOS = Length of Stay.

Why: Bedside monitors in ICUs measure patients’ physiologic data in real time to continuously assess the health status of patients who are critically ill. These continuously monitored values often include vital signs (heart rate, breathing frequency, and oxygenation levels), electrocardiogram tracings, and mechanical ventilation parameters (FIO2, PEEP, peak inspiratory pressure) recorded during a patient’s stay. Continuous monitoring is designed to help clinicians intervene in a timely manner if a patient’s health status deteriorates; however, the massive amount of data displayed in a large ICU taxes human cognition.

With the recent advent of increased computational power and the ability to store and rapidly process large data sets, these continuous-in-time data show promise to be used not only as brief snapshots of information routinely absorbed by clinicians during their rounds but also to identify subjects’ health trends and predict events and specific outcomes during their ICU stay. This additional information may then be translated into early warning systems that may help improve the care of future patients.

How: We introduced a methodology designed to automatically extract information (features) from continuous-in-time vital sign data collected from bedside monitors to predict patient outcomes via unsupervised and supervised machine learning techniques. We applied this approach in a pilot study aimed at predicting the likelihood that a patient will experience a prolonged stay. Specifically, we showed (1) that continuous-in-time monitor data from the first 24 hours of a patient’s ICU stay while on mechanical ventilation have meaningful predictive power to identify stays longer than 4 days, (2) that model performance improved when subjects’ static clinical data were combined with continuous-in-time data from vital signs, (3) a parsimonious and scalable machine learning workflow that could be implemented in a real-time setting, and (4) a performance evaluation framework capable of showing that our predictive modeling approach led to robust findings and supported the validity of our results in the face of a relatively small sample size. The predictive power of our approach outperformed recent efforts by Google (Google, Mountain View, California) on a similar prediction task: Google reported an area under the curve of 0.85 to 0.86, compared with our area under the curve of 0.9 (95% CI 0.80 to 0.96).
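To make the feature-engineering idea concrete, here is a minimal sketch of turning one patient's first-24-h vital sign series into a single feature row and attaching the "stay longer than 4 days" label. The summary statistics (mean, spread, range, linear trend) are illustrative stand-ins and not the feature set used in the paper; the function names, the signal name `hr`, and the static field `age_years` are all hypothetical.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def summarize_signal(name, values):
    """Collapse one vital sign series into a few summary features (assumed set)."""
    return {
        f"{name}_mean": mean(values),
        f"{name}_std": pstdev(values),
        f"{name}_min": min(values),
        f"{name}_max": max(values),
        f"{name}_slope": slope(values),
    }


def build_feature_row(static, signals):
    """Combine static clinical data with time series summaries for one patient."""
    row = dict(static)
    for name, values in signals.items():
        row.update(summarize_signal(name, values))
    return row


def label_prolonged_stay(los_days, threshold_days=4.0):
    """1 if the stay exceeded the threshold (4 days in the text above), else 0."""
    return int(los_days > threshold_days)


# Hypothetical patient: heart rate samples from the first 24 h plus one static field.
row = build_feature_row({"age_years": 3}, {"hr": [100, 110, 120]})
```

Rows built this way could then be passed to any supervised classifier, such as the gradient boosting trees mentioned in the figure captions.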

Figure: Receiver operating characteristic curves for the 3 data types. A: All scenarios with total accuracy. B: Static clinical data. C: Time series data. D: Static clinical data plus time series data. The 10th, 50th, and 90th percentile curves are shown (P10, P50, P90).

Figure: Accuracies obtained with gradient boosting trees predictive models. The 10th and 90th percentiles are provided (P10, P90). LOS = Length of stay.
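The P10, P50, and P90 values in the figures summarize how performance varies across repeated evaluations. As a rough sketch of that idea, the code below computes the area under the ROC curve from labels and scores, then bootstraps the evaluation set to report percentile AUCs. Bootstrapping is an assumption made here for illustration; the article does not specify the resampling scheme used in the paper, and all function names are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def bootstrap_auc(labels, scores, n_rounds=500, seed=0):
    """Resample patients with replacement and summarize the spread of the AUC."""
    roc_auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_rounds:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the classes
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    return {f"P{q}": percentile(aucs, q) for q in (10, 50, 90)}
```

With a small sample, the gap between P10 and P90 gives a direct sense of how much a single reported AUC could move from one evaluation split to another.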

Value: Our study illustrated the utility of continuous-in-time information from bedside monitors when analyzed through the lens of a machine learning framework, and its potential application to resource allocation decision-making. The accuracy of our predictive modeling approach demonstrates how real-time, computer-generated decision-support systems might be designed. Moreover, our evaluation approach supported the validity and robustness of our findings. From a clinical standpoint, our modeling approach has an interesting property: it minimizes the number of false negatives, so that patients who are likely to experience prolonged ICU stays after mechanical ventilation are less frequently misclassified. From a resource allocation perspective, this model accurately predicted which ICU beds would not be available in the following half-week, which may, for example, help improve surgical scheduling, respiratory therapy staffing, or nurse staffing.
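One common way to keep false negatives low is to choose the decision threshold on the predicted probability so that sensitivity meets a target. The sketch below illustrates that trade-off; it is an assumption made for illustration and not necessarily how the paper's model achieved this property. The function names and the example scores are hypothetical.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are predicted as prolonged stays."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            counts["tp"] += 1
        elif predicted and y == 0:
            counts["fp"] += 1
        elif not predicted and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1  # a prolonged stay that the model missed
    return counts


def sensitivity(counts):
    """Fraction of true prolonged stays that were flagged."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else float("nan")


def pick_threshold(labels, scores, min_sensitivity=0.9):
    """Highest observed score usable as a threshold while meeting the sensitivity target."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity(confusion_counts(labels, scores, t)) >= min_sensitivity:
            return t
    return min(scores)


# Hypothetical validation set: 1 = stay longer than 4 days.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.3, 0.5, 0.2]
default = confusion_counts(labels, scores, 0.5)
strict = confusion_counts(labels, scores, pick_threshold(labels, scores, 1.0))
```

Lowering the threshold removes missed prolonged stays at the cost of more false alarms, so the target sensitivity is a clinical and operational choice rather than a purely statistical one.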

Our approach was novel in that it could identify patterns in subjects’ vital sign trends, not previously identified in the literature, that provide an early signal or indicator of an event or outcome. Because it builds on unsupervised and supervised machine learning, the workflow behind our strategy is scalable: for a given unseen patient, it can produce a prediction from a reduced set of important vital sign features that were identified a priori, during model training.

Our predictive platform has the potential to scale up and to learn continuously from every new patient treated. Moreover, this methodology could be extended to predict other outcomes by using continuous-in-time information from bedside monitors.

Written by David Castiñeira