ICU Readmissions

Discriminative predictions using MIMIC III

DEMO

Background

Introduction:
The Affordable Care Act (ACA) put hospitals under pressure to reduce patient readmissions when the Centers for Medicare and Medicaid Services (CMS) announced the Hospital Readmissions Reduction Program (HRRP). The HRRP is a penalty-based program under which Medicare can impose a financial penalty of up to 3% on hospitals. The HRRP went into effect in 2012 and initially applied only to admissions for acute myocardial infarction (AMI), heart failure (HF), and pneumonia (PN) [1].


As the HRRP continues to expand, hospitals are under pressure to understand the underlying causes of readmission and to intervene early. Many hospitals have opted for a high-touch solution involving social workers, case managers, paramedics, home nursing, and more. These resources need to be allocated intelligently; otherwise the hospital's efforts could be wasted. Additionally, discharge is a pivotal point in the patient care cycle. By informing providers at the point of discharge of a patient's likelihood of readmission, based on the level of care, we allow them to apply their own medical decision making to ensure the best patient care experience. By predicting the likelihood of readmission and presenting it to a provider alongside several useful metrics for evaluating patients, we can take first steps toward improving medical decision making and resource allocation, reducing readmission rates, and improving the patient care experience [2,3,4,5].

Methods

This project was completed using the MIMIC-III dataset [6,7,8], available from MIT, which includes ICU admissions data collected between 2002 and 2012 at Beth Israel Deaconess Medical Center (BIDMC). BIDMC is a 651-bed hospital with five ICUs covering surgery, trauma, cardiac, neonatal, and general medicine. BIDMC is served by a level 1 trauma center with roof-top helicopter access. Approximately 54% of patients admitted to the ICU at BIDMC arrive through the emergency department or as transfers from other hospitals, while the remainder are direct admissions from physicians. Note that long-term care facilities likely bring patients to the emergency department who are subsequently admitted.

Readmission distribution by age and length of stay

Admitted From

Discharged To

Distribution of admissions by location

Model & Pipeline

Results & Conclusions

ROC Curves

By shifting the focus from provider explainability to model performance and metrics, we created a model without domain-specific feature engineering that is accepted by providers and is on par with current research on predicting hospital readmission. The model is feasible to implement at hospitals, with most of the work going into bringing data sources together efficiently. We hope to expand this work by partnering with hospitals on a specific EMR platform to turn the proof-of-concept model into a viable product. The model was built so that it can be deployed to any system, with a pipeline that creates the initial model from that system's data for evaluation.


The initial ingestion results in a sizable feature space of approximately 6,000 features. An L1-regularized logistic regression model was used to reduce the data to a smaller, more pertinent feature space. The pipeline supports a list of common algorithms, some with automated hyperparameter tuning, that yield reasonable results (AUC > 0.65): Logistic Regression, Random Forest, Extra Tree Classifier, Gradient Boosting, and XGBoost. In practice, the models trained on the reduced dataset perform approximately as well as those trained on the full dataset, while working with a much less sparse feature space. These models were also combined into an ensemble classifier, which did not improve overall performance and was therefore discarded to avoid overfitting from the larger parameter space. The top-performing models exhibit high specificity and have recall scores between 0.15 and 0.3. This shows that the models are conservative and predict a readmission only when confidence is high, in line with the preferences of the medical providers surveyed.
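To illustrate how an L1 penalty can shrink a feature space, here is a minimal sketch on toy data. It is not the project's pipeline: the source does not say which library or solver was used, so this example swaps in a small pure-Python proximal gradient loop. The names `soft_threshold`, `fit_l1_logistic`, and `l1_strength`, and the three-feature dataset, are all hypothetical. The point it shows is that the L1 penalty drives the weights of uninformative features to exactly zero, so keeping only the non-zero weights yields the reduced feature set.

```python
import math
from itertools import product


def soft_threshold(value, amount):
    """Shrink value toward zero; anything within `amount` of zero becomes 0.0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_logistic(rows, labels, l1_strength=0.05, step=0.5, epochs=300):
    """Hypothetical helper: L1-penalized logistic regression fitted by
    proximal gradient descent (a gradient step, then soft-thresholding)."""
    n_rows, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        bias -= step * grad_b / n_rows  # the bias is not penalized
        weights = [
            soft_threshold(w - step * g / n_rows, step * l1_strength)
            for w, g in zip(weights, grad_w)
        ]
    return weights, bias


# Toy data (an assumption, not MIMIC-III): feature 0 agrees with the label
# in 9 of 10 cases; features 1 and 2 are balanced noise unrelated to the label.
rows, labels = [], []
for y, k, noise_a, noise_b in product((0, 1), range(10), (-1, 1), (-1, 1)):
    signal = 1 if y == 1 else -1
    if k == 0:  # one case in ten carries the "wrong" signal
        signal = -signal
    rows.append((signal, noise_a, noise_b))
    labels.append(y)

weights, bias = fit_l1_logistic(rows, labels)

# Feature selection: keep only the columns whose weight survived the penalty.
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected feature indices:", selected)
```

On real data a library solver would normally replace the hand-written loop. The selection step stays the same: fit with an L1 penalty, then keep the features with non-zero coefficients.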


The tree-based models performed significantly better (AUC > 0.78) than the regression models, which in turn outperformed Naïve Bayes. We attribute this to the tree-based models capturing non-linearities, since logistic regression and Naïve Bayes are both linear models in our implementation. A linear model scores a patient with a weighted sum of the features, so its decision boundary is a flat surface (a line in two dimensions, a hyperplane in general) that cuts the feature space in two; hence the name. Tree-based models branch on one feature after another, which carves the space into many box-shaped regions and lets them capture interactions between features that a single flat boundary cannot. We assume that healthcare and the health of a patient are affected by many non-linear processes, which makes a linear model perform worse than a non-linear one.
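The difference between a linear boundary and a branching model can be seen on the classic XOR pattern, where the label depends on an interaction between two features. This is a hedged sketch on four toy points, not on the project's data. The helper names (`accuracy`, `linear_rule`, `tree_rule`) and the search grid are assumptions made for this example. It searches a grid of linear decision rules and compares the best one against a hand-written depth-2 decision tree.

```python
from itertools import product

# Four toy points: the label is 1 only when exactly one feature is "on".
XOR_POINTS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def accuracy(predict):
    """Fraction of the toy points that a prediction function labels correctly."""
    return sum(predict(x) == y for x, y in XOR_POINTS) / len(XOR_POINTS)


def linear_rule(w1, w2, b):
    """Linear classifier: predict 1 when the weighted sum is positive."""
    return lambda x: int(w1 * x[0] + w2 * x[1] + b > 0)


def tree_rule(x):
    """Hand-written depth-2 tree: split on feature 0, then on feature 1."""
    if x[0] <= 0.5:
        return int(x[1] > 0.5)
    return int(x[1] <= 0.5)


# Try every linear rule with weights and bias on a grid from -2.0 to 2.0.
grid = [i / 2 for i in range(-4, 5)]
best_linear = max(
    accuracy(linear_rule(w1, w2, b)) for w1, w2, b in product(grid, repeat=3)
)
tree_accuracy = accuracy(tree_rule)

print("best linear accuracy:", best_linear)
print("depth-2 tree accuracy:", tree_accuracy)
```

No single line separates the XOR points, so every linear rule misclassifies at least one of the four, while two nested splits label all of them correctly. Whether clinical features interact in this way is the assumption stated above, not something this toy example demonstrates.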

Acknowledgments

We thank Dr. Alex Rudin, Dr. Jason James, Dr. Jason Begue, Winfield Winegar PA, Aaron Kidd PA, and Dr. Tony Briningstool for providing useful feedback on provider and hospital processes either in the ICU or in general.

References

  1. Hospital Readmissions Reduction Program (HRRP).
  2. Aiming for Fewer Hospital U-turns: The Medicare Hospital Readmission Reduction Program.
  3. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community.
  4. LACE+ index: extension of a validated index to predict early death or urgent readmission after hospital discharge using administrative data.
  5. Development and Implementation of a Real-Time 30-Day Readmission Predictive Model.
  6. Johnson, A., Pollard, T., & Mark, R. (2016). MIMIC-III Clinical Database (version 1.4). PhysioNet. https://doi.org/10.13026/C2XW26.
  7. Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035.
  8. Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

Team - UC Berkeley MIDS Capstone Project

Sahab Aslam

Data Scientist, some startup

Jim Chen

Project Manager/Analyst, FedEx

Kyle Hamilton

Chief Innovation and Data Officer, iQ4

Tuhin Mahmud

Software Engineer, IBM

James York-Winegar

Director of IT, American Physician Partners