Predicting the risk of acute care readmissions among rehabilitation inpatients: A machine learning approach


Introduction: Readmission from inpatient rehabilitation facilities to acute care hospitals is a serious problem. This study aims to develop a machine learning model that identifies patients at high risk of readmission. Methods: A retrospective dataset (2001–2017) of 16,902 patients admitted to a large inpatient rehabilitation facility in North Carolina was collected in 2017. Three types of machine learning models with different predictors were compared in 2018. The model with the highest c-statistic was selected as the best model and further tested on five sets of training and validation data split at different time points. The optimum threshold for classification was identified. Results: The logistic regression model using only functional independence measures had the highest validation c-statistic (0.852). Using this model to predict acute care readmissions in the most recent 5 years yielded high discriminative ability (c-statistics: 0.841–0.869). Larger training data yielded better performance on the test data. The default cutoff (0.5) resulted in high specificity (>0.997) but low sensitivity (<0.07). The optimum threshold achieved a balance between sensitivity (0.754–0.867) and specificity (0.747–0.780). Conclusions: This study demonstrates that functional independence measures can be analyzed with machine learning algorithms to predict acute care readmissions and thereby improve the effectiveness of preventive medicine.
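The abstract reports a c-statistic and an "optimum threshold" but does not say how that threshold was chosen. The sketch below illustrates one common option, maximizing Youden's J (sensitivity + specificity - 1), which is an assumption here and not necessarily the authors' method. The labels and scores are invented toy values, not study data, and the function names are hypothetical. The example shows how a rare outcome can produce high specificity and low sensitivity at the default 0.5 cutoff, and how a lower cutoff rebalances the two.

```python
from typing import List, Tuple

# Toy data (invented for illustration): 1 = readmitted, 0 = not readmitted.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.20, 0.15, 0.30, 0.35, 0.60]


def c_statistic(labels: List[int], probs: List[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec(labels: List[int], probs: List[float], cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when predicting positive for prob >= cutoff."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= cutoff
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels: List[int], probs: List[float]) -> float:
    """Cutoff among the observed scores that maximizes Youden's J (assumed criterion)."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(probs)):
        sens, spec = sens_spec(labels, probs, cutoff)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


auc = c_statistic(y_true, scores)
default_sens, default_spec = sens_spec(y_true, scores, 0.5)
tuned_cutoff = youden_threshold(y_true, scores)
tuned_sens, tuned_spec = sens_spec(y_true, scores, tuned_cutoff)

print(f"c-statistic: {auc:.3f}")
print(f"cutoff 0.5: sensitivity={default_sens:.3f}, specificity={default_spec:.3f}")
print(f"cutoff {tuned_cutoff}: sensitivity={tuned_sens:.3f}, specificity={tuned_spec:.3f}")
```

In practice the cutoff would be chosen on training or validation data and then applied unchanged to held-out data; other criteria, such as a cost-weighted rule, may suit a clinical setting better than Youden's J.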

Publication Title

Journal of Biomedical Informatics