Use of Neighborhood-Level Socioeconomic Status to Predict Patient-Level Health Care Utilization, Mortality

November 05, 2020

By Julie Gould

According to findings published online in JAMA Network Open, the use of neighborhood-level socioeconomic status indicators did not improve the patient-level prediction of one-year health care utilization or mortality across a variety of scenarios.

“Prediction models are widely used in health care as a way of risk stratifying populations for targeted intervention. Most risk stratification has been done using a small number of predictors from insurance claims,” wrote the researchers of the study.  

For this study, the researchers built machine learning models of these outcomes with and without socioeconomic variables and compared their predictive performance. To better understand the work, we spoke with Alejandro Schuler, PhD, statistician, Unlearn.AI. Despite the study's findings, Dr Schuler explains that socioeconomics play a critical role in how health care is used and dispensed in the United States, and that disparities must be remedied to improve long-term population health.

What existing data led you and your co-investigators to conduct this research?

This research spun out of a project created to better understand and forecast population-level shifts in health care utilization. We created prediction models for health care utilization based on a variety of patient-level data available at Kaiser as well as some socioeconomic population data pulled from external sources. This work was of operational value to the organization, but what was more novel from a scientific perspective was our use of the socioeconomic data alongside patient-level predictors for health care utilization.  

Please briefly describe your study and its findings. Were any of the outcomes particularly surprising?

Our study found that neighborhood-level socioeconomic status indicators did not improve the patient-level prediction of one-year health care utilization or mortality across a variety of scenarios. We determined this by building machine learning models of these outcomes with and without the socioeconomic variables and comparing predictive performance. The findings are interesting because a number of prior studies have found that socioeconomic variables are statistically associated with many clinical outcomes. Our study’s finding appears to be at odds with this, but it is not. First, most of the prior work focuses on statistical significance, not on predictive performance. In other words, socioeconomic variables may be detectably associated with clinical outcomes, but their effect on the ability to predict a patient’s future may be very small in practice. Second, most prior work has focused on long-term clinical outcomes, whereas our work is focused on short-term (one-year) operational outcomes. It stands to reason that socioeconomic factors have long-term health effects but are less useful for predicting how a patient will rely on the health care system in the next year.
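The with-and-without comparison described above can be illustrated with a minimal Python sketch. This is not the study's code or data: the simulated cohort, the variable names (`prior_use`, `ses`), the effect sizes, and the hand-written logistic regression are all assumptions, chosen only to show how a variable can be truly associated with an outcome yet add almost nothing to predictive performance as measured by AUC on held-out patients.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=200):
    """Logistic regression with an intercept, fit by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney); assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(r for r, i in enumerate(order, start=1) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Simulated cohort (assumption): a strong patient-level predictor and a
# weak but real neighborhood-level socioeconomic effect on the outcome.
random.seed(0)
rows = []
for _ in range(2000):
    prior_use = random.gauss(0, 1)   # hypothetical patient-level predictor
    ses = random.gauss(0, 1)         # hypothetical neighborhood SES index
    p = sigmoid(-1.0 + 1.5 * prior_use + 0.1 * ses)
    rows.append((prior_use, ses, 1 if random.random() < p else 0))

train, test = rows[:1000], rows[1000:]
y_train = [r[2] for r in train]
y_test = [r[2] for r in test]

# Model without the socioeconomic variable.
w0, b0 = fit_logistic([[r[0]] for r in train], y_train)
auc_base = auc(predict([[r[0]] for r in test], w0, b0), y_test)

# Model with the socioeconomic variable added.
w1, b1 = fit_logistic([[r[0], r[1]] for r in train], y_train)
auc_full = auc(predict([[r[0], r[1]] for r in test], w1, b1), y_test)

print(f"AUC without SES: {auc_base:.3f}")
print(f"AUC with SES:    {auc_full:.3f}")
```

In a setup like this one, the socioeconomic variable has a real effect on the outcome by construction, so a large enough sample would show a statistically significant association, yet the two held-out AUC values are expected to be nearly identical.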

What are the possible real-world applications of these findings in clinical practice?

The real-world implication of this work is that anyone modeling short-term health care utilization should not worry too much about obtaining data on patient socioeconomic status, because it is not likely to improve the predictive performance of the model.

Do you and your co-investigators intend to expand upon this research?

There is a perennial interest in both short-term utilization modeling and understanding the role that socioeconomics plays in health care. However, our particular contribution tying these two together is self-contained for the time being.

Is there anything else pertaining to your research and findings that you would like to add?

Socioeconomics play a critical role in how health care is used and dispensed in the United States. Although our study found no utility in using socioeconomic variables in our particular use case, it is already obvious that disparities exist and must be politically remedied in order to improve long-term population health. 

About Dr Schuler:

Alejandro Schuler, PhD, received his PhD in Biomedical Informatics from Stanford University, Palo Alto. At the time of this study, he was a data scientist at Kaiser Permanente, Division of Research. He is currently employed as a statistician at Unlearn.AI.


Schuler A, O'Súilleabháin L, Rinetti-Vargas G, et al. Assessment of Value of Neighborhood Socioeconomic Status in Models That Use Electronic Health Record Data to Predict Health Care Use Rates and Mortality. JAMA Netw Open. 2020;3(10):e2017109. Published 2020 Oct 1. doi:10.1001/jamanetworkopen.2020.17109
