Price transparency rules and legislation have made it possible for humble start-ups like yours truly to aggregate over 2B rates. That’s great and all, but realistically the average healthcare organization only needs a slice of that 2B to get actionable information. And to get down to that actionable information, we need to correctly identify which outliers add value and which are just pesky pixels adding to the noise.
To help our users narrow down what could be millions of rate records, we use an outlier detection model to sift through the data when pulling search results. How we define outliers is a hot topic at Turquoise because there is no widely adopted standard. Curious about the three-year evolution of our outlier detection model? Walk with me.
Outliers in Price Transparency Data
Like any large dataset (and even some small ones), price transparency data has outliers. These outliers stem from typos, formatting errors, and inaccurate mapping or billing. Outside of the super obvious, judging what is and isn’t an outlier can be a messy business.
For example, the description of a J-code for a drug may indicate one unit; however, hospitals often have multiple lines for it within the charge description master (CDM), and each CDM line may have a different description and charge amount.
Those cost differentials for the same item can leak into the non-outlier cohort, skewing the overall cost landscape. Healthcare has never had a barometer against which to weigh prices, which adds to the complexity of outlier assessment.
In response to those challenges, our previous outlier detection model took a conservative approach and only removed what would be the obvious-obvious outliers. When someone searched within our rate search engine, we automatically chopped off everything below the 5th percentile and above the 95th percentile. You live by the bell curve, you die by the bell curve.
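For the curious, here’s a minimal sketch of that percentile-trimming approach (assuming a pandas DataFrame of search results; the negotiated_rate column name is illustrative, not our actual schema):

```python
import pandas as pd

def trim_percentiles(rates: pd.DataFrame, col: str = "negotiated_rate") -> pd.DataFrame:
    """Drop rates below the 5th or above the 95th percentile of the whole result set."""
    lower = rates[col].quantile(0.05)
    upper = rates[col].quantile(0.95)
    return rates[rates[col].between(lower, upper)]
```

Simple, symmetric, and totally blind to why a rate sits in the tail in the first place.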
Expanding the outlier model
Now we have a team of razor-sharp data scientists to move beyond the beloved bell curve. The first step we took was to establish an outlier-reduction goal that would apply holistically to both our provider and payer data sets.
We started by slicing the data by different columns and categories. When we looked at the distribution of published rates by cohort, we immediately saw compelling indicators of normal vs. outlier rates once we included geographical groupers in the model. The graph below illustrates the distribution of published rates for commercial, non-professional fees for CPT code 99283 in Kentucky. The orange bars represent rates that we define as outliers.
Comparing these two graphs, we can see that the outlier calculation can trim both tails of the charge distribution.
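A simplified sketch of what a grouped calculation like this can look like (the grouping columns, the IQR-style fences, and the column names below are illustrative assumptions, not our production logic):

```python
import pandas as pd

# Illustrative cohort + geography grouping columns -- not our actual schema.
GROUPERS = ["state", "billing_code", "billing_class", "payer_class"]

def flag_outliers(rates: pd.DataFrame, col: str = "negotiated_rate", k: float = 1.5) -> pd.DataFrame:
    """Flag rates that fall outside Tukey-style IQR fences computed within each group."""
    grouped = rates.groupby(GROUPERS)[col]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    flagged = rates.copy()
    flagged["is_outlier"] = (rates[col] < q1 - k * iqr) | (rates[col] > q3 + k * iqr)
    return flagged
```

The key difference from the old percentile chop: the fences are computed within each cohort-plus-geography group, so a rate is only an outlier relative to its peers.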
At this point, the collective light bulb turned on for me and my team. “We’re really onto something!” they said, with their five extra monitors and dozens of open tabs humming along in the background. The next step was to run the process back and see whether the same pattern held for the same cohort using data from a different state and code.
Looking at the commercial, non-professional fees for the same CPT in Ohio, we can see an instance of a state with high-ish charges for this CPT. Including the state in the outlier calculation releases some higher charges from outlier-hood.
Here are the list-price, non-professional fees for Michigan (same CPT). As you can see, this state has somewhat lower bounds on its highest charges for this CPT. Including the state in the outlier calculation inducts some new charges on the high end into outlier-hood.
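To make that state-by-state difference concrete, here’s one way to surface the per-state fences a grouped model computes (continuing with the same illustrative columns and IQR fences from the sketch above):

```python
import pandas as pd

def state_bounds(rates: pd.DataFrame, billing_code: str, col: str = "negotiated_rate", k: float = 1.5) -> pd.DataFrame:
    """Summarize the lower/upper outlier fences per state for one billing code."""
    subset = rates[rates["billing_code"] == billing_code]
    fences = subset.groupby("state")[col].quantile([0.25, 0.75]).unstack()
    fences.columns = ["q1", "q3"]
    iqr = fences["q3"] - fences["q1"]
    fences["lower_fence"] = fences["q1"] - k * iqr
    fences["upper_fence"] = fences["q3"] + k * iqr
    return fences

# e.g. state_bounds(rates, "99283") -- in the graphs above, Ohio's upper fence sits higher than Michigan's
```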
Officially adding location-based groupings in addition to cohorts
After convening with the team, we agreed that location-based groupings would be the logical next evolution of the model. As a result, our Rate Sense users can include or exclude outliers in their search results, allowing for the most comprehensive picture of cost.
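Conceptually, once every rate carries an outlier flag, the user-facing toggle is just a filter at search time. A hypothetical sketch (not the actual Rate Sense implementation) of how that flag gets applied:

```python
import pandas as pd

def search_rates(rates: pd.DataFrame, billing_code: str, include_outliers: bool = False) -> pd.DataFrame:
    """Hypothetical search filter: return rates for one code, optionally keeping flagged outliers."""
    results = rates[rates["billing_code"] == billing_code]
    if not include_outliers:
        results = results[~results["is_outlier"]]
    return results
```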
As price transparency data matures and oversight and enforcement ramp up, the ability to detect outliers bolsters the validity and accuracy of the data. The more eyes there are looking in the weeds, the better equipped the entire industry is to iterate and improve.
How do you think we should continue to iterate on this model? We’re all ears.