We’re coming up on ten months since Transparency in Coverage (TiC) took effect and required insurance companies to reveal the rates they pay doctors and hospitals for all medical and surgical services. These prices must be posted online in a defined schema as Machine-Readable Files (or MRFs, for short) and made available to everyone.
This is the second time that we’ve been through this learning process. Back in January 2021, hospitals were required to post their pricing information, including list prices, cash prices, and contracted reimbursement rates from insurance companies.
Turquoise Health collects both hospital and payer transparency data under one roof. Our engineers and product team work each month to wrangle data from payers and hospitals into a format non-experts can work with. This means we can compare both data sources to learn more about how healthcare pricing works.
One of the most common questions we get asked is this:
“If the payer file says a service costs $100, what does the hospital file say about the same service?”
This sounds like a simple question, but it gets at the heart of the complexity of the healthcare system and how services are priced.
Let me take you on my most recent journey into the nooks and crannies of transparency data and show you how we can answer this question. I’ll even give you some free data!
Comparing Apples to Apples
Before we can compare prices reported separately by hospitals and payers, there are some data-wrangling challenges that we have to solve:
- Linking up providers to their prices in the payer data using their ID numbers
- Identifying which insurance plans belong to the same insurance network
- Assessing data quality and removing bad data and outliers
We already do this work at Turquoise, so our users get to skip these steps. Feel free to jump down to the next section if you just want to see some data. But these steps are worth a quick explanation because they are common stumbling blocks when trying to work with price transparency data.
First, the payer data only identifies a provider by a tax ID number or NPI. It doesn’t include basic information about providers like name, location, type of care provided, etc. So the first step is to link up the raw data from each payer to a database of providers. This makes it possible to query the data by location, type of care provided, etc.
Second, the pricing data reported by the payers is extremely granular. There are separate prices for every single plan offered by each insurance company. Instead of “Aetna HMO rate,” you’ll see specific insurance plans like “A&A Auto Body Company HMO Plan #4345”. By condensing the data back down to higher-level network prices, it makes the data much easier to work with.
On the other hand, hospitals typically report rates at a high level, like “Aetna Commercial HMO.” It takes extra work to map these names to their underlying insurance plans. At Turquoise, we use a mix of machine learning models and rule-based evaluation to identify each insurance product.
Third, we combine data collected from thousands of hospitals and hundreds of separate insurance companies. Mistakes happen, so it’s crucial to check each file, correct errors where possible, and score the quality of each file so that we know which data is trustworthy. This lets us query the data with confidence.
This is just a quick overview of the type of clean-up work that is required to compare data across data sets. We hope to see things get easier over time as CMS updates the price transparency requirements in the future.
Finally, Into the Data!
Let’s compare hospital-reported and payer-report prices! I’ll start by looking at inpatient stays and comparing prices reported by hospitals with prices reported by insurance companies for the same stays.
At Turquoise, both data sets are available in a giant database, so I just need to write SQL queries to pull samples and see where the prices agree. However, one final pricing caveat is critical to mention before we compare any numbers.
Healthcare contracts use a mix of several different pricing strategies to set prices:
- A fixed dollar amount for each service.
- A price defined as a percentage of the hospital’s list price.
- An all-inclusive price based on the type of service provided. These are most common for inpatient stays.
- A fixed per diem amount paid for each day a patient stays in the hospital.
In all of these cases, hospitals typically only report a representative dollar price without additional information to contextualize how the amount was calculated. This may change in the future as CMS decides whether to mandate the new file standard they published in November 2022. Payers report the underlying details of how each rate is set. This means that a hospital may report “$328” and a payer may report “37% of list price,” which may be in total agreement but challenging to compare.
We’ve built our solutions to address these issues. But with these considerations in mind, I thought it best to keep my examples simple. The simplest insurance contracts pay inpatient MS-DRGs at an all-inclusive flat rate and inpatient room and board revenue codes at a per diem rate. So let’s start by comparing those types of rates.
How to Tell When the Data Matches
Hospitals and payers report data on a different schedule (yearly vs. monthly), so sometimes the latest prices reported by one will be slightly different than the other. Because of this, I’ve considered a pair of rates with <0.5% variance between payer and hospital MRFs to be a match.
Also, a hospital may provide more services than an insurance plan covers, and vice versa. To keep things simple, I’ve considered a hospital with matching rates across more than 600 MS-DRG codes in both datasets to be a full rate schedule match.
Free Data!
Let’s see the results. Below is a selection of the data where I found full parity between hospital and payer MRFs. Here's a snippet from Baylor Scott & White Medical Center, Brenham for Aetna Commercial Products:

But the parity party doesn't stop there. View the full list of CSVs here:
Utilizing the same approach and variance tolerance, here are similar examples with per diem rates for St. Joseph Medical Center's Aetna contracts:

Rate parity between hospital and payer files doesn't just stop with Aetna in TX. BCBS of TX is a great place to explore next, as they've posted relatively complete files across the state. Have a look at additional per diem CSVs below:
How the Industry Can Use This Data
Ok, but so what? The hospital and payer rates should match, shouldn’t they?
Well yes, they should match. But differences between the schema of the two data sources and the complexity of the underlying contracts can make comparisons more difficult than they should be.
But these matches are a momentous finding for folks negotiating Managed Care contracts nationwide. Where the payer data and hospital data corroborate one another, mutual trust can be cultivated between insurers and providers to negotiate sensible rates for their agreements.
Nothing in Life is Perfect
Let’s not forget about where the rates don’t align. Not everything is a dead-on match. However, where there are differences between the data sources, we can learn as much as where the data sources are perfectly aligned.
Payer/provider-negotiated contracts commonly include an escalator clause that automatically changes reimbursement by a specific percentage across all codes for a specific care setting after a certain date. Examining cases where the data did not align revealed evidence of these escalator clauses.
I changed the variance tolerance from 0.5% to 7% and looked for cases where 600+ MSDRGs had an identical percentage difference across all rates. In the data below, you can see these matches which are examples of probable escalator clauses in managed care agreements:
What’s on the Horizon
We can learn a lot by comparing payer and hospital rates. There’s a whole lot more to learn from this data. What I covered here scratches the surface. Over the coming weeks, I’ll be diving into more. Let us know at info@turquoise.health what you want to see and we’ll hop in!