Let me guess—you’re an actuary who loves a clean claim file. So was I.

Give me historical utilization, final dollar amounts, and a few well-behaved CSVs, and I’m a happy camper. But then came healthcare price transparency data, and suddenly, I was staring at something that looked more like raw code than cost data. And I thought: Is this worth the trouble?

Spoiler alert: It is. You just need to flip your mental model.

Claims Data vs. Price Transparency Data: Flipping Your Mindset

In claims data, you get the answer—how much something actually cost. With publicly available price transparency data, you get the formula. For example, instead of seeing “$6,000 for inpatient care,” you’ll see something like per diem reimbursement, code-level pricing, and modifiers or place-of-service indicators. It’s not plug-and-play. But once you learn how to decode it, you unlock a view of negotiated rates by carrier, provider, and service that claims data simply can’t give you.

What’s Actually Inside an MRF?

Machine-readable files (MRFs), the format in which all raw price transparency data is published, are loaded with information, some of it familiar and much of it not. MRFs often express rates as contract terms such as “125% of charges,” but as an actuary, you want dollars, not managed care contract language. Place of service, billing code modifiers, and provider types all affect reimbursement, yet most actuaries have never been trained to work with them. You’ll also encounter both organizational and individual National Provider Identifiers (NPIs), and sometimes an Employer Identification Number (EIN) instead. Matching price transparency data to claims? Good luck, unless someone’s already done the detective work.
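To make the "formula, not answer" idea concrete, here is a minimal Python sketch of turning a contract term into an estimated dollar amount. The `negotiated_type` labels follow the values I understand the carrier in-network schema to use; the `RateTerm` class, the `estimate_allowed` helper, and every number below are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RateTerm:
    negotiated_type: str    # e.g. "negotiated", "fee schedule", "per diem", "percentage"
    negotiated_rate: float  # dollars, dollars per day, or a percent, depending on type


def estimate_allowed(term, billed_charge=None, days=None):
    """Translate a contract term into an estimated allowed amount in dollars."""
    if term.negotiated_type in ("negotiated", "fee schedule", "derived"):
        # Already a dollar amount for the service.
        return term.negotiated_rate
    if term.negotiated_type == "per diem":
        if days is None:
            raise ValueError("per diem terms need a length of stay")
        return term.negotiated_rate * days
    if term.negotiated_type == "percentage":
        if billed_charge is None:
            raise ValueError("percentage terms need a billed charge")
        return billed_charge * term.negotiated_rate / 100.0
    raise ValueError(f"unrecognized negotiated_type: {term.negotiated_type}")


# Illustrative inputs only: a 3-day stay at $2,000 per day,
# and "125% of charges" applied to an assumed $4,800 billed charge.
per_diem = estimate_allowed(RateTerm("per diem", 2000.0), days=3)
pct_of_charges = estimate_allowed(RateTerm("percentage", 125.0), billed_charge=4800.0)
```

Notice that the per diem and percentage terms cannot be priced from the MRF alone. You have to bring an assumption about length of stay or billed charges, which is exactly where your claims experience comes back in.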

And just to keep things interesting, price transparency data comes from two very different regulatory-mandated sources. Carrier data shows negotiated rates, often with sparse actuarial context. Hospital data includes billed charges and list prices but uses a completely different schema. Reconciling the two is like translating two dialects of the same language without a shared dictionary.

So, why use price transparency data at all? Because price transparency data lets you see forward. Claims data is backward-looking—it shows what’s been paid. Price transparency data shows what has been agreed upon between carriers and providers down to the service level. It’s granular, predictive, and lets you compare real rates across regions, categories, and networks. If your job involves projecting costs, benchmarking networks, or informing negotiations, this is your new regulatory-backed power tool.

Technical Steps to Making Price Transparency Data Usable

Raw price transparency data is messy. Making it usable requires a few essential steps. First, you need to standardize hospital and carrier data into a single structure—ideally one price per code, per provider, per carrier. Without this, you're stuck doing manual reconciliation across inconsistent schemas.
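As a rough illustration of what "one price per code, per provider, per carrier" can look like, here is a small Python sketch. The input field names (`npi`, `hospital_npi`, `payer_name`, `payer_specific_charge`, and so on) are simplified, hypothetical stand-ins rather than the real MRF schemas, and collapsing duplicates with the median is just one possible choice.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class PriceKey:
    billing_code: str
    provider_id: str  # assumed here to be an organizational NPI
    carrier: str


def normalize_carrier(name):
    # Naive normalization; real payer names usually need a curated crosswalk.
    return name.strip().upper()


def from_carrier(row):
    key = PriceKey(row["billing_code"], row["npi"], normalize_carrier(row["carrier"]))
    return key, float(row["negotiated_rate"])


def from_hospital(row):
    key = PriceKey(row["code"], row["hospital_npi"], normalize_carrier(row["payer_name"]))
    return key, float(row["payer_specific_charge"])


def standardize(carrier_rows, hospital_rows):
    """Merge both sources into a single {PriceKey: price} mapping."""
    buckets = defaultdict(list)
    for row in carrier_rows:
        key, price = from_carrier(row)
        buckets[key].append(price)
    for row in hospital_rows:
        key, price = from_hospital(row)
        buckets[key].append(price)
    # Collapse duplicates so each key carries exactly one price.
    return {key: median(prices) for key, prices in buckets.items()}


carrier_rows = [
    {"billing_code": "27130", "npi": "1234567893", "carrier": "Carrier A", "negotiated_rate": "1500.00"},
    {"billing_code": "27130", "npi": "1234567893", "carrier": "Carrier A", "negotiated_rate": "1700.00"},
]
hospital_rows = [
    {"code": "27130", "hospital_npi": "1234567893", "payer_name": "carrier a ", "payer_specific_charge": "1600.00"},
]
prices = standardize(carrier_rows, hospital_rows)
```

The key design choice is the frozen `PriceKey`: once both sources map onto the same key, every downstream comparison becomes a dictionary lookup instead of a schema-by-schema reconciliation.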

Next, layering in Medicare pricing can provide a valuable benchmark. But calculating Medicare rates isn’t as simple as pulling a national average. It requires understanding regional adjustments, provider types, and the nuances of how each code is reimbursed. With the right context, you can start to make comparisons like, “This service is priced at 150% of Medicare in this region,” with confidence.
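Here is a minimal sketch of that benchmark for professional services, using the general structure of the Medicare Physician Fee Schedule: relative value units (RVUs) adjusted by geographic practice cost indices (GPCIs) and multiplied by a conversion factor. All of the RVU, GPCI, and conversion factor values below are made up for illustration, and facility payment systems work differently, so treat this as a starting point rather than a full Medicare pricer.

```python
def mpfs_amount(work_rvu, pe_rvu, mp_rvu, work_gpci, pe_gpci, mp_gpci, conversion_factor):
    """Geographically adjusted physician fee schedule amount in dollars."""
    adjusted_rvus = (
        work_rvu * work_gpci   # physician work
        + pe_rvu * pe_gpci     # practice expense
        + mp_rvu * mp_gpci     # malpractice
    )
    return adjusted_rvus * conversion_factor


def percent_of_medicare(negotiated_rate, medicare_amount):
    """Express a negotiated rate as a percentage of the Medicare benchmark."""
    return 100.0 * negotiated_rate / medicare_amount


# Hypothetical inputs: these are not real RVUs, GPCIs, or a real conversion factor.
medicare = mpfs_amount(
    work_rvu=2.0, pe_rvu=1.5, mp_rvu=0.5,
    work_gpci=1.0, pe_gpci=1.2, mp_gpci=0.8,
    conversion_factor=32.0,
)
pct = percent_of_medicare(negotiated_rate=201.6, medicare_amount=medicare)
print(f"Medicare benchmark: ${medicare:.2f}, negotiated rate: {pct:.0f}% of Medicare")
```

Because the GPCIs differ by locality, the same negotiated rate can land at a different percent of Medicare in two regions, which is why a national average is not enough.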

There are also additional enhancements that make price transparency data more usable:

  • Facility vs. non-facility mapping helps identify where care is being delivered—whether at a hospital, ambulatory surgery center, or provider group.
  • Clinical categorization allows codes to be grouped into logical medical areas (e.g., oncology, cardiology, musculoskeletal), which doesn’t come standard in MRFs.
  • Billing type mapping distinguishes between inpatient, outpatient, and professional claims—key distinctions that aren’t included by default but are often necessary for accurate analysis.
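A simple way to picture these enhancements is as lookups applied to each record. The sketch below assumes each record already carries a billing code, a place-of-service code, and a billing class; the lookup tables are tiny hypothetical samples, and the billing type rule is a deliberately simplified rule of thumb.

```python
# Hypothetical reference tables; real ones would come from maintained crosswalks.
CLINICAL_CATEGORY = {
    "27130": "musculoskeletal",  # total hip arthroplasty
    "59400": "maternity",        # routine obstetric care with vaginal delivery
}

# Illustrative subset of place-of-service codes treated as facility settings:
# 21 = inpatient hospital, 22 = on-campus outpatient hospital, 24 = ambulatory surgical center.
FACILITY_POS = {"21", "22", "24"}


def billing_type(billing_class, place_of_service):
    """Simplified rule of thumb, not a complete classification."""
    if billing_class == "professional":
        return "professional"
    return "inpatient" if place_of_service == "21" else "outpatient"


def enrich(record):
    """Return a copy of the record with three added context fields."""
    pos = record.get("place_of_service")
    return {
        **record,
        "is_facility": pos in FACILITY_POS,
        "clinical_category": CLINICAL_CATEGORY.get(record["billing_code"], "uncategorized"),
        "billing_type": billing_type(record["billing_class"], pos),
    }


hip = enrich({"billing_code": "27130", "place_of_service": "21", "billing_class": "institutional"})
visit = enrich({"billing_code": "99213", "place_of_service": "11", "billing_class": "professional"})
```

Codes missing from the lookup fall into an explicit "uncategorized" bucket rather than being dropped, so you can see how much of the file your reference data actually covers.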

Once these enhancements are in place, the next step is to address the inevitable imperfections that come with real-world data—like outliers, missing values, and inconsistent entries.

Amending the Mess: Outliers, Imputations, and Confidence

No dataset is perfect. Some values are missing, some look like typos, and others are just weird. To clean things up, I suggest you:

  • Use contextual rules to flag and filter outliers rather than applying a blanket threshold.
  • Fill in gaps with a stepwise approach to imputation, starting with the most relevant matches before expanding to broader comparisons.
  • Apply confidence scores to help quantify uncertainty, especially when decisions hinge on the quality of individual data points.
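The three ideas above can be sketched in a few lines of Python. This is one possible interpretation, not a prescribed method: the outlier rule compares each price to the median for its own billing code, the imputation walks from narrow to broad matches, and the confidence weights (1.0, 0.7, 0.4) and ratio bounds are arbitrary placeholders I chose for illustration.

```python
from statistics import median


def flag_outliers(prices_by_code, low=0.2, high=5.0):
    """Flag prices far from the median of their own billing code."""
    flags = {}
    for code, prices in prices_by_code.items():
        mid = median(prices)
        flags[code] = [p < low * mid or p > high * mid for p in prices]
    return flags


def impute(code, provider, region, observed):
    """Return (price, confidence), trying the most specific match first."""
    tiers = [
        (lambda r: r["code"] == code and r["provider"] == provider, 1.0),
        (lambda r: r["code"] == code and r["region"] == region, 0.7),
        (lambda r: r["code"] == code, 0.4),
    ]
    for matches, confidence in tiers:
        prices = [r["price"] for r in observed if matches(r)]
        if prices:
            return median(prices), confidence
    return None, 0.0  # nothing comparable was observed


# Illustrative prices: $9,000 is suspicious for an office visit
# but would be unremarkable for a hip replacement.
flags = flag_outliers({
    "99213": [80.0, 95.0, 90.0, 9000.0],
    "27130": [15000.0, 16000.0, 17000.0],
})

observed = [
    {"code": "99213", "provider": "A", "region": "Dallas", "price": 90.0},
    {"code": "99213", "provider": "B", "region": "Dallas", "price": 100.0},
    {"code": "99213", "provider": "C", "region": "Houston", "price": 120.0},
]
regional = impute("99213", "D", "Dallas", observed)
national = impute("99213", "E", "Austin", observed)
```

A blanket dollar threshold would either miss the $9,000 office visit or wrongly flag every hip replacement, which is the argument for contextual rules. The confidence value travels with each imputed price so downstream users can decide how much weight to give it.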

You're almost there. At this stage, the data starts to resemble the kind of resource actuaries can truly rely on—structured, consistent, and referenceable. But to unlock its full analytical value, you need to go one step further and bring in the contextual cues that transform numbers into meaningful comparisons.

And finally, adding context back into the mix

Once you've cleaned the data and handled the messier elements, it’s time to round out the dataset with the context that makes it truly usable. MRFs don’t include essential classification details like clinical categories, billing types, or facility indicators, yet these distinctions are critical if you want to answer the kinds of questions actuaries are asked every day. Whether you're comparing inpatient versus outpatient care, or grouping services into categories like oncology or orthopedics, you'll need reference data to map raw codes into meaningful labels. That also includes aligning billing codes with accurate provider and payer identifiers. Mismatches in NPIs, EINs, or the use of organizational vs. individual identifiers can easily throw off your results. On top of that, knowing which fields in the MRFs represent negotiated rates versus billed charges is essential for any kind of valid, comparative analysis. When that context is in place, price transparency data becomes a strategic asset.
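One small, concrete check on identifiers is validating NPIs before you try to join on them. As I understand the NPI standard, the last digit is a Luhn check digit computed over the number with the prefix 80840 prepended. The `is_valid_npi` helper below is my own sketch, and it only catches malformed values; it cannot tell you whether an NPI is organizational or individual.

```python
def is_valid_npi(npi: str) -> bool:
    """Check the format and Luhn check digit of a 10-digit NPI."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left, doubling every second digit,
    # starting with the digit just left of the check digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # a commonly used example NPI
print(is_valid_npi("1234567890"))  # same digits with a wrong check digit
```

Running a check like this before matching MRF records to claims helps separate true mismatches from simple data entry errors.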

How to get started: just ask a question

You don’t need to master the whole dataset on day one. The best way to begin is to ask a simple, straightforward question you already care about, and go from there. How do three carriers compare for childbirth costs in Dallas? What’s the price range for hip replacements across ZIP codes? Which provider group in LA has the best rates for oncology services? Run a few queries. Compare a few codes. Look at carrier variation. You’ll start seeing patterns—and opportunities.

Here are a few prompts I’ve found to be good starting points:

  • Compare three or four carriers in a specific market (e.g., Dallas).
  • Analyze the cost of delivering a baby within a given region.
  • Benchmark carrier prices for a particular service category.
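If your data is already standardized, a first carrier comparison can be very small. The sketch below assumes a flat list of records with hypothetical `carrier`, `market`, `billing_code`, and `price` fields; the carriers and prices are invented purely to show the shape of the output.

```python
from collections import defaultdict
from statistics import median


def compare_carriers(rows, billing_code, market):
    """Summarize prices per carrier for one code in one market, cheapest median first."""
    by_carrier = defaultdict(list)
    for row in rows:
        if row["billing_code"] == billing_code and row["market"] == market:
            by_carrier[row["carrier"]].append(row["price"])
    summary = {
        carrier: {"n": len(p), "median": median(p), "min": min(p), "max": max(p)}
        for carrier, p in by_carrier.items()
    }
    return dict(sorted(summary.items(), key=lambda item: item[1]["median"]))


# Invented sample data for illustration only.
rows = [
    {"carrier": "Carrier A", "market": "Dallas", "billing_code": "59400", "price": 4000.0},
    {"carrier": "Carrier A", "market": "Dallas", "billing_code": "59400", "price": 4400.0},
    {"carrier": "Carrier B", "market": "Dallas", "billing_code": "59400", "price": 3500.0},
    {"carrier": "Carrier B", "market": "Dallas", "billing_code": "59400", "price": 3900.0},
    {"carrier": "Carrier C", "market": "Dallas", "billing_code": "59400", "price": 5000.0},
    {"carrier": "Carrier A", "market": "Houston", "billing_code": "59400", "price": 1000.0},
    {"carrier": "Carrier A", "market": "Dallas", "billing_code": "99213", "price": 100.0},
]
result = compare_carriers(rows, billing_code="59400", market="Dallas")
for carrier, stats in result.items():
    print(carrier, stats)
```

Reporting the count, minimum, and maximum alongside the median keeps the comparison honest: a carrier with one posted rate should not be read the same way as a carrier with hundreds.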

But don’t stop there. Once you get a feel for the data, you can use it to answer bigger, more strategic questions:

Understand differences in network pricing

  • Should I offer a regional plan in certain geographies because the local carrier has a stronger network?
  • Does my current network fit the needs of my population?
  • Can I save money by switching carriers or narrowing my networks?
  • Which carrier performs the best in my market?
  • What is the impact of implementing a narrow network plan?

Explore direct contracting with providers

  • Who are the high-volume providers in my claims history?
  • What am I paying now vs. what other networks pay?
  • Is there a point solution that can address high-cost care categories where my current network isn’t competitive?

Steer employees toward lower-cost care

  • Which providers are priced favorably enough that navigation vendors should steer care toward them?
  • Which cost-effective sites of care should clinics and care managers be directing their patients to?
  • How should employers design member communication campaigns based on actual price variation?
  • Which tiered benefit structures are the most cost effective?

Price transparency data isn’t perfect, but it doesn’t need to be in order to be useful. With the right foundation (clean identifiers, reference mappings, and contextual enhancements), it becomes a powerful tool for actuaries looking to drive smarter decisions. Whether you're comparing networks, modeling plan design, or looking for opportunities to reduce spend, this data offers a forward-looking lens into negotiated rates and provider behavior. What matters most is taking the time to understand the data’s structure, limitations, and potential.

Want some extra brain power to come up with strategic prompts? Looking for some extra help diving into this data? Give us a shout!