Insurance Networks and Taxi Medallions

Carl Hall
Published in HealthyHive
Aug 17, 2016

The highest and best use of open data is for the public to benefit through an increase in the underlying living standard. It can come in the form of increased accountability leading to less waste, job creation, or, in our specific use case, more efficient markets (and hopefully, eventually, job creation). Extracting value for mass consumption can be a long process requiring a creative approach.

With over 40 million healthcare claims, you’d think we’d be able to spin together some compelling insights. It is not as straightforward as it should be. Certain data elements, such as the identity of the insurance company, are not included in the public release. It is hard to build price transparency tools if you cannot link claims to an insurer…

“Organic Data”

HealthyHive.com’s medical claims data comes from New Hampshire’s Comprehensive Health Information System. Insurance companies are required to send claims to a third-party company, which standardizes them for submission to the state database. (Well, except for claims from self-funded plans. See the Gobeille post: https://sunlightfoundation.com/blog/2016/05/03/opengov-voices-will-a-recent-court-decision-jeopardize-open-data-in-health-care/)

We joke that our data is “organic data” or, better yet, “farm to table data” because we just receive raw claims. And we like making fun of jerks who judge us for eating inorganic quarter pounders with cheese. When you struggle with datasets as large and confusing as ours, you develop a nervous laugh and start to think the “farm to table data” moniker is actually funny. (Don’t try this at home, kids.)

Where’s the Beef?

Insights are slowly emerging. A predictive chart for those wondering if they should elect a high deductible or fully-insured plan based on gender and age:

Crack the Code: open data + open source software = disruption

Fortunately, the state provides a supplemental file that links each insurance company to its reimbursement levels at the 25th percentile, median, and 75th percentile for over 400 common procedures. This turns out to be pretty valuable.

As a first step, we match claims to an insurance company when the reimbursement amounts match those in the supplemental file. Stated differently, we link claims that match the reimbursement rate at the 25th percentile, median, or 75th percentile to one of the three largest private health insurers in NH. It turns out three insurance companies more or less cover the waterfront in NH, given the lack of insurer competition in the state.
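To make the matching step concrete, here is a minimal sketch in Python. It is not our production code: the insurer names, procedure code, dollar amounts, and field names are all made up for illustration, and the rule for ambiguous matches (leave the claim unlabeled when more than one insurer publishes the same amount) is an assumption.

```python
from decimal import Decimal

# Hypothetical supplemental-file rows (all values invented for illustration):
# (procedure_code, insurer, 25th percentile, median, 75th percentile)
SUPPLEMENT = [
    ("PROC1", "Insurer A", Decimal("610.00"), Decimal("820.00"), Decimal("1150.00")),
    ("PROC1", "Insurer B", Decimal("575.50"), Decimal("790.25"), Decimal("1010.00")),
    ("PROC1", "Insurer C", Decimal("640.00"), Decimal("820.00"), Decimal("1200.00")),
]


def build_lookup(rows):
    """Map (procedure_code, amount) to the set of insurers publishing that amount."""
    lookup = {}
    for code, insurer, *breakpoints in rows:
        for amount in breakpoints:
            lookup.setdefault((code, amount), set()).add(insurer)
    return lookup


def match_insurer(claim, lookup):
    """Return the insurer only when exactly one insurer matches, else None."""
    key = (claim["procedure_code"], claim["paid_amount"])
    candidates = lookup.get(key, set())
    return next(iter(candidates)) if len(candidates) == 1 else None


LOOKUP = build_lookup(SUPPLEMENT)

# Hypothetical raw claims with no insurer attached.
claims = [
    {"procedure_code": "PROC1", "paid_amount": Decimal("790.25")},  # only Insurer B
    {"procedure_code": "PROC1", "paid_amount": Decimal("820.00")},  # A and C: ambiguous
    {"procedure_code": "PROC1", "paid_amount": Decimal("700.00")},  # no breakpoint match
]

labels = [match_insurer(claim, LOOKUP) for claim in claims]
print(labels)  # ['Insurer B', None, None]
```

Amounts are kept as `Decimal` so that exact dollar-and-cent comparisons are not thrown off by floating-point rounding. Claims that come back as `None` are the ones left for the later modeling step.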

After linking a portion of claims to payors, the fun part begins. Being based in Boston means we have access to some of the best and brightest when it comes to expertise in machine learning algorithms and clustering approaches. We lean on that resource, to say the least. For content marketing we also use R software packages. R is a powerful, open-source statistical programming language that can be used to gather, manipulate and visualize data. The amazing thing about open source software is that “wicked smart” people from around the world contribute all sorts of amazing software packages. As luck would have it, one of the programmers who wrote a machine learning algorithm for R is also in Boston. The author was even kind enough to offer his time to verify our application of the package.

The end result is we are able to approximate to the best of our ability the costs of over 400 common procedural healthcare services based on insurer and hospital. It turns out all this hard work can really highlight significant cost differences across hospitals at the plan level. And we are using free data and free software.
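For a feel of what that comparison looks like once claims carry an insurer label, here is a rough Python sketch. The claims, hospital names, and the choice of the median as the summary statistic are assumptions for illustration only, not a description of our actual pipeline.

```python
from collections import defaultdict
from statistics import median

# Hypothetical labeled claims: (procedure, insurer, hospital, paid_amount).
CLAIMS = [
    ("MRI", "Insurer A", "Hospital X", 2200),
    ("MRI", "Insurer A", "Hospital X", 2100),
    ("MRI", "Insurer A", "Hospital X", 2300),
    ("MRI", "Insurer A", "Hospital Y", 800),
    ("MRI", "Insurer A", "Hospital Y", 850),
    ("MRI", "Insurer A", "Hospital Y", 750),
]


def median_cost_by_hospital(claims):
    """Median paid amount for each (procedure, insurer, hospital) group."""
    groups = defaultdict(list)
    for procedure, insurer, hospital, paid in claims:
        groups[(procedure, insurer, hospital)].append(paid)
    return {key: median(values) for key, values in groups.items()}


def hospital_spread(medians, procedure, insurer):
    """Cheapest hospital, priciest hospital, and the gap between their medians."""
    costs = {
        hospital: cost
        for (proc, ins, hospital), cost in medians.items()
        if proc == procedure and ins == insurer
    }
    cheapest = min(costs, key=costs.get)
    priciest = max(costs, key=costs.get)
    return cheapest, priciest, costs[priciest] - costs[cheapest]


medians = median_cost_by_hospital(CLAIMS)
print(hospital_spread(medians, "MRI", "Insurer A"))
# ('Hospital Y', 'Hospital X', 1400)
```

The median is used here because a handful of unusually large claims can drag an average around; any robust summary would do for the sketch.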

I guess some people say we are producing Smart Data. I have no idea what that particular buzzword means, but it’s just as obnoxious as “farm to table data”, so let’s go with it. It sounds good too.

The truth is we stand on the shoulders of giants from the open source community in an attempt to create actionable insights for consumers looking to mitigate their misery when it comes to healthcare.

Three Yards and a Cloud of Dust

The famous football coach Woody Hayes coined the term “three yards and a cloud of dust” to describe a running game built on grinding down the field slowly but steadily. The term perfectly describes our experience in attempting to add value using open data. A running game in football also requires a lot of teamwork. By far, the most rewarding aspect of our journey is learning that we have teammates we never knew existed in the open data and open source communities. It is humbling to think how generous people can be.

Life in the startup world is not the glamorous passing game with a star quarterback and a superhuman receiver. It is a grind-it-out, grueling and dusty running game. It takes YEARS and the occasional false start. But it’s the teammates you meet that make the experience worth it. Regardless of how our project pans out, life in the trenches can be quite rewarding.

What about Insurance Networks and Taxi Medallions?

Not long ago, people paid a lot of money for taxi medallions. The monopoly never benefitted consumers. We had no other option. Taking taxis sucked. We now know how that panned out: a creative software app, coupled with powerful network effects and a dash of capital, has brought the taxi industry to its knees.

You know what else sucks? Not knowing how much your healthcare will cost before you receive care when you have a $3,000 deductible. It sucks even more when you find out after the fact that your $2,200 MRI could have cost you $800 three miles down the street.

Oh, and that $1,400 in savings could have gone into a health savings account, where ten years at a 5% annual rate of return would grow it to about $2,280. That $2,280 could finance a root canal once you hit retirement and enter the Medicare system, where dental isn’t covered. That scenario does not suck. Some would call that progress. We call it financial wellness.
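For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes the 5% return compounds once a year and ignores fees, taxes, and any further contributions; the function name is just for illustration.

```python
def future_value(principal, annual_rate, years):
    """Value of a lump sum compounded once per year (no fees or taxes assumed)."""
    return principal * (1 + annual_rate) ** years


# The $2,200 MRI versus the $800 MRI three miles down the street.
savings = 2200 - 800

value = future_value(savings, 0.05, 10)
print(savings)       # 1400
print(round(value))  # 2280
```

Swap in your own rate or time horizon to see how much the price gap is worth to you.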
