Politics as a Data Problem
Mike Konczal’s recent post on the GOP healthcare bill debacle ends with an interesting observation:
How did Republicans end up in this position where ideas that should function as a railing and guide end up speaking to nobody? McKay Coppins wrote that recent changes have led to “a caucus full of conservatives with excellent ratings from the Heritage Foundation, and no idea how to whip a vote” in Congress. The DC conservative policy apparatus has followed a similar path. It has also become accountable only to itself, ideological donors, polarizing media, and a race against its own extreme instincts. It’s the dynamic David Frum diagnosed in his classic Waterloo essay, but among the intellectual class as well.
What exactly does it mean to “whip” votes anyway? Historically, whipping has meant two distinct but connected activities: counting the votes and trying to add to your totals through negotiation. The first part has not been forgotten; indeed, people are better than ever at counting the votes, precisely because how people vote in Congress has become more predictable than ever. The second part, however, has increasingly become a lost art.
To appreciate this, we need to step back from politics. Why do we negotiate and bargain, in any setting, instead of just paying the price and buying the thing without the hassle? You can simply pay the price when the value is obvious and predictable. In other words, ease of transaction comes with commodification, and commodification is, broadly speaking, the result of data analytics. Something is a “commodity” if its value is clearly defined and predictable, without much noise. This, in turn, is a consequence of the process that generates the data: think of the Industrial Revolution. Mass production, coupled with ever-improving “quality control” (again, defined broadly), meant that the products being transacted were increasingly identical. As the famous saying has it, all Ford Model T’s were identical, right down to their all being black. As I’ve heard from several acquaintances, albeit in slightly different terms: all data not pertaining to the goal is noise, and all noise is waste. (I’m not referring to anyone specific, as I run into this worldview often enough, although hearing it expressed bluntly earlier today has kept the idea fresh in my mind, I will admit.) Of course, with so little uncertainty, the “correct” price is easily defined and there is no need to negotiate.
One problem is that this worldview also shapes the demand for data. Complex, unpredictable, and noisy data, and the processes that create them, are shunned. Analysts are compensated for how well they can predict. Why waste time on noisy, complex business that generates little profit? Why waste time bargaining over items whose value is completely obvious and predictable? Or, in the context of politics, why negotiate and bargain over votes when you already know that members will vote their party and/or ideology, and you know what currency “matters” to them?
Note that this has a multiplier effect: it shapes not just the bargaining over individual items but also the process that generates the data in the first place. Members of Congress “know” that most voters vote party and, to a lesser extent, ideology, and creating an alternate channel of information that “beats” their party/ideological labels does not carry a great return on investment. When faced with other demands on their time (and periodic redistricting), they spend less time developing reputations beyond partisanship and ideology. But without electoral assets besides partisanship and ideology, they have nothing to negotiate with. Their votes are easy, clearly defined, and commodified. The first dimension of “whipping” continues apace, greatly eased by the lack of any need even to talk to the members of Congress whose votes are being bargained over. And in this case, the whole thing was a no-go from the beginning: either ideology or partisanship (notice the word “either”) was enough to keep Republican votes from becoming high variance.
The point here is not about “politics” but about the data environment. If we define the value of data in terms of its predictability and treat noise as an enemy and a nuisance, we will systematically select the sample in favor of predictable data, biasing the market, regardless of what the reality looks like. We wind up with a view of reality that is much more orderly than it really is, thinking we can predict things that may be more complex than we suppose.

The economics of information in markets, the so-called wisdom of crowds, is rich in this history. The idea that markets, i.e. the “crowds,” are wiser than any individual is an old one and, for the most part, accurate. But as Grossman and Stiglitz famously pointed out, in the 1970s no less (and as Keynes alluded to in the 1930s!), this leads to a paradox: prices and other market-wide information are cheap and easy to process, while learning something on your own is not. Simply following the market always beats doing your own homework in the cost-benefit calculation. Furthermore, with everyone doing the same thing, the information provided by the market is far less noisy than the reality. If people with their own private sources of information, distributed independently of market trends, contributed to how the market decides, they would introduce appropriate noise, each bit of noise representing a nugget of information, both right and wrong. But if the market is full of people recycling the “right” information, the information of the market itself, then we have an echo chamber that, if the information is wrong, collapses all at once. Shunning complex and unrewarding information in favor of easy and obvious information is slightly different, but conceptually analogous: commodified information, built on the principle that everyone knows everything’s value, easily, runs the risk of collapse when its underlying assumptions no longer apply.
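The contrast between independent private signals and everyone recycling the market’s signal can be sketched in a few lines of simulation. Everything below is an illustrative assumption of mine, not anything from the essay: 500 “traders,” a true value of 100, and signal noise with a standard deviation of 5. When signals are independent, their errors cancel as they aggregate; when everyone repeats one shared signal, the crowd is exactly as wrong as that single signal, however large the crowd.

```python
import random
import statistics

random.seed(0)

TRUE_VALUE = 100.0   # the underlying value the crowd is trying to estimate
N_TRADERS = 500
N_TRIALS = 2000
NOISE = 5.0          # std dev of each signal's error (an illustrative choice)

def market_estimate(independent: bool) -> float:
    """Average the traders' estimates of TRUE_VALUE."""
    if independent:
        # Each trader does their own homework: private, independent noise.
        signals = [random.gauss(TRUE_VALUE, NOISE) for _ in range(N_TRADERS)]
    else:
        # Everyone recycles the same "market" signal: one shared draw of noise.
        shared = random.gauss(TRUE_VALUE, NOISE)
        signals = [shared] * N_TRADERS
    return sum(signals) / len(signals)

for independent in (True, False):
    errors = [market_estimate(independent) - TRUE_VALUE for _ in range(N_TRIALS)]
    rmse = statistics.mean(e * e for e in errors) ** 0.5
    label = "independent signals" if independent else "recycled market signal"
    print(f"{label}: RMSE of crowd estimate = {rmse:.2f}")
```

With independent signals, the crowd’s typical error shrinks roughly as the noise divided by the square root of the crowd size; with a recycled signal, it never shrinks at all, and when the shared signal happens to be badly wrong, everyone is wrong at once.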
This is, in a sense, an opportunity for honest dealers and hucksters alike, but with attendant dangers. Overvaluation of risk in financial markets opened up the market for junk bonds, à la Milken. Then risk got undervalued, and bad things followed. Overvaluation of certain fancy baseball stats led to a premium on low-contact, high-power hitters, until it wasn’t the case any more (Mo Vaughn vs. Chris Carter?). The problem is that a demand for commodification, i.e. for low-noise, “predictable” information, exists; data people are too quick to meet this demand, too quick for the underlying reality; and this creates bubbles, in political, economic, and other realms. Recognizing the limits, that noise is an inherent part of the data and not simply a nuisance to be wiped out, should help counter this. But a serious statistics (or even epistemology) of noise is lacking; what we have instead is all manner of techniques for dealing cleverly with noise on the way to finding something predictable, which is what most statistics that purport to deal with noise really turn out to be. This is something that deserves deeper thinking.
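The selection-for-predictability bias at work here can also be made concrete with a toy simulation (again, all the numbers are my own illustrative assumptions). Suppose an analyst screens many noisy series on a short history and keeps only the ones that look most predictable, i.e. whose estimated volatility is lowest. Because the estimates are themselves noisy, the screen preferentially keeps lucky-looking draws, so the kept sample appears calmer than it actually is:

```python
import random
import statistics

random.seed(1)

N_ASSETS = 1000
SAMPLE = 20  # the short history the analyst sees (illustrative)

# Each "asset" has a true volatility; the analyst never observes it directly.
true_vols = [random.uniform(1.0, 3.0) for _ in range(N_ASSETS)]

def sample_vol(true_vol: float) -> float:
    """Volatility estimated from a short window of observations."""
    draws = [random.gauss(0.0, true_vol) for _ in range(SAMPLE)]
    return statistics.stdev(draws)

estimated = [sample_vol(v) for v in true_vols]

# Keep the quartile of assets that *look* most predictable
# (lowest estimated volatility on the short sample).
ranked = sorted(range(N_ASSETS), key=lambda i: estimated[i])
kept = ranked[: N_ASSETS // 4]

est_kept = statistics.mean(estimated[i] for i in kept)
true_kept = statistics.mean(true_vols[i] for i in kept)
print(f"apparent noise of the kept assets: {est_kept:.2f}")
print(f"their actual noise:               {true_kept:.2f}")
```

The kept assets’ apparent noise comes out below their actual noise: selecting on a noisy estimate of predictability guarantees a world that looks more orderly than it is, which is exactly the mechanism behind the bubbles described above.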