Only one in four Americans knows that, thanks to technology, we’re producing far more with less as a country. Most people either don’t know that or blame job losses on things like immigrants or offshoring, even though offshoring is itself only possible because of technology improvements and accounts for only 13% of manufacturing job loss. That’s a problem. We can’t make the changes we need to make if people aren’t aware the problem exists, or think its existence is something to be debated. We can’t agree on solutions like unconditionally guaranteeing everyone a basic income as a rightful productivity dividend if people are actively being unemployed by growing productivity while the discussion is framed as a future danger to our social fabric instead of a clear and present one.
Data cleaning and data wrangling, as the first step in doing any of this stuff, is a giant part of this field. There are almost always errors in your data. It would be really hard not to have anything misleading in there. Sometimes there are systematic errors or systematic biases; [for example] all your users were from one demographic, and now you’ve learned something, but it’s not applicable to other demographics. Just knowing what you’re going for, looking at your data in a principled way, and seeing whether it is going to be able to predict that without flaws and biases is obviously a big problem.
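To make the single-demographic example concrete, here is a minimal sketch of the kind of check you might run before trusting a dataset. Everything in it is an assumption for illustration: the `audit_records` helper, the `age_group` and `clicks` field names, the sample records, and the 0.75 threshold are all hypothetical, and a real audit would look at far more than this.

```python
from collections import Counter


def audit_records(records, group_key, required_keys, max_share=0.8):
    """Return basic data-quality findings for a list of dict records.

    Hypothetical helper: max_share is an arbitrary threshold chosen
    for illustration, not an established standard.
    """
    # Count records where any required field is missing or None.
    incomplete = sum(
        1 for r in records
        if any(r.get(k) is None for k in required_keys)
    )
    # Measure how the records are distributed across groups.
    groups = Counter(r.get(group_key) for r in records)
    total = len(records)
    shares = {g: n / total for g, n in groups.items()} if total else {}
    # Flag any group that makes up more than max_share of the data.
    dominant = [g for g, s in shares.items() if s > max_share]
    return {"incomplete": incomplete, "shares": shares, "dominant": dominant}


# Made-up sample data: four of five users come from one age group.
records = [
    {"age_group": "18-24", "clicks": 12},
    {"age_group": "18-24", "clicks": None},
    {"age_group": "18-24", "clicks": 7},
    {"age_group": "18-24", "clicks": 3},
    {"age_group": "45-54", "clicks": 5},
]

report = audit_records(
    records, "age_group", ["age_group", "clicks"], max_share=0.75
)
print(report)
```

A flagged group doesn’t prove the data is unusable; it is just a prompt to ask whether what you learn from it would carry over to the groups that are barely represented.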
They definitely have large advantages. It may also depend on the problem you’re tackling. I can imagine collecting a bunch of data in a specific domain that people aren’t looking at right now and having an advantage in that domain. If you’re trying to do a lot of the standard, traditional things that people are interested in doing right now with internet user data, the guys with all the internet users are gonna win that game, at least for the time being. But there will always be a new company that will be a threat to these guys. It’s the cycle of companies becoming dominant in some field and then getting challenged by newer companies.