I think the challenge here is the distinction between important and valuable. I spend a huge amount of my time building complex predictive systems to help the largest marketers drive increased sales through real-time advertising. Part of the problem is that, to get statistical significance, we tend to dumb down the way we look at a particular site. It’s not “the thoughtful investigative article on the NY Times”; it’s “nytimes.com/business” (though of course we can layer many other signals on top of that). It’s not realistic to train a predictive model on each story, partly because traffic to a story arrives in real time while sales tend to lag by hours or days, so we need to find a new currency. I suspect, though I have not tested it, that the type of article would drive some performance lift… but I’m willing to bet that word count is equally, if not more, predictive.
In other words, my fear is that if a well-researched investigative piece is important (and thus valuable to the world) but not highly correlated with marketer outcomes, it won’t “work” for marketers. I hope I am wrong; if not, there is always branding money out there. If you’re interested in using a large-scale machine learning ad platform to play with various quality metrics, it would be very fun to see what correlates with outcomes!