Originally posted at https://blog.cloudera.com/blog/2017/07/prophecy-fulfilled-keras-and-cloudera-data-science-workbench/, since unpublished.

Alas! what a great loss there will be to learning
before the cycle of the Moon is completed.

“The Prophecies of Nostradamus”, Century I, 62

In 1555, did Nostradamus predict deep learning built on stochastic optimization using a loss function’s gradient? Almost certainly not. But what can deep learning predict about Nostradamus in 2017? Let us begin by assuming that’s an interesting question.

The Rise of Deep Learning

Deep learning has evolved so rapidly in the last five years that it’s hard to keep up. It’s being applied to predict fraud or illness, to play games, and even…


Can using simple statistical techniques in combination with big data help solve the Tamam Shud mystery?

Reprinted from https://blog.cloudera.com/blog/2016/09/solving-real-life-mysteries-with-big-data-and-apache-spark/, since unpublished.

Everyone loves a good real-life mystery. That’s why the three most popular TV shows of the 80s and 90s were Jack Palance’s reboot of Ripley’s Believe It or Not!, Unsolved Mysteries with Robert Stack, and Beyond Belief: Fact or Fiction hosted by Commander Riker. (Well…they were in my house, anyway.) At Cloudera, the highly-skilled support team has gotten good at cracking actual stranger-than-fiction cases like, “Why doesn’t this Kerberos ticket renew?” or, “Who deleted that table?”

In this…


Reprinted from https://blog.cloudera.com/blog/2015/12/common-probability-distributions-the-data-scientists-crib-sheet/

Data scientists have hundreds of probability distributions from which to choose. Where to start?

Data science, whatever it may be, remains a big deal. “A data scientist is better at statistics than any software engineer,” you may overhear a pundit say, at your local tech get-togethers and hackathons. The applied mathematicians have their revenge, because statistics hasn’t been this talked-about since the roaring 20s. They have their own legitimizing Venn diagram of which people don’t make fun. Suddenly it’s you, the engineer, left out of the chat about confidence intervals instead of tutting at the analysts who…


The “10-page anti-diversity screed from some guy at Google” practically begs everyone to dash off a hasty, long-winded rebuttal. I worked at Google over a decade ago, and it triggered some long-standing angst I had about the company’s diversity, but not quite the kind I’m reading about.

This Has “SWE 3” Written All Over It

The document does deserve to be criticized. It wasn’t an off-hand comment; this person believes this is a polished exposition. Yet for something intended to be didactic, it’s repetitive and equivocating. It advances points, then hedges about whether they’re even true. For example, the digression about biological differences seems flawed. Can our…


As a B-list celebrity data scientist, and skeptic of the underspecified, overhyped “Data Science” movement, I was so glad to find David Donoho’s critical take in 50 Years of Data Science, which has made its way around the Internet. Read it now. I suppose it should really be called 53 years of Data Science, but 50 is a popular number of things to have something of.


This paper narrates a strong Statistics-based take on Data Science, one which rightly punctures much of the puffery around this term and “Big Data.” Ultimately, it proposes its own better Statistics-based take. The smack-down…


“Hey, do you know any great tech people looking for an opportunity? I’m hiring.”

Like junkies shaking down their Rolodexes for leads on more skag of a Friday night, anyone running a team or company in tech seems to be endlessly asking each other this.

“All the good folks I know are happily employed, sorry!”

… we all tell each other and then promise to share any leads and shuffle on. Always Be Hiring! Why does it seem so desperately difficult?

My kingdom — or at least, a small percentage of equity — for a unicorn!

I sat down and collected some observations, mantras and received wisdom I keep hearing about working in tech, and…

Sean Owen

Big-data data science personality @ Databricks. Prev: Director Data Science @ Cloudera
