This Week in Data — Google Brain, Apache Beam, Neural Nets in the browser

Kick Off

Machine learning — the training of computers to act like brains — is increasingly being used for things outside of computing. There are few people closer to the center of this shift than Jeff Dean, the head of the Google Brain research group. MIT’s Technology Review has a good interview with Dean that is worth reading.

In the News

Microsoft’s cloud business fell short of expectations in the company’s earnings report this week. It’s pretty hard to beat Amazon in the cloud right now, but Microsoft and Google are both trying. For the first time ever, I’m seeing some of my clients pick Microsoft’s cloud, so it’s a battle to watch.


There was a big conference related to Hadoop, a (now) well-established data-processing technology. This summary of it is a good write-up of the competitive landscape among big data companies, particularly those built in part on Hadoop. It’s a bit inside baseball, but if you want to come into the dugout, read here.


This week, Google announced a new open-source project called Apache Beam. Google says it is a “single programming and runtime model that not only unifies development for batch, interactive, and streaming workflows, but also provides a single model for both cloud and on-premise development,” as this Datanami article puts it. I was excited to read about this. They have a nice write-up showing how Beam has a better model for processing real-time event data than Spark, another widely used data processing technology. It is exciting stuff, and I’m looking into whether it would be a good backend for Ufora to run on.
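
To make the "one model for batch and streaming" idea a little more concrete, here is a minimal sketch in plain Python. It does not use Beam or any of its APIs. The class name, the function name, the 60-second window, and the click events are all made up for illustration. The only idea it borrows is that events are grouped by when they happened rather than when they arrived, so the same logic gives the same answer whether it sees the data all at once or one event at a time.

```python
from collections import defaultdict


class WindowedSum:
    """Sums values per fixed window, keyed by event time (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.totals = defaultdict(int)

    def add(self, event_time, value):
        # The window is chosen from the event's own timestamp,
        # not from the moment we happen to process it.
        window_start = (event_time // self.window_seconds) * self.window_seconds
        self.totals[window_start] += value

    def result(self):
        return dict(sorted(self.totals.items()))


def run_batch(events, window_seconds):
    """Process a complete list of (event_time, value) pairs in one go."""
    agg = WindowedSum(window_seconds)
    for event_time, value in events:
        agg.add(event_time, value)
    return agg.result()


# Made-up click events as (event_time_in_seconds, count), in arrival order.
# The third event (time 58) arrives after an event from a later window.
events = [(5, 1), (62, 1), (58, 1), (130, 1)]

# Batch: all events are available up front.
batch_totals = run_batch(events, 60)

# "Streaming": the same aggregator is fed one event at a time as it arrives.
stream = WindowedSum(60)
for event_time, value in events:
    stream.add(event_time, value)
streaming_totals = stream.result()

# Both give {0: 2, 60: 1, 120: 1}
print(batch_totals)
print(streaming_totals)
```

A real system also has to decide when a window is finished and what to do with very late data, which this sketch ignores entirely.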

In Industry

Technology start-ups, often ones focusing on data, have become so important in finance that these sorts of companies are known as fintech. So it’s not surprising that banking regulators are starting to set their sights on rules around tech companies in this sector. Institutional Investor has a good story on various regulators writing up rules for tech.

Quirky Corner

For all the geeks among us, this is an awesome new program that lets you play with neural networks in a browser and get a really good sense of how they work. I spent an entire afternoon tinkering with it.


I also found this new open source tool to visualize data worth knowing about.


And this week it’s worth noting that there was a good deal of debate in the data science industry about the scarcity of women in the field. This bubbled up in part because of controversial remarks by the creator of the Microsoft DirectX technology platform. Then, in a dramatic twist, his 22-year-old daughter, who works in tech, went public in Wired to refute his remarks. So much drama. In all seriousness, it’s so important to look for more ways to bring women to the table in tech.

What’s happening at Ufora

We had another great week coding for clients and working on the Ufora core. I’m almost done with a new scheduler that incorporates a predictive model for the runtime of individual calculations. This is work I’ve been dreaming about for a couple of years now, so it’s exciting that it’s close to fruition! In other news, my colleague Ronen gave our statistical QA platform TestLooper a facelift, and we put a new customer’s codebase on it.
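
For readers wondering what "a scheduler that incorporates a predictive model for runtime" might look like in miniature, here is a toy sketch in Python. To be clear, this is not Ufora's scheduler and says nothing about how it works. Every name here is hypothetical, the "model" is just the average of past runtimes for each kind of task, and the placement rule is the textbook longest-job-first heuristic.

```python
import heapq
from collections import defaultdict


class RuntimePredictor:
    """Toy model: predicts runtime as the mean of past runs of the same kind."""

    def __init__(self, default_seconds=1.0):
        self.default_seconds = default_seconds  # guess for kinds never seen
        self.history = defaultdict(list)

    def record(self, kind, seconds):
        self.history[kind].append(seconds)

    def predict(self, kind):
        past = self.history.get(kind, [])
        return sum(past) / len(past) if past else self.default_seconds


def schedule(tasks, predictor, num_workers):
    """Assign (name, kind) tasks to workers, longest predicted runtime first."""
    ordered = sorted(tasks, key=lambda t: predictor.predict(t[1]), reverse=True)
    # Min-heap of (predicted load so far, worker id): pop gives the idlest worker.
    heap = [(0.0, worker) for worker in range(num_workers)]
    heapq.heapify(heap)
    assignment = defaultdict(list)
    for name, kind in ordered:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(name)
        heapq.heappush(heap, (load + predictor.predict(kind), worker))
    return dict(assignment)


# Made-up runtime history, in seconds.
predictor = RuntimePredictor()
predictor.record("sort", 8.0)
predictor.record("sort", 12.0)
predictor.record("parse", 4.0)
predictor.record("sum", 2.0)

tasks = [("a", "sum"), ("b", "sort"), ("c", "parse"), ("d", "sum")]
plan = schedule(tasks, predictor, num_workers=2)
print(plan)  # {0: ['b'], 1: ['c', 'a', 'd']}
```

The point of the toy is only that a runtime estimate lets the scheduler keep the one slow job from piling up behind the quick ones.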

Thanks for reading!

If you’d like to receive this in your inbox every week, subscribe here.