Data Science newsletter 2017–10

Magnus Stuhr
Published in Compendium
Jun 11, 2018

The Data Science community group in Computas works on monthly newsletters that we will publish continuously on this blog. This post presents the newsletter for October, where we have divided the content into three sections: «Getting started», «Beginner Tutorials», and «Advanced». We hope you enjoy it!

Getting started

This section includes links to articles that give an overview of machine learning. No code, no math, just plain English.

Who Will Command The Robot Armies?

A good blog post about accountability when making intelligent systems.

So who will command the robot armies?

Is it the army? The police?

Nefarious hackers? Google, or Amazon?

Some tired coder who just can’t be bothered?

Facebook, or Twitter?

Brands?

New Theory Cracks Open the Black Box of Deep Learning

A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

A must-read article, shared widely among artificial-intelligence researchers, which possibly offers an explanation of why neural networks generalize so well.

Beginner Tutorials

This section includes links to tutorials you can work through. Some include code you can follow along with.

TensorFlow Playground

An introduction to neural networks through an interactive visual playground that helps you develop an intuitive understanding of what neural networks are all about.

  • neural network
  • visualizations available
  • interactive
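To get a feel for what the playground builds, here is a minimal sketch of a playground-style experiment in code: two input features, one small hidden layer, and a binary output. This is our own illustration, not code from the playground, and it assumes TensorFlow 2.x and scikit-learn are installed.

# Minimal sketch of a playground-style experiment (not the playground's code).
# Assumes TensorFlow 2.x and scikit-learn; the dataset is a synthetic stand-in.
import tensorflow as tf
from sklearn.datasets import make_moons

# Two input features (x1, x2), much like the playground's axes.
X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="tanh", input_shape=(2,)),  # one hidden layer of 8 neurons
    tf.keras.layers.Dense(1, activation="sigmoid"),                 # binary classification output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=50, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [loss, accuracy] on the training data

Playing with the number of neurons, the activation function, and the noise level mirrors the knobs the playground itself exposes.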

8 machine learning algorithms in 8 minutes

The goal of this article is not to simply reflect on the popularity of machine learning. It is rather to explain and implement relevant machine learning algorithms in a clear and concise way. If I am successful, then you will walk away with a better understanding of the algorithms or, at the very least, some code to get you started when you try them out for yourself.

  • code available
  • videos
  • nice explanations
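As a taste of the "explain and implement" style, here is a minimal sketch of one classic algorithm such articles usually start with: simple linear regression fitted by gradient descent. The data below is synthetic and purely illustrative; this is not code from the article itself.

# Simple linear regression via gradient descent (illustrative sketch, NumPy only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 3.0 * X + 2.0 + rng.normal(0, 1, size=100)  # true slope 3, intercept 2, plus noise

w, b = 0.0, 0.0   # slope and intercept to learn
lr = 0.01         # learning rate
for _ in range(2000):
    error = (w * X + b) - y
    # Gradients of the mean squared error with respect to w and b.
    w -= lr * 2 * np.mean(error * X)
    b -= lr * 2 * np.mean(error)

print(f"learned slope={w:.2f}, intercept={b:.2f}")  # should land close to 3 and 2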

Advanced

This section includes links to resources that require a bigger effort, but it pays off.

TensorFlow

https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0

With slides: https://docs.google.com/presentation/d/1TVixw6ItiZ8igjp6U17tcgoFrLSaHWQmMOwjlgQY9co/pub?slide=id.p

In this codelab, you will learn how to build and train a neural network that recognises handwritten digits. Along the way, as you enhance your neural network to achieve 99% accuracy, you will also discover the tools of the trade that deep learning professionals use to train their models efficiently.

This codelab uses the MNIST dataset, a collection of 60,000 labeled digits that has kept generations of PhDs busy for almost two decades. You will solve the problem with less than 100 lines of Python / TensorFlow code.

What you’ll learn

  • What is a neural network and how to train it
  • How to build a basic 1-layer neural network using TensorFlow
  • How to add more layers
  • Training tips and tricks: overfitting, dropout, learning rate decay…
  • How to troubleshoot deep neural networks
  • How to build convolutional networks

What you’ll need

  • Python 2 or 3 (Python 3 recommended)
  • TensorFlow
  • Matplotlib (Python visualisation library)

Installation instructions are given in the next step of the lab.
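To give a feel for where the codelab ends up, here is a minimal sketch of an MNIST classifier written with the higher-level Keras API of TensorFlow 2.x. It is not the codelab's own code, but it touches the same ingredients: a hidden layer, dropout, and a softmax output over the ten digits.

# Minimal MNIST sketch with TensorFlow 2.x / Keras (not the codelab's code).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-vector
    tf.keras.layers.Dense(200, activation="relu"),    # hidden layer
    tf.keras.layers.Dropout(0.25),                    # one of the "tricks" the codelab covers
    tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
print(model.evaluate(x_test, y_test))  # [loss, accuracy] on the test set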

The Unreasonable Effectiveness of Recurrent Neural Networks

There’s something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience I’ve in fact reached the opposite conclusion). Fast forward about a year: I’m training RNNs all the time and I’ve witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me. This post is about sharing some of that magic with you. We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?”

By the way, together with this post I am also releasing code on Github that allows you to train character-level language models based on multi-layer LSTMs. You give it a large chunk of text and it will learn to generate text like it one character at a time. You can also use it to reproduce my experiments below.

  • Great blog post
  • Nice explanations
  • Code available to play with
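The code Karpathy released is a Lua/Torch implementation (char-rnn). As a rough, heavily simplified illustration of the same character-level idea, here is a sketch in Keras/TensorFlow; the text variable is a toy stand-in for whatever large corpus you would actually train on, and this is not his code.

# Character-level language model sketch (not Karpathy's char-rnn code).
import numpy as np
import tensorflow as tf

text = "the quick brown fox jumps over the lazy dog. " * 200  # toy corpus stand-in
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_len = 40
encoded = np.array([char_to_idx[c] for c in text])
# Training pairs: a sequence of characters and the character that follows it.
X = np.array([encoded[i:i + seq_len] for i in range(len(encoded) - seq_len)])
y = encoded[seq_len:]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(chars), 32),                 # one vector per character
    tf.keras.layers.LSTM(128),                                 # single recurrent layer
    tf.keras.layers.Dense(len(chars), activation="softmax"),   # next-character distribution
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, batch_size=128, epochs=3, verbose=0)

# Generate text: feed a seed sequence, sample the next character, repeat.
seed = encoded[:seq_len].tolist()
generated = []
for _ in range(100):
    probs = model.predict(np.array([seed]), verbose=0)[0].astype("float64")
    probs /= probs.sum()                                       # guard against rounding drift
    nxt = np.random.choice(len(chars), p=probs)
    generated.append(chars[nxt])
    seed = seed[1:] + [int(nxt)]
print("".join(generated))

With a real corpus, more layers, and longer training, this is the setup that produces the "magical" samples the post shows.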
