Things to learn for developing and researching ML

David Mack
Jul 10, 2018

Ashwath Salimath, our summer research fellow at Octavian, asked today what he should focus on to consolidate his machine learning skills:

I want to develop mastery of core ML algorithms in TensorFlow, and to be able to quickly convert research papers into well-written code. What should I do over the next 3–6 months?

This prompted an interesting discussion, the summary of which I hope is useful to others here.

To become solid at developing (and potentially then expanding to researching) machine learning algorithms, I’d recommend spending time doing the following:

Write a diverse range of models from scratch

Writing and then debugging a model from scratch forces you to understand what it is (and is not) doing, why it's doing that, what its limitations are, and how to resolve common problems.

I'd recommend picking a non-trivial data problem (e.g. not MNIST or Iris) so that you run into more realistic challenges (for example, class imbalance, noise, intractability, different accuracy metrics, and data clean-up and pre-processing).
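To make the class-imbalance point concrete, a common first step is weighting the loss by inverse class frequency. Here is a minimal sketch in plain Python (the `labels` list is a hypothetical toy dataset); most libraries accept weights like these as a per-class loss weighting:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: rarer classes get larger weights."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    # Weight each class so that, summed over the dataset,
    # every class contributes equally to the loss.
    return {cls: total / (n_classes * n) for cls, n in counts.items()}

# A 90/10 imbalanced toy dataset:
labels = ["cat"] * 90 + ["dog"] * 10
weights = class_weights(labels)
# "dog" examples end up weighted 9x more heavily than "cat" examples.
```

Without a correction like this, a model that always predicts "cat" already scores 90% accuracy, which is exactly the kind of trap simple tutorial datasets never teach you about.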

If you really want to learn (and experience some pain!) pick problems that lack a tutorial/public solution. These have no easy short-cuts and will push your abilities.

So that you’re flexible and can employ a wide range of tactics, I’d suggest getting familiar with all the major ML architectures:

  • Dense layers / Regression
  • Convolutional neural networks
  • Recurrent neural networks
  • Reinforcement learning
  • Embeddings (e.g. collaborative filtering, search)
  • Bonus: Neural Turing machines

Build modular, testable, assertive/typed code

By writing more bullet-proof code you will be able to write working models faster.

You want code that is friendly to others, likely to work, and that gives an easy-to-understand error instead of a pesky zero-percent accuracy.

  1. Build your models out of many small functions that tell a story
  2. Include as much static and runtime checking as possible (e.g. include assertions that tensors are the shape you think they will be, assert masks are indeed the right format)
  3. Include unit tests of sub-modules (e.g. does your memory read module correctly retrieve values? does the language tokenization reliably encode-decode values?)
  4. Use well-tested library functions instead of rolling your own when possible (e.g. explore your library’s utilities!)
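To illustrate points 1–3, here is a toy sketch in plain Python (a hypothetical word-level tokenizer, not from the original article) showing small story-telling functions, a runtime assertion, and a round-trip unit test:

```python
def build_vocab(texts):
    """One small function, one job: map each word to an integer id."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}

def encode(text, vocab):
    ids = [vocab[w] for w in text.split()]
    # Assert the invariant you believe holds, so failures are loud and early.
    assert all(0 <= i < len(vocab) for i in ids), "id out of vocab range"
    return ids

def decode(ids, vocab):
    inverse = {i: w for w, i in vocab.items()}
    return " ".join(inverse[i] for i in ids)

# Unit test: does tokenization reliably encode-decode?
texts = ["the cat sat", "the dog ran"]
vocab = build_vocab(texts)
for t in texts:
    assert decode(encode(t, vocab), vocab) == t
```

The same pattern carries over to tensors: in TensorFlow, for instance, shape assertions can catch a silently-broadcast bug long before it shows up as a mysteriously flat loss curve.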

Try multiple tools and platforms

  • Try out all the different tabs of TensorBoard (the projector tab is really handy when visualizing embeddings; try generating your own label dictionary for it)
  • Try training in the cloud (e.g. with FloydHub, Cloud ML, SageMaker)
  • Try a different machine learning library (e.g. PyTorch vs TensorFlow, check out Keras’s Model class)
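As a small, concrete version of the projector suggestion above: TensorBoard's embedding projector reads point labels from a plain TSV metadata file, one row per embedding vector, in the same order as the embedding matrix. A sketch of generating one (the `labels` list here is a made-up example):

```python
def write_projector_metadata(labels, path="metadata.tsv"):
    """Write one label per embedding row so the projector can
    display, colour and search points by name."""
    with open(path, "w") as f:
        for label in labels:
            f.write(f"{label}\n")

write_projector_metadata(["king", "queen", "apple"], "metadata.tsv")
```

Pointing the projector at a file like this turns an anonymous cloud of points into something you can actually interrogate.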

Read/implement ideas from research papers

Reading papers regularly will help you:

  • Get good at reading maths and cutting-edge computer science
  • Hear about new ideas
  • Find inspiration for projects

Twitter is currently a good place to discover new papers; many researchers announce and discuss their work there.

Next, try implementing things from papers. Even if implementing a whole research system is daunting, there are smaller ideas you can employ in your own work. For example, I was struggling to find a learning rate that would successfully train an embedding model, and a five-line implementation of PercentDelta solved my issue.
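The core idea behind PercentDelta, as I read it, is to rescale each update so parameters move by a target percentage of their own magnitude per step, rather than trusting one fixed learning rate. Here is a rough sketch of that idea in plain Python; it is a simplification for illustration, not the paper's exact algorithm:

```python
def percent_delta_update(params, grads, target_pct=0.001, eps=1e-12):
    """Scale the gradient step so the mean relative parameter change
    |delta / param| equals target_pct (e.g. 0.1% per step)."""
    rel = [abs(g) / (abs(p) + eps) for p, g in zip(params, grads)]
    mean_rel = sum(rel) / len(rel)
    lr = target_pct / (mean_rel + eps)  # Adaptive learning rate.
    return [p - lr * g for p, g in zip(params, grads)]

params = [1.0, -2.0, 0.5]
grads = [0.1, 0.4, -0.05]
new_params = percent_delta_update(params, grads)
```

After one step the average relative change across the parameters is exactly the 0.1% target, regardless of the raw gradient scale, which is why this kind of trick can rescue a model whose fixed learning rate was hopeless.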

Build a distributed training system

Whilst building an enterprise-scale distributed training system is a large endeavor, friendlier-sized projects are quite achievable:

  • Try using a distributed queue for training/prediction data (e.g. Kafka, RabbitMQ, Firebase)
  • Try using multiple computers/instances for training (e.g. a Kubernetes cluster, AWS instances, your friends’ laptops)
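You can prototype the queue idea with no infrastructure at all before swapping in Kafka or RabbitMQ. The sketch below uses Python's standard library as an in-process stand-in for a real broker (the batch contents are made up):

```python
import queue
import threading

work = queue.Queue(maxsize=100)
results = []

def producer():
    """Stand-in for a process publishing training batches to a broker."""
    for batch_id in range(5):
        work.put({"batch": batch_id, "data": [batch_id] * 3})
    work.put(None)  # Sentinel: no more batches.

def consumer():
    """Stand-in for a worker pulling batches and running a training step."""
    while True:
        item = work.get()
        if item is None:
            break
        # A real worker would run a training step here; we just sum.
        results.append(sum(item["data"]))

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
# results == [0, 3, 6, 9, 12]
```

Once the producer/consumer split works locally, moving the queue to a real broker and the consumers to separate machines is mostly a plumbing change.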

Personally, I got into building distributed training to help with a genetic algorithms / Neural Turing machine experiment.

Run a model on a limited device

Try one of the following:

  • Get a model you trained predicting on a phone
  • Get a model you trained predicting in a browser
  • Make it possible to train a model on a phone/old computer
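A key ingredient of running on limited devices is post-training quantization: storing weights as 8-bit integers plus a scale factor instead of 32-bit floats. Here is a toy sketch of the idea in plain Python; real toolchains such as TensorFlow Lite handle this (and much more) for you:

```python
def quantize(weights):
    """Map floats to int8 range [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Values come back close to the originals, at a quarter of the storage.
```

Seeing how little accuracy a model loses under this kind of compression is a good way to build intuition for what phones and browsers can realistically run.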

Write about the things you learn

Here are a few common formats you can try:

  • A lab report of what you did and what the results were
  • A short technical Q&A (e.g. how to solve a common pitfall or get around a bug)
  • A tutorial
  • An explanation of a complex concept
  • A presentation of a new finding

Practice finding interesting problems

A good problem to work on is:

  • Novel (has not already been solved/done)
  • Interesting to others
  • Possible with the time, skills and resources you have

It can be hard to find problems that satisfy all of these criteria (growing your skills and resources helps a lot here). But with each project you can reflect on how it went and hone this skill.
