Practical Methodology for Deep (and Machine) Learning Practitioners

Nicolás Ugrinovic
2 min read · Sep 7, 2019

--

So I guess you are working on, or want to start doing, some deep learning. This post is a summary of chapter 11 of “Deep Learning” by Ian Goodfellow et al., a book that lays down most of the foundations of deep learning. Writing it is useful to me as a form of note taking, but I hope it is useful to you too. Let’s start.

According to this chapter, a good machine learning (or deep learning; I use the terms interchangeably here) practitioner, aside from knowing the existing algorithms and understanding machine learning (ML) principles, needs two things:

  1. know how to choose an algorithm appropriate to the task at hand
  2. know how to monitor and respond to feedback obtained from experiments in order to improve the system.

With this in mind, the important decisions one has to make when developing ML systems revolve around whether to:

  • gather more data or not
  • increase or decrease model capacity
  • add or remove regularization features
  • improve the optimization of the model
  • improve approximate inference in a model
  • debug the software implementation of the model

I find that in practice these considerations are truly the ones that matter. If you can determine the right course of action by monitoring the system correctly, rather than blindly guessing what to do, you will save a lot of time while improving the system.
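To make the idea of responding to monitoring a bit more concrete, here is a minimal sketch of how training and validation errors could be mapped to some of the decisions listed above. The function name `suggest_next_step`, the `target_error` parameter and the returned messages are my own assumptions for illustration; the underlying rule of thumb (fix poor training error first, then address the gap to validation error) follows the chapter, but this is a rough heuristic and not a definitive procedure.

```python
def suggest_next_step(train_error, val_error, target_error):
    """Suggest a course of action from monitored errors (illustrative heuristic).

    All three arguments are error rates in [0, 1]. `target_error` is the
    error level you want to reach; it is an assumption of this sketch.
    """
    if train_error > target_error:
        # The model does not even fit the training data well enough,
        # so more data is unlikely to help yet.
        return "increase model capacity or improve optimization"
    if val_error > target_error:
        # Training error is fine but validation error is not:
        # the model does not generalize well enough.
        return "gather more data or add regularization"
    return "target met on validation data"


print(suggest_next_step(0.30, 0.32, 0.10))
print(suggest_next_step(0.05, 0.25, 0.10))
print(suggest_next_step(0.05, 0.08, 0.10))
```

If the errors look implausible in the first place (for example, a training error that never decreases), debugging the implementation should probably come before any of these suggestions.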

Of the six considerations above, the first to address, right from the start so that you can go from idea to training, is a correct software implementation of the model and of the system as a whole. This means not only writing code that compiles and runs, but also avoiding conceptual mistakes. For example, you should normalize the input to the system correctly, which means knowing which dimensions should be normalized and what type of normalization to use.
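As an example of the normalization point, here is a minimal NumPy sketch. It assumes image batches laid out as `(N, H, W, C)` and standardizes each channel to zero mean and unit variance; both the layout and the choice of standardization are assumptions for illustration, and the helper name `normalize_per_channel` is hypothetical. The statistics are computed on the training set only and then reused for the test set, which is easy to get wrong.

```python
import numpy as np


def normalize_per_channel(images, mean=None, std=None, eps=1e-8):
    """Standardize images of shape (N, H, W, C) per channel.

    If `mean` and `std` are not given, they are computed over the batch,
    height and width axes, leaving one statistic per channel.
    """
    if mean is None:
        mean = images.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, C)
    if std is None:
        std = images.std(axis=(0, 1, 2), keepdims=True)    # shape (1, 1, 1, C)
    # eps avoids division by zero for constant channels.
    return (images - mean) / (std + eps), mean, std


# Synthetic data standing in for real images (assumption for the example).
rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(16, 8, 8, 3))
test = rng.uniform(0, 255, size=(4, 8, 8, 3))

# Compute statistics on the training set, then reuse them on the test set.
train_norm, mean, std = normalize_per_channel(train)
test_norm, _, _ = normalize_per_channel(test, mean, std)
```

Normalizing over the wrong axes (for example, per image instead of per channel) still produces code that runs, which is exactly the kind of conceptual mistake that is hard to spot without checking shapes and statistics explicitly.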

This is a very short post, but I plan to continue with a small series of these posts so that I later have a more comprehensive set of notes on the subject. If you agree or disagree with these points, feel free to comment. I would love to discuss these issues and hear your point of view!
