Why choose MXNet for Deep Learning?
This article is not a full comparison of Deep Learning frameworks; it is a list of the reasons that made us choose MXNet over the alternatives. A full comparison would not be accurate, for two reasons.
First, we haven’t tested all existing frameworks. Second, frameworks evolve really fast, so the reasons we chose MXNet one year ago may not be decisive now.
Frameworks compete with each other, and a strong feature in one framework is usually copied or adapted by the others. If you want to add information or correct something, feel free to leave a comment.
What exactly is MXNet?
On the official website, we can find a very short description of MXNet’s features.
This overview lists some of MXNet’s main technical properties, but it is not comprehensive and may not be enough for you. The properties cover aspects ranging from beginner to advanced usage. If you already have experience in Deep Learning, some of them may matter less to you.
How to install your Deep Learning framework?
One of the first obstacles to overcome before using a Deep Learning framework is the installation. Some frameworks are easier to install than others, and some have sparse or missing documentation. A year ago, the installation process was painful, but since then most frameworks have simplified it.
For example, frameworks with a Python API now support installation via pip, the Python package manager. Here is a short list of frameworks that have installation documentation.
- MXNet: pip package or compilation from source with make.
- TensorFlow: pip package or compilation from source with Bazel. Bazel works fine, but it is not user-friendly and not very well documented. In our experiments, we tried to write C++ code that uses both TensorFlow and OpenCV. Importing OpenCV into a Bazel build, or importing TensorFlow into a CMake build, was painful and a waste of time. There is now CMake support, although it is not the recommended route; it may make integrating TensorFlow with other libraries easier.
- CNTK: pip package or compilation from source (we did not test the source build).
- Caffe: compilation from source. The instructions were unclear about the steps and the dependencies, and we hit a lot of errors before getting a working installation. The installation page seems to have been rewritten since then and looks clearer.
- Chainer: pip package or compilation from source with the Python setup script.
- Keras: pip package. As Keras is only a front end, we also have to install a backend: TensorFlow, CNTK, Theano or MXNet.
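Once a pip installation has finished, a quick way to see which of the frameworks listed above are actually available in the current Python environment is to look them up without importing them. The snippet below is a minimal sketch using only the standard library. The import names in `FRAMEWORK_MODULES` are assumptions (an import name can differ from the pip package name), and `installed_frameworks` is a hypothetical helper, not part of any framework.

```python
import importlib.util

# Import names are assumptions: the name used in `import` can differ from
# the name of the pip package.
FRAMEWORK_MODULES = {
    "MXNet": "mxnet",
    "TensorFlow": "tensorflow",
    "CNTK": "cntk",
    "Caffe": "caffe",
    "Chainer": "chainer",
    "Keras": "keras",
}


def installed_frameworks(modules=FRAMEWORK_MODULES):
    """Return {framework name: True/False}, True if the module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, available in installed_frameworks().items():
        print(f"{name:<12} {'installed' if available else 'missing'}")
```

Because `find_spec` only locates the module, this check stays fast even when a framework takes several seconds to import.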
Tutorials are one of the starting points for testing and learning a new Deep Learning framework. Currently, the simplest tutorial, found for every deep learning framework, is handwritten digit recognition with the MNIST database.
This is the minimum required tutorial. It gives a first look at the notation and API usage. As every framework has this tutorial, it can be used to compare them.
Other frameworks have more complete tutorials. Since the release of its new imperative API, Gluon, the MXNet team has worked on tutorials that explain most deep learning concepts and go on to advanced usage. Moreover, with the Python API, it is possible to write Python notebooks, a really powerful tool for explaining and illustrating code.
Advanced users usually don’t need easy tutorials. Most of the time they look for complete, usable examples. More examples help to understand complex framework functionality in real applications.
Thanks to the community, it is possible to find implementations of new and ground-breaking architectures like ResNet, Faster RCNN, SSD and LSTM. The most used and popular frameworks have their own implementations, and others can import networks from Caffe or TensorFlow.
I work in computer vision, so having working implementations of Faster RCNN, SSD, MTCNN or even MobileNet is wonderful.
There is a downside: while it is easy to find lots of implementations, some of them are no longer maintained by their authors and can become obsolete with the latest framework release. To avoid this, MXNet and some other frameworks maintain popular implementations such as SSD or Faster RCNN in their main repository.
TensorFlow uses a separate repository (Models) that was not well referenced by Google :) It takes a lot of work to keep this code up to date and compatible with each new framework version.
What is a Zoo?
If you want to train a network on new data, it is common practice to start from a pre-trained network. This reduces training time, generally reduces overfitting and improves performance.
These networks are trained on very large databases like ImageNet, and the training time and computing power needed to obtain them are not accessible to everyone.
To give their users access to such pre-trained networks, frameworks have either trained these models themselves or developed converters from other frameworks. These repositories of pre-trained models are called Zoos.
Easy to use, Easy to understand!
This criterion is really personal. But an API that is easy to use and easy to understand flattens the learning curve and helps your team, or a new member, to work with it efficiently.
There are low-level APIs and high-level ones. TensorFlow is relatively low level, Keras is high level, and MXNet is a mix of both. MXNet currently has several APIs: the Symbol API for define-and-run networks, and the NDArray (low level) and Gluon (high level) APIs for define-by-run networks. One advantage of having this range of APIs in a single framework is that if we have questions, we don’t have to search or ask multiple times on different platforms.
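To make the difference between define-and-run and define-by-run concrete, here is a toy illustration in plain Python. It does not use MXNet: the `Var` and `Op` classes are hypothetical stand-ins written for this sketch, and the real Symbol, NDArray and Gluon APIs are much richer. The point is only that a symbolic graph is built first and executed later, while imperative code computes each line immediately.

```python
# Define-and-run: build a graph of symbolic nodes first, execute it later.
class Var:
    """A named placeholder that receives its value at execution time."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        return Op(lambda a, b: a + b, self, other)

    def __mul__(self, other):
        return Op(lambda a, b: a * b, self, other)

    def run(self, **inputs):
        return inputs[self.name]


class Op(Var):
    """A graph node that stores an operation and its two input nodes."""

    def __init__(self, fn, left, right):
        self.fn, self.left, self.right = fn, left, right

    def run(self, **inputs):
        return self.fn(self.left.run(**inputs), self.right.run(**inputs))


x, w, b = Var("x"), Var("w"), Var("b")
graph = x * w + b  # nothing is computed yet, only the graph is built
symbolic_result = graph.run(x=2.0, w=3.0, b=1.0)


# Define-by-run: every line executes immediately on concrete values.
def imperative(x, w, b):
    y = x * w  # computed right away, so it can be printed or debugged here
    return y + b


imperative_result = imperative(2.0, 3.0, 1.0)
print(symbolic_result, imperative_result)  # prints "7.0 7.0"
```

Both styles give the same result. The symbolic form can be inspected and optimized as a whole before it runs, while the imperative form is easier to step through and debug.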
The TensorFlow API is not easy to use, which is why the community has created lots of higher-level ones like TF learn, TF slim, Sonnet, Keras, … These higher-level APIs are easier to use but are not always as well documented and maintained as core TensorFlow. I hope that will change with the official integration of Keras in TensorFlow.
Keras is a really good API. It sits on top of a backend and can be used with TensorFlow, Theano, CNTK and even MXNet. Keras allows really fast prototyping, and it makes it easy to create complex neural architectures.
However, since the backend is separate, low-level operations such as multi-GPU management can be tricky. In this case, we need to use the backend directly (TensorFlow, Theano, CNTK or MXNet).
Activity, Evolution and Community reactivity
Deep Learning research is so active that new architectures are released every month, and a breakthrough network appears roughly every six months. Deep Learning frameworks need to be really active to keep up.
A recent example is the release of MobileNet, developed by Google. This network uses a non-standard layer called “depth-wise separable convolution”. So, to implement this network with full training performance, we need either to implement this layer ourselves or to wait for a maintainer to implement it.
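To see why this layer is attractive for a mobile-oriented network, it helps to count weights. A depth-wise separable convolution is commonly described as a per-channel k x k convolution followed by a 1x1 convolution that mixes channels. The sketch below compares its weight count with a standard convolution; the layer sizes are illustrative values chosen for this example, not figures from MobileNet, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumptions for this example)
k, c_in, c_out = 3, 128, 256
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print(standard, separable, round(standard / separable, 1))  # prints "294912 33920 8.7"
```

For these sizes the separable version needs roughly 8.7 times fewer weights, which is the kind of saving that makes a framework without native support for the layer a real handicap.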
Unfortunately, these state-of-the-art algorithms cause users to move from one framework to another.
About Multi GPU/Multi Computer
MXNet is known for its GPU management. Multi-GPU training is easy to set up and performs well in comparison with TensorFlow or Keras.
As Keras can be used with an MXNet backend, Keras in this particular configuration can also scale easily to multiple GPUs.
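The usual way to scale training over several GPUs is data parallelism: each device receives a slice of the batch, computes gradients on it, and the gradients are combined before the weights are updated. The following is a plain-Python sketch of that idea on a one-parameter model, with no GPU and no framework involved. All function names are hypothetical and written for this illustration; it is not how MXNet implements it internally.

```python
def split_batch(batch, num_devices):
    """Split a batch into num_devices contiguous shards of near-equal size."""
    base, extra = divmod(len(batch), num_devices)
    shards, start = [], 0
    for i in range(num_devices):
        size = base + (1 if i < extra else 0)
        shards.append(batch[start:start + size])
        start += size
    return shards


def shard_gradient(w, shard):
    """Sum of the gradients of the squared error (w*x - y)^2 over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard)


def data_parallel_step(w, batch, num_devices, lr):
    """One SGD step: each 'device' handles a shard, then gradients are combined."""
    shards = split_batch(batch, num_devices)
    # On real hardware, each shard gradient would be computed on its own GPU.
    total = sum(shard_gradient(w, shard) for shard in shards)
    return w - lr * total / len(batch)


# Toy data following y = 2x
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_one_device = data_parallel_step(0.0, batch, num_devices=1, lr=0.01)
w_two_devices = data_parallel_step(0.0, batch, num_devices=2, lr=0.01)
print(w_one_device == w_two_devices)
```

The update is the same whether the batch is processed on one device or split over two, which is what lets a framework add GPUs without changing the model code.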
Working on embedded devices
If you work on smart devices, embedding a network can be a problem. Not all frameworks can be used on mobile or low-performance platforms.
MXNet solves this with an amalgamation process: a script concatenates the MXNet functionality into a small C++ API. This interface only supports prediction but has almost no dependencies (only OpenBLAS). The API is wrapped in many languages (Java, JS, Scala, …) and can be embedded on many platforms (iOS, Android, browser).
What about Speed?
There are many benchmarks available. But unfortunately, as frameworks are really active, benchmarks are not always up to date.
Check out this active Deep Learning benchmark.
In conclusion …
MXNet is a really good framework. I think it deserves more attention.
Please take some time to use it and test it; it’s the best way to make up your own mind.
Are you curious about developing a Deep Learning project with MXNet? Check out the article below:
In this post, we’ll discuss and illustrate a fast and robust method for face detection using Python and Mxnet. At Wassa…medium.com
Do you want to know more about Wassa?
Wassa is an innovative digital agency with expertise in Indoor Location and Computer Vision. Whether you are looking to help your customers find their way in a building, enhance the user experience of your products, collect data about your customers or analyze human traffic and behavior in a location, our Innovation Lab brings scientific expertise to design the solution best suited to your goals.