Apache MXNet 1.6.0 release is now available
Author: Omar Orqueda
Editors: Lin Yuan, Przemek Tredak, Vishaal Kapoor, Nathalie Rauschmayr
Today the Apache MXNet community is pleased to announce the 1.6.0 release of the Apache MXNet Deep Learning framework. We would like to thank the Apache MXNet community for all their invaluable contributions towards the MXNet 1.6.0 release.
This release brings our users many new features, highlighted below. For a comprehensive list of major features and bug fixes, please check out the full MXNet 1.6.0 release notes.
Better NumPy-compatible Interface
NumPy has been widely adopted by data scientists, machine learning practitioners, and students for its flexibility and generality; in fact, learning NumPy is often the first step into scientific computing. We are pleased to announce that, starting with version 1.6.0, MXNet is moving toward a NumPy-compatible programming experience, offering equivalent usability and expressiveness and making the development of Deep Learning models easier for practitioners already familiar with NumPy syntax. In addition, MXNet enables the existing NumPy ecosystem to utilize GPUs and accelerators to speed up large-scale computations. You can find a good introduction in the blog post A New NumPy Interface for Apache MXNet.
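To give a flavor of the shared syntax, here is a small sketch. With MXNet 1.6 installed you would write `from mxnet import np, npx` and call `npx.set_np()` to activate the NumPy-compatible mode (and gain GPU execution); in this sketch plain NumPy stands in, since `mxnet.numpy` mirrors its API.

```python
# Sketch: NumPy-style code that runs unchanged under MXNet's new
# interface. With MXNet 1.6 you would replace the import below with
#   from mxnet import np, npx
#   npx.set_np()
# and the same lines would execute on CPU or GPU. Here plain NumPy
# illustrates the shared syntax.
import numpy as np

a = np.ones((2, 3))
b = np.arange(3)            # broadcast against each row of a
c = (a + b).sum(axis=1)     # familiar NumPy broadcasting and reductions
print(c)                    # [6. 6.]
```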
Integration with Apache TVM for Operator Implementation
In MXNet 1.6.0, users can leverage TVM to implement high-performance operator kernels in Python, which simplifies the previous C++-based development process. For instance, a broadcast add kernel looks like:
In addition, this feature enables sharing the same implementations across multiple back-ends, such as CPU, GPU, etc.
Additional Feature Improvements and Bug Fixes
In addition to the features already listed, we have the following improvements on existing features, performance, documentation, and bug fixes:
Graph optimizations: MXNet 1.6.0 reduces computation by fusing element-wise operations and eliminating redundant expressions:
- Element-wise operation fusion for GPU: The performance of element-wise (point-wise) operations is memory-bandwidth bound, so chaining multiple such operations can reduce overall performance because of unnecessary stores and loads of intermediate results. Element-wise operation fusion generates just-in-time fused operators whenever possible, reducing storage needs and improving overall performance and memory usage. For example, we have observed a 13% inference speed-up with ResNet-50 using the NHWC data layout for convolutions and a batch size of 128 (930 img/sec without fusion vs. 1050 img/sec with fusion).
- Elimination of redundant expressions: This enhancement removes redundant computations, improving memory usage and total execution time. For instance, the execution graph generated by the Python code out = (a + 5) * (a + 5) would compute (a + 5) twice. Elimination of redundant expressions produces an execution graph equivalent to b = a + 5; out = b * b, computing (a + 5) just once.
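The rewrite described above can be mimicked by hand in plain Python/NumPy; this sketch shows the transformation that MXNet now applies automatically to its execution graphs.

```python
import numpy as np

a = np.arange(4.0)

# Naive form: the subexpression (a + 5) appears twice and would be
# evaluated twice in an unoptimized execution graph.
out_naive = (a + 5) * (a + 5)

# After elimination of the redundant expression: (a + 5) is computed
# once and the result is reused.
b = a + 5
out_opt = b * b

# Both forms produce identical results; the optimized one does less work.
assert np.array_equal(out_naive, out_opt)
```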
Optimizations and improvements: MXNet 1.6.0 provides several optimizations and improvements to existing features and operators, including:
- Automatic Mixed Precision
- Gluon Fit API
- MKL-DNN
- Large tensor support
- TensorRT integration
- Higher-order gradient support
- Operators
- Operator performance profiler
- ONNX import/export
- Improvements to Gluon and Symbol APIs
- Over 100 bug fixes
Support for new operators: MXNet 1.6.0 adds several new operators, such as RROIAlign, group normalization, and allclose.
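For readers unfamiliar with allclose: it tests whether two tensors agree element-wise within relative and absolute tolerances, following the semantics of NumPy's numpy.allclose. As a reference for those semantics (plain NumPy here; the MXNet operator name and exact signature may differ):

```python
# Reference semantics for an allclose-style check, shown with NumPy.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0 + 1e-9])   # tiny perturbation

print(np.allclose(x, y))               # True: within default tolerances
print(np.allclose(x, x + 1.0))         # False: differs by 1.0 everywhere
```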
Language bindings: MXNet 1.6.0 enhances the language bindings for C/C++, Clojure, Julia, Perl, and Scala.
MXNet 1.6.0 adds 15 examples/tutorials and over 25 documentation updates.
Please note that the MXNet 1.6.0 release will be the last MXNet release to support Python 2.
How to Build MXNet
Please follow the instructions at
https://mxnet.incubator.apache.org/get_started/build_from_source
Users who build MXNet from source should build release 1.6.0 without jemalloc to avoid incompatibilities with LLVM’s OpenMP library; see issue #17043 and PR #17324 for details. Set USE_JEMALLOC to “OFF” in ./CMakeLists.txt for cmake builds (the default), or set “USE_JEMALLOC = 0” in make/config.mk for make builds.
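For cmake builds, the flag can also be passed on the command line instead of editing CMakeLists.txt. A sketch of such a configure-and-build run (adjust the source path, generator, and other options for your environment):

```shell
# Configure an out-of-source cmake build of MXNet 1.6.0 with jemalloc
# disabled, then compile. Other build options are left at their defaults.
cmake -DUSE_JEMALLOC=OFF ..
make -j"$(nproc)"
```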
Getting Started with MXNet
To get started with MXNet, visit the installation page. To learn more about the MXNet Gluon interface and Deep Learning, you can follow our comprehensive book Dive into Deep Learning, which covers everything from an introduction to Deep Learning to how to implement cutting-edge neural network models. You can also check out other MXNet tutorials, MXNet blog posts, MXNet on Twitter, Getting Started with GluonCV, and the MXNet YouTube channel.
Have fun with MXNet 1.6.0!
Acknowledgements:
We would like to thank everyone from the Apache MXNet community for their contributions to the 1.6.0 release.