Keras-MXNet Now Adds Sparse Tensor and Full RNN Support

Kalyanee Chendke
Published in Apache MXNet
Oct 30, 2018

In May 2018, we announced MXNet backend support for Keras 2. Keras-MXNet delivers improved performance for both training and inference, with more than a 2X boost for GPU training on convolutional networks and support for multi-GPU training. Since then, we have been working on broader operator support and some exciting new features.

Keras-MXNet recently shipped the v2.2.4 and v2.2.4.1 releases with significant new functionality, including full RNN support and sparse tensor support. The latest release extends multi-GPU training to sparse tensors and adds support for a sparse categorical cross entropy loss.

Sparse Tensors

One of the major features in this release is sparse tensor support on CPU, GPU, and multi-GPU with the MXNet backend. Users can now use sparse tensors in Keras-MXNet to build recommender systems and machine learning models for NLP and factorization machines.

In this release we added sparse support for the sum, mean, concat, dot, and embedding operators. We also added a sparse_weight option to the Dense layer. You can now specify both your inputs and layer weights as sparse tensors.
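To make this concrete, here is a minimal sketch of a linear regression model on synthetic sparse data, similar in spirit to the benchmark described below. The sparse=True flag on Input and the sparse_weight flag on Dense follow the release notes; the data shapes and hyperparameters are illustrative assumptions.

import numpy as np
import scipy.sparse as sp
from keras.models import Model
from keras.layers import Input, Dense

# Synthetic sparse features: 10,000 samples, 500 features, ~1% non-zero
x = sp.random(10000, 500, density=0.01, format='csr', dtype='float32')
y = np.random.rand(10000, 1).astype('float32')

# Declare the input as a sparse tensor; sparse_weight keeps the layer
# weights sparse as well (per the release notes for this feature)
inputs = Input(shape=(500,), sparse=True)
outputs = Dense(1, sparse_weight=True)(inputs)  # linear regression head

model = Model(inputs, outputs)
model.compile(optimizer='sgd', loss='mse')
model.fit(x, y, batch_size=1024, epochs=2)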

We benchmarked sparse tensor performance with a basic linear regression model on sparse synthetic data. The results show a significant boost in both speed and memory usage at large batch sizes (greater than 128 samples).

The following chart shows training performance on GPU with increasing batch sizes using sparse tensors:

Next, we see inference performance on CPU with increasing batch sizes.

For a more detailed overview, please see training and inference benchmark results with sparse data. More documentation on sparse usage in Keras-MXNet is available in the repository.

Full RNN Support

The new release adds the following RNN features to Keras-MXNet:

1. Variable-length input support: specifying the input length or padding inputs is no longer necessary.

2. RNNs without unrolling, which supports larger RNN layers with a smaller memory footprint.

3. RNN dropout.

Users can now use RNN layers without any changes, giving the same experience as the TensorFlow backend, as the sketch below shows. For details on when and how to use unrolling in Keras, please read our RNN documentation.
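For example, the following sketch builds an LSTM over variable-length sequences without unrolling and with RNN dropout; the layer sizes are illustrative assumptions.

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# The timestep dimension is None: no fixed input length or padding needed
model.add(LSTM(64,
               unroll=False,            # symbolic loop instead of unrolling
               dropout=0.2,             # RNN dropout, now supported
               input_shape=(None, 32))) # (timesteps, features)
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy')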

We implemented the above functionality using MXNet control flow operators, which were introduced in the latest pip package of MXNet. To learn more about them, take a look at our design documents: MXNet Control Flow Operators and Keras-MXNet RNN with control flow operators.

Sparse Categorical Cross Entropy

In addition, we have added support for two variants of categorical cross entropy; both can speed up loss calculation when training classifiers. The first is sparse categorical cross entropy, which is useful when your labels are mutually exclusive, i.e., each input belongs to exactly one class. Instead of one-hot encoding your training labels, you pass integer class labels directly with this loss, which gives around a 2X speedup.
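A minimal sketch of the first variant (the model size and data are illustrative assumptions):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(20,)))
model.add(Dense(1000, activation='softmax'))

# Labels are plain integer class ids, one per sample, not one-hot vectors
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

x = np.random.rand(256, 20).astype('float32')
y = np.random.randint(0, 1000, size=(256, 1))
model.fit(x, y, batch_size=32, epochs=1)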

The second is multi-hot sparse categorical cross entropy, which is suitable for multi-label, multi-class classification: for example, each input may belong to 3 to 5 classes out of 1,000 total. Please follow this example for usage; if you are interested in the implementation, see our design documentation.
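As a rough sketch of the second variant, assuming the loss is exposed as keras.losses.multi_hot_sparse_categorical_crossentropy (please check the example and design documentation linked above for the exact API):

import numpy as np
from keras import losses
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(20,)))
model.add(Dense(1000, activation='sigmoid'))  # multi-label output

# Each sample lists the ids of the classes it belongs to (here 5 of
# 1000 possible), instead of a mostly-zero multi-hot vector
x = np.random.rand(256, 20).astype('float32')
y = np.random.randint(0, 1000, size=(256, 5))

# Assumed loss name; see the linked design documentation
model.compile(optimizer='adam',
              loss=losses.multi_hot_sparse_categorical_crossentropy)
model.fit(x, y, batch_size=32, epochs=1)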

Getting Started with Keras-MXNet

Can’t wait to try out the new release? Check out the installation page for instructions on setting up Keras-MXNet on your dev machine. PyPI packages are available for Mac, Windows, and Linux.

All you have to do is run:

# Install MXNet for CPU machine
pip install mxnet-mkl
# Install MXNet for GPU machine with cuda 9.0
pip install mxnet-cu90mkl
# Install Keras-MXNet
pip install keras-mxnet

Note: The MXNet installation documentation lists all MXNet packages that can be used with Keras-MXNet.

To use the new RNN and loss features released with Keras-MXNet 2.2.4, install the latest MXNet package: pip install mxnet-mkl --pre

To learn more about Keras-MXNet, check out other related blog posts: deploying a Keras-MXNet model on MXNet Model Server, and training with Keras-MXNet and running inference with the MXNet Scala API.

If you like what you read — follow the Apache MXNet channel on Medium to be updated on the latest developments of the MXNet ecosystem. Keras-MXNet is an open source project and contributions to the repository are always welcome!

Thanks to: Wei Lai, Sandeep Krishnamurthy
