How Wiris Delivers Handwritten Math Recognition within MathType Using MXNet

Published in Apache MXNet on Mar 4, 2020

Our challenge: Building reliable and cross-platform deep learning solutions

Our Data Science team at Wiris builds deep learning solutions with MXNet under the hood. Our most notable solution recognizes handwritten math expressions within our equation editor, MathType.

A user wants to include Euler’s equation in a paper. They use the handwriting-based interface, and MathType provides real-time math recognition. After this step, the expression is ready to be exported to their desired format (PNG, SVG, LaTeX or MathML, among others).

MathType lets users compose any math expression and export it as high-quality graphics. It comes with two user input interfaces: a toolbar-based WYSIWYG editor and a canvas. This canvas integrates a handwritten math expression recognition model.

When we started developing the handwritten math recognizer, we identified some requirements. In short, our solution needed to:

Be reliable, stable and fast.
Run inference locally on smartphones (iOS and Android).
Be easily deployed everywhere (provide one binary executable without dependencies).
Run inference as a SaaS on different server flavors (Windows and Linux).

Why Apache MXNet?

Apache MXNet is a powerful open-source deep learning framework that provides the tools to define, build, train, and deploy neural networks. It is flexible thanks to the Gluon API, which greatly simplifies the process of creating deep learning models without sacrificing training speed.

MXNet makes it easy to prototype and validate neural network models. At the same time, when speed becomes more important than flexibility (e.g. at training time), Gluon can hybridize a model, caching its computation graph to achieve high performance with a reduced memory footprint.

Another key feature that made us lean towards MXNet is that models are easily deployed to different platforms. Thanks to the APIs across popular programming languages, we can train a model using Python and deploy it on smartphones, desktops and cloud servers using C/C++ and Java. And as an added extra, models have a very small memory footprint!

Training

To train the models used in these solutions, we have developed a high-level toolkit in Python that uses the Gluon API to address our particular problem. We work with “on-line” data, which in the handwriting field is jargon for a time series of (x, y) coordinates. Although Gluon data loaders didn’t cover our needs for this, it was quite easy to implement our own on-line data loader by extending the base data loader and following the defined API. Similarly, we defined and implemented data augmenters for our type of data.
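To make the idea of on-line data concrete, here is a minimal sketch of what such a dataset, augmenter and batching step could look like. It is not our actual toolkit: the names `OnlineInkDataset`, `RandomAffine` and `pad_batch` are hypothetical, and the code uses only the Python standard library. In a Gluon setting, the dataset class would subclass `mxnet.gluon.data.Dataset`, which relies on the same `__getitem__`/`__len__` protocol.

```python
import math
import random


class OnlineInkDataset:
    """Hypothetical dataset of on-line handwriting samples.

    Each sample is a (points, label) pair, where points is a list of
    (x, y) tuples in pen order.
    """

    def __init__(self, samples, transform=None):
        self._samples = list(samples)
        self._transform = transform

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, idx):
        points, label = self._samples[idx]
        if self._transform is not None:
            points = self._transform(points)
        return points, label


class RandomAffine:
    """Hypothetical augmenter: random rotation and uniform scaling about the origin."""

    def __init__(self, max_angle_deg=10.0, scale_range=(0.9, 1.1), seed=None):
        self._max_angle = math.radians(max_angle_deg)
        self._scale_range = scale_range
        self._rng = random.Random(seed)

    def __call__(self, points):
        angle = self._rng.uniform(-self._max_angle, self._max_angle)
        scale = self._rng.uniform(*self._scale_range)
        a, b = math.cos(angle) * scale, math.sin(angle) * scale
        return [(x * a - y * b, x * b + y * a) for x, y in points]


def pad_batch(batch, pad_value=(0.0, 0.0)):
    """Pad variable-length point sequences to the longest one in the batch.

    Returns (padded_points, original_lengths, labels).
    """
    lengths = [len(points) for points, _ in batch]
    max_len = max(lengths)
    padded = [list(points) + [pad_value] * (max_len - len(points))
              for points, _ in batch]
    labels = [label for _, label in batch]
    return padded, lengths, labels


# Two toy samples: a horizontal stroke and a vertical stroke.
dataset = OnlineInkDataset(
    [([(1.0, 0.0), (2.0, 0.0)], "-"),
     ([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)], "1")],
    transform=RandomAffine(seed=0),
)
padded, lengths, labels = pad_batch([dataset[i] for i in range(len(dataset))])
```

Keeping the original lengths next to the padded sequences lets a sequence model mask out the padding later.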

Finally, back when we started developing our toolkit, training a model in Gluon required users to write the training loop, calculate metrics, evaluate results on validation data, implement an early-stopping criterion, etc. This is now covered by Gluon Fit. Our toolkit also handles model hybridization and exporting, and it simplifies the training and validation loops.
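As an illustration of the kind of boilerplate involved, the following is a minimal, framework-agnostic sketch of a training loop with an early-stopping criterion. The `EarlyStopping` class and the `fit` function are hypothetical names rather than our toolkit's API. In a Gluon setting, the `train_one_epoch` callable would wrap the forward and backward passes and the `Trainer.step` call.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.best_epoch = -1
        self._bad_epochs = 0

    def update(self, epoch, val_loss):
        """Record this epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.best_epoch = epoch
            self._bad_epochs = 0
        else:
            self._bad_epochs += 1
        return self._bad_epochs >= self.patience


def fit(train_one_epoch, validate, max_epochs, stopper):
    """Train, validate, and stop early; returns the validation losses seen."""
    history = []
    for epoch in range(max_epochs):
        train_one_epoch(epoch)
        val_loss = validate(epoch)
        history.append(val_loss)
        if stopper.update(epoch, val_loss):
            break
    return history


# Toy run with made-up validation losses instead of a real model.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]
stopper = EarlyStopping(patience=3)
history = fit(lambda epoch: None, lambda epoch: losses[epoch], len(losses), stopper)
print(len(history), stopper.best_epoch, stopper.best)  # prints: 6 2 0.7
```

In the toy run, training stops after three epochs without improvement, so the last loss value is never reached. A real loop would typically also save the parameters from the best epoch.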

Deployment

At the moment, we deploy our models using the MXNet C API. We do this by using the amalgamation generation script to obtain a single source code file that can be statically linked when compiling our inference program. However, obtaining the amalgamation isn’t straightforward.

Currently evaluating: MMS and Apache TVM

What is MMS?

Multi Model Server (MMS) is an open-source, flexible and easy-to-use tool for serving deep learning models. It is production-ready and allows you to easily load deep learning models. It provides an API endpoint per model to run inference and retrieve the results.

Why are we trying MMS?

Our data science team is small and, as we mentioned earlier, we struggle with maintaining complicated production stacks. MMS provides us with an end-to-end solution to serve our deep learning models as a SaaS. We only need to package the inference code with the model and deploy it.
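To give an idea of what "packaging the inference code" means, here is a minimal sketch of an MMS custom service module, following the module-level `handle(data, context)` entry point described in the MMS documentation. Everything specific to handwriting is an assumption made for illustration: the `HandwritingService` name, the JSON payload schema with a `strokes` key, and the placeholder model, which only counts strokes instead of running a real MXNet network.

```python
import json


class HandwritingService:
    """Sketch of an MMS custom service; the model is a stand-in."""

    def __init__(self):
        self.initialized = False
        self.model = None

    def initialize(self, context):
        # A real service would load the exported MXNet model here, using the
        # model directory that MMS exposes through `context`.
        self.model = lambda strokes: "<{} strokes>".format(len(strokes))  # placeholder
        self.initialized = True

    def preprocess(self, batch):
        # Each request in the batch carries its payload under "data" or "body".
        inputs = []
        for request in batch:
            payload = request.get("data") or request.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            if isinstance(payload, str):
                payload = json.loads(payload)
            inputs.append(payload["strokes"])  # hypothetical schema
        return inputs

    def handle(self, batch, context):
        strokes_batch = self.preprocess(batch)
        # Return exactly one response per request in the batch.
        return [{"result": self.model(strokes)} for strokes in strokes_batch]


_service = HandwritingService()


def handle(data, context):
    """Entry point: initialize lazily, then process the batch."""
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)


# Local smoke test, without a running MMS server.
request = [{"body": json.dumps({"strokes": [[[0, 0], [1, 1]], [[2, 2]]]}).encode("utf-8")}]
response = handle(request, None)
```

Because the entry point is a plain function, the pre- and post-processing logic can be exercised locally like this before the module is packaged together with the model.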

Preliminary testing shows that it’s faster than our current deployment. Additionally, MMS’s built-in server logging is quite powerful, making error tracking much easier.

What is Apache TVM?

TVM is an open-source deep learning compiler stack for CPUs, GPUs, and specialized hardware. Its purpose is to close the gap between productivity-focused deep learning frameworks and performance- or efficiency-oriented hardware backends.

Why are we trying Apache TVM?

TVM offers an interesting value proposition for us. First, it compiles deep learning models developed in MXNet into minimal deployable bundles optimized for our diverse deployment platforms. Second, there are many testimonials to the efficiency gains of using TVM over a vanilla MXNet deployment. According to these reports, deployment with TVM could boost performance by a factor of 2 on average.

However, our impression is that TVM support at the moment is not mature enough. We hope that in the future TVM can greatly simplify our deployment, helping provide consistent results across different backends and devices.

Final thoughts

We started working with MXNet in version 0.9 about 3 years ago, before the Gluon API existed. Since porting the early stages of our toolkit to Gluon, we have been quite happy with MXNet’s stability. We have always been excited to hear about a new MXNet update, given that each major version has brought tangible performance improvements and quality-of-life features, like the swift implementation of state-of-the-art blocks. And, unlike other deep learning frameworks, MXNet has always preserved compatibility.

Overall, quite a pleasant experience. Keep up the good work, MXNet team!

Wiris is a software company that develops B2B and B2C solutions for the Educational Technology Industry (EdTech). Our main goal is to offer advanced calculation and presentation tools for maths education with emphasis on internet technology solutions. The company was founded in 1999 by teachers and former students from Universitat Politècnica de Catalunya (UPC) to commercially exploit a research project that developed a next generation computer algebra system. As of the writing of this post, the company has about 50 employees.

The products currently commercialized by Wiris are: MathType (equation editor), Wiris Quizzes (automatic assessment tool of math questionnaires) and CalcMe (online calculator powered by a computer algebra system and a graph engine).