Let your users do your backend job

Browsers are far more powerful than they were a few years ago. Tasks that we used to perform server-side can now be “outsourced” to end users, saving storage, bandwidth and precious electrons on the server side.

Maciej Brencz
Legacy Systems Diary
3 min read · May 8, 2018


At Wikia we’re currently migrating our old Apache / PHP setup into Docker containers that can run on Kubernetes. As we go through a myriad of extensions and old features, we keep stumbling upon interesting things.

How we used to render mathematical formulas

One of the features that we support is the “math” tag used to render various mathematical formulas expressed in LaTeX. They can vary from 2+2 to complex integrals and matrix operations.

Integrals rendered as a PNG

We were using the texvc binary running on the server to render each user-provided formula to a transparent PNG file. The image was then fetched by the browser, consuming bandwidth and disk space on our internal file storage cluster.

To avoid rendering the same formula repeatedly, the MediaWiki extension relied on a per-wiki “math” table that kept a reference to the rendered PNG, keyed by a hash of the LaTeX formula source. There was no garbage collection mechanism: formulas rendered five years ago (and most likely no longer used) are still there. The “math” tables across all wikis weigh over 50 GiB.
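To make the caching scheme concrete, here is a minimal Python sketch of a hash-keyed render table. It is not the extension's actual code: the `RenderCache` class, the `formula_key` helper and the choice of SHA-256 are assumptions made for illustration, and the renderer passed in stands in for texvc.

```python
import hashlib


def formula_key(latex_source: str) -> str:
    """Hash of the LaTeX source, used as the lookup key (SHA-256 is an assumption)."""
    return hashlib.sha256(latex_source.encode("utf-8")).hexdigest()


class RenderCache:
    """Hypothetical stand-in for the per-wiki "math" table."""

    def __init__(self, render):
        self._render = render  # callable that turns LaTeX into a PNG reference
        self._table = {}       # hash -> PNG reference; nothing is ever evicted

    def get(self, latex_source: str) -> str:
        key = formula_key(latex_source)
        if key not in self._table:
            # Cache miss: render once and remember the result forever.
            self._table[key] = self._render(latex_source)
        return self._table[key]

    def __len__(self) -> int:
        return len(self._table)
```

Because entries are only ever added, the table grows with every distinct formula that was ever rendered, which is how a table like this can reach tens of gigabytes over the years.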

texvc in a Docker container?

Apache nodes that handle web requests have the texvc Ubuntu package installed. It's quite small and simple. However, its dependencies (including imagemagick and ghostscript) are not. In fact, when we added “apt-get install texvc” to a Dockerfile, over 900 MB of packages were fetched and the Docker image size went up from 455 MB to over 2.3 GB!

Building and deploying such a huge Docker image is not the best idea. However, we had to keep this feature up and running while keeping the Docker image at a reasonable size.

Rendering LaTeX client-side

Then we found that the MediaWiki Math extension supports the MathJax JavaScript library, which does the same job as texvc but in a completely different way: it renders LaTeX formulas in the browser. All it needs is the LaTeX source inside an HTML node.
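As a rough illustration of what the server is left with, the sketch below emits a formula as escaped LaTeX inside an HTML node and leaves the rendering to the browser. It is written in Python for brevity and is not the Math extension's code: the function name and CSS class names are made up, and it assumes MathJax is loaded on the page with its default `\(...\)` and `\[...\]` delimiters.

```python
import html


def latex_to_html_node(formula: str, display: bool = False) -> str:
    """Wrap LaTeX source in a span for MathJax to pick up in the browser.

    Hypothetical helper: the class names are illustrative, and the default
    MathJax delimiters are assumed.
    """
    open_d, close_d = (r"\[", r"\]") if display else (r"\(", r"\)")
    css_class = "tex-display" if display else "tex-inline"
    # Escape &, < and > so the formula cannot break the surrounding markup.
    body = html.escape(formula.strip(), quote=False)
    return f'<span class="{css_class}">{open_d}{body}{close_d}</span>'


print(latex_to_html_node(r"\int_0^1 x^2 \, dx"))
```

The server's work shrinks to a string escape: no binary to run, no image to store, and no table to look up.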

As MathJax uses HTML (or SVG) to render formulas, they look waaaay better than the transparent PNG files, which looked ugly on wikis with a dark background and did not scale well.

Nice-looking integrals rendered on the client side, as HTML or as SVG (it's up to the user to decide). Who said that mathematics is not pretty ;)

Conclusions

By moving the rendering of LaTeX formulas from the backend to end users' browsers, we removed quite heavy dependencies that were installed on Apache nodes and would otherwise need to be part of the Docker image.

  • we removed database tables keeping references to formulas rendered on the server side: 52.8 GiB and 230k queries a day saved
  • over 150k images with rendered formulas can now be removed (we estimate that they consume ~400 MiB of storage space)
  • LaTeX in plain text weighs a bit less than a PNG file :)
  • and it looks way better when rendered as a vector graphic


Maciej Brencz
Legacy Systems Diary

A Poznań native for generations, passionate about his home city and the Far North / Enjoys investigating how software works under the hood