File conversion microservice

Scott Batary
Aug 31, 2017


File conversion can be a pain, especially when you’re trying to automate it. That’s why I created Versed, a microservice built specifically for that purpose.

Versed exposes a web API for converting files, and also comes with a simple web frontend for manual file conversion.

It’s currently powered by LibreOffice and FFmpeg, which means it supports the same file formats that those tools support, but you can easily add more tools to its arsenal.

Some example formats:

txt rtf doc docx ppt pptx xls xlsx csv html pdf 
latex bib bmp svg eps tiff png jpg gif apng
mp3 ogg mp4 avi webm mkv mov flv

Try converting an mp4 file to a gif or apng, or generating a thumbnail from a PowerPoint, Word, or PDF file.

Getting started

You can get the microservice running with just a few commands. Because the service depends on LibreOffice and FFmpeg, it’s easiest to develop and run the application in a Docker container that has those dependencies built in. The repo includes a Dockerfile for this. Just run the following commands:

git clone https://github.com/sgbj/versed.git 
cd versed
docker build -t versed .
docker run -d -p 3000:3000 versed

Then open a browser window and go to http://localhost:3000/.

How it works

The microservice is built with Node.js and Express.js. It exposes a single web API endpoint at http://localhost:3000/convert. That endpoint expects a multipart/form-data request with two fields: file (the file to convert) and format (the target format). If the conversion succeeds, the endpoint returns the converted file; otherwise it returns a 500 status code.
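As a rough illustration, a client for this endpoint might look like the sketch below. The file and format field names come from the description above. Everything else is an assumption: it assumes the request is a POST, that the service is running locally on port 3000, and that Node 18 or later is available for the global fetch, FormData, and Blob. The helper names buildConvertForm and convertFile are made up for this example.

```javascript
// Sketch of a client for the /convert endpoint (not part of Versed itself).
// Assumes Node 18+ for the global fetch, FormData, and Blob.
const fs = require('node:fs/promises');
const path = require('node:path');

const CONVERT_URL = 'http://localhost:3000/convert';

// Build the multipart/form-data body with the two fields the endpoint expects.
function buildConvertForm(bytes, filename, format) {
  const form = new FormData();
  form.append('file', new Blob([bytes]), filename);
  form.append('format', format);
  return form;
}

// Hypothetical helper: upload a file and save the converted result.
// Assumption: the endpoint accepts POST requests.
async function convertFile(inputPath, format, outputPath) {
  const bytes = await fs.readFile(inputPath);
  const form = buildConvertForm(bytes, path.basename(inputPath), format);
  // fetch sets the multipart Content-Type header (including the boundary) itself.
  const res = await fetch(CONVERT_URL, { method: 'POST', body: form });
  if (!res.ok) {
    // The article says a failed conversion comes back as a 500.
    throw new Error(`Conversion failed with status ${res.status}`);
  }
  await fs.writeFile(outputPath, Buffer.from(await res.arrayBuffer()));
}
```

With the container running, calling convertFile('clip.mp4', 'gif', 'clip.gif') would upload the video and write the response to clip.gif.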

The conversion itself is performed by middleware. Currently there’s a middleware for LibreOffice’s soffice CLI (for documents, spreadsheets, and graphics) and one for FFmpeg’s ffmpeg CLI (for audio and video). Each middleware is responsible for determining whether it can convert the file and for calling the next middleware in the pipeline.

On application startup, every middleware in the middleware folder is added to the pipeline for processing.
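Loading a folder of modules at startup can be done in a few lines of Node.js. The sketch below is one way to do it, not necessarily how Versed does it: the loadMiddleware name, the .js filter, and the alphabetical ordering are all assumptions made for illustration.

```javascript
// Sketch: collect every module in a folder into an ordered pipeline array.
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper. Sorting by filename is an assumption that keeps the
// pipeline order predictable across platforms.
function loadMiddleware(dir) {
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.js'))
    .sort()
    .map((name) => require(path.resolve(dir, name)));
}
```

Adding a new converter then comes down to dropping another module into that folder, for example by calling loadMiddleware(path.join(__dirname, 'middleware')) once when the app starts.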

When the API receives a request at the convert endpoint, it gathers some information about the file, such as its name and MIME type. It then runs the pipeline to determine which middleware can process the file. That middleware attempts the conversion, and the result is used to generate the API response. Supporting even more file formats is a matter of adding more middleware and tools.
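The request flow described above can be sketched as a small pipeline runner. This is a simplified illustration, not the project's actual code: the shape of the context object, the (context, next) signature, the format lists, and the stubbed conversions are all assumptions. A real middleware would invoke the soffice or ffmpeg CLI where the stubs set context.output.

```javascript
// Sketch of a middleware pipeline. Each middleware either handles the file
// or calls next() to pass it along. All names here are illustrative.
const documentFormats = ['pdf', 'docx', 'png'];
const mediaFormats = ['mp3', 'mp4', 'gif'];

async function sofficeMiddleware(context, next) {
  if (!documentFormats.includes(context.format)) return next();
  // Stub: a real middleware would run the soffice CLI here.
  context.output = { convertedBy: 'soffice', format: context.format };
}

async function ffmpegMiddleware(context, next) {
  if (!mediaFormats.includes(context.format)) return next();
  // Stub: a real middleware would run the ffmpeg CLI here.
  context.output = { convertedBy: 'ffmpeg', format: context.format };
}

// Walk the middleware in order until one of them handles the file.
async function runPipeline(middleware, context) {
  let index = 0;
  const next = async () => {
    const current = middleware[index++];
    if (!current) {
      throw new Error(`No middleware can convert to ${context.format}`);
    }
    return current(context, next);
  };
  await next();
  return context.output;
}
```

In an Express route handler, a rejected runPipeline promise would map naturally onto the 500 response, and a resolved one onto sending back the converted file.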

Check out the source code for this project on GitHub!
