A minimal, production-ready, NN-based recommender system
Some time ago I wrote about the surprisingly good performance of a really simple autoencoder-based recommender built with Keras, as compared to a classic IBCF algorithm. After obtaining this result I thought it would be interesting to create a fully automated recommender system with a minimal set of functions for gathering data, creating a model, serving the model via an API, and deploying all of this easily. The resulting system is built with Docker and can be deployed in no time, although it does lack some important features such as a control user interface (frontend), a model registry, authentication, concurrent request handling, and probably many more. The project is open to anyone interested in using and/or contributing to it.
The architecture is really simple and essentially consists of three Docker images (db-engine, train-engine and model-engine) and a shared volume to store all the data. Images are built with GitLab CI/CD pipelines on each merge to master, so deployment to any Linux/Unix-like machine is a breeze.
The benefit of having three separate images is that changes can be made independently to different parts of the system. For example, if the bundled SQLite is not enough at some point, it can easily be swapped for any other database engine, as long as the same API format is preserved.
Data in the format {"items":["beer","diapers","chips"], "user":"father"}
is gathered one by one, or in batches, via the recordInteraction endpoint of db-engine and stored in a database table (the database is created on container startup if not yet available). There are a couple of stats endpoints, such as listUserCount and listItemCount, to quickly get a sense of what is happening at the database level, as well as listItems, which is used by train-engine to fetch the data for model training.
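As a sketch, recording a single interaction could look like this from a Python client. The payload shape and endpoint name come from the description above; the host/port match the docker-compose mapping below (db-engine published on 8001), while the exact request mechanics are an assumption:

```python
import json

# Build the interaction payload in the format recordInteraction expects.
interaction = {"items": ["beer", "diapers", "chips"], "user": "father"}
payload = json.dumps(interaction).encode("utf-8")

# Sending it with only the standard library (requests would work equally well):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8001/recordInteraction",
#     data=payload,
#     headers={"Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)
```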
train-engine itself is a Docker container that trains a model on a schedule or by API command and saves it to disk. There is an option to load an existing model and continue training, which would speed up the training process, but it is disabled by default. train-engine requests data in batches and trains until the loss function stops improving on the test set, so in theory it can handle arbitrarily large data.
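The stopping rule can be sketched in plain Python. The loss values here are stand-ins for illustration; the real train-engine computes them on the held-out test set, and the patience parameter is an assumed detail:

```python
def train_until_converged(losses, patience=3):
    """Return the epoch index at which training would stop: keep going while
    the test loss improves, stop after `patience` epochs without improvement."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, stale = loss, 0  # new best loss: reset the stale counter
        else:
            stale += 1
            if stale >= patience:
                return epoch       # loss has plateaued, stop here
    return len(losses) - 1

# Example: the loss plateaus after epoch 2, so training stops 3 stale
# epochs later, at epoch 5.
stop_epoch = train_until_converged([0.9, 0.5, 0.4, 0.41, 0.42, 0.43, 0.5])
```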
Once the model has been trained and saved to disk, it can be picked up by model-engine, which serves it via the proposeInteraction endpoint, format: {"items": ["wine","cheese","olives"], "k":3}
where k is the number of proposals to return given the existing items in the items list. Besides serving predictions, the endpoint can be used to assess the performance of the model by masking an item from the itemset and making a prediction on the remaining items, with the goal of recovering the masked item.
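The masked-item check described above can be sketched as follows. The request format comes from the description; the name of the response field holding the proposals is an assumption:

```python
import json

# Hide one item from a known basket, ask model-engine for k proposals on
# the rest, and count it a hit if the masked item comes back.
basket = ["wine", "cheese", "olives"]
masked = basket[0]                            # the item we try to recover
request_body = {"items": basket[1:], "k": 3}  # proposeInteraction format
payload = json.dumps(request_body)

# POST payload to http://localhost:8003/proposeInteraction (model-engine),
# then check: hit = masked in response["items"]   # hypothetical field name
```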
Putting it all together in a docker-compose file in order to deploy:
version: '3'
services:
  db-engine:
    image: registry.gitlab.com/airecommend/db-engine
    volumes:
      - air:/data
    ports:
      - "8001:8000"
  train-engine:
    image: registry.gitlab.com/airecommend/train-engine
    volumes:
      - air:/data
    ports:
      - "8002:8000"
  model-engine:
    image: registry.gitlab.com/airecommend/model-engine
    volumes:
      - air:/data
    ports:
      - "8003:8000"
volumes:
  air:
This brings up all three containers, exposing APIs to gather data, make predictions and train models.
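Assuming the compose file above is saved as docker-compose.yml, deployment and a quick smoke test boil down to a few commands (the curl targets use the published ports and the stats endpoints mentioned earlier):

```shell
docker-compose up -d                        # start db-engine, train-engine, model-engine
docker-compose ps                           # verify all three containers are running
curl http://localhost:8001/listUserCount    # smoke-test the db-engine stats API
curl http://localhost:8001/listItemCount
```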
Each service has its own Swagger page listing the available endpoints, with the ability to quickly look up the syntax and try out requests.