Deciphering recipes

Irene Iriarte Carretero
Gousto Engineering & Data
4 min read · May 18, 2017


Last month we got the chance to present some of the work we have been doing at a neo4j meetup. In the lightning talk, I explained how neo4j has been helping us with our automatic menu planning process, and in particular the tool we use to predict similarity between different recipes. You can see the slides at the bottom!

Our weekly menus are a vital part of our business and we need to make sure we are offering customers balanced menus which give them the choice to pick recipes they really want. Therefore, a lot of thought goes into planning these menus, which need to fulfil several operational constraints and offer a range of ingredients and cuisines. With 22 recipes currently on the menu, planning was a time-consuming task and the process was also not as data-driven as it could be!

The solution proposed by the Data team was an automated menu planning algorithm, which consists of:

  • Graph Database: a database containing all our knowledge of the recipes and ingredients and how they are connected.
  • Genetic Algorithm: a type of algorithm used for multivariate optimisation, which follows the same idea as biological evolution (mutation and cross-over between candidate solutions). The algorithm ensures that any final solution fulfils all the constraints. In our case, we wanted to make sure we provided a variety of different recipes, so our genetic algorithm was set to minimise total recipe similarity.
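To make the idea concrete, here is a minimal sketch of a genetic algorithm minimising total pairwise similarity across a menu. Everything in it — the recipe names, the toy word-overlap similarity measure, the population sizes — is invented for illustration; the real system scores similarity from the graph database and enforces operational constraints as well.

```python
import random
from itertools import combinations

# Toy recipe pool -- in the real system this comes from the graph database.
RECIPES = ["chicken curry", "chicken pie", "beef chilli",
           "veg lasagne", "fish tacos", "pork ramen"]

def similarity(a, b):
    # Toy stand-in: word overlap (Jaccard) between recipe names.
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def menu_cost(menu):
    # Total pairwise similarity -- the quantity the GA minimises.
    return sum(similarity(a, b) for a, b in combinations(menu, 2))

def evolve_menu(pool, menu_size=3, pop_size=20, generations=40, seed=1):
    rng = random.Random(seed)
    # Start from random candidate menus.
    population = [rng.sample(pool, menu_size) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=menu_cost)            # fittest (lowest cost) first
        survivors = population[: pop_size // 2]   # elitism: keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Cross-over: the child draws its recipes from both parents.
            child = rng.sample(sorted(set(a) | set(b)), menu_size)
            # Mutation: occasionally swap in a recipe from the wider pool.
            if rng.random() < 0.3:
                child[rng.randrange(menu_size)] = rng.choice(
                    [r for r in pool if r not in child])
            children.append(child)
        population = survivors + children
    return min(population, key=menu_cost)

best = evolve_menu(RECIPES)
```

Because the best candidates survive each generation unchanged, the cost of the winning menu can only go down over time, which is what makes this simple scheme converge.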

Putting a number on the similarity between two recipes is actually quite complicated, due to the subjective nature of the matter. While simply counting the ingredients two recipes have in common would give us a simple approach, we feel it does not capture true similarity: many other things play a part, such as presentation, cooking techniques and cuisine. If you look at the presentation below, this is the point I am trying to convey in the chicken slides!
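Here is a quick illustration of why ingredient counting alone falls short. The recipes and ingredient lists are invented; the Jaccard index below is one common way to turn "ingredients in common" into a score.

```python
def jaccard(a, b):
    # Fraction of ingredients shared, out of all ingredients used.
    return len(a & b) / len(a | b)

roast_chicken    = {"chicken", "garlic", "lemon", "thyme"}
chicken_stir_fry = {"chicken", "garlic", "soy sauce", "ginger"}
lemon_sole       = {"sole", "lemon", "butter", "thyme", "garlic"}

# By ingredient overlap alone, roast chicken looks MORE similar to
# lemon sole (3 shared of 6 total = 0.5) than to the chicken stir fry
# (2 shared of 6 total = ~0.33) -- yet protein, cuisine and cooking
# technique all say otherwise.
print(jaccard(roast_chicken, lemon_sole))
print(jaccard(roast_chicken, chicken_stir_fry))
```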

The graph database allowed us to look at recipes and ingredients from these different points of view and gave us:

  • Flexibility: a flexible structure was key to being able to describe ingredients and recipes. For example, it is possible to give an ingredient the property “spicy” without having to worry about other ingredients not having this particular property. We also loved being able to assign properties to the relations themselves.
  • Easy way to create inferences: with explicit connections between the data, it is easy as pie to write a script which automatically gives a recipe the property of “not suitable for little people” if it contains any ingredient with the aforementioned “spicy” property.
  • Speed: a graph database only looks at connected entities, so there is no need to search through all the data! This significantly speeds things up compared with a relational database.
  • Cypher: the easy and intuitive language to query the graph meant we could analyse the data quickly and efficiently.
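The "spicy" inference above can be mimicked in a few lines. This is a plain-Python sketch with invented ingredient and recipe names and a made-up dict schema — the real rule runs as a script against the graph, where the explicit connections make it a one-hop traversal.

```python
# Illustrative ingredient properties -- in the graph these live as
# node properties, added flexibly per ingredient.
INGREDIENT_PROPS = {
    "chilli flakes": {"spicy"},
    "chicken": set(),
    "coconut milk": set(),
    "pasta": set(),
    "cream": set(),
}

# Illustrative recipe -> ingredient connections.
RECIPE_INGREDIENTS = {
    "thai red curry": ["chicken", "coconut milk", "chilli flakes"],
    "chicken alfredo": ["chicken", "pasta", "cream"],
}

def inferred_props(recipe):
    # If any ingredient is spicy, flag the whole recipe.
    props = set()
    if any("spicy" in INGREDIENT_PROPS[i]
           for i in RECIPE_INGREDIENTS[recipe]):
        props.add("not suitable for little people")
    return props
```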

Once the graph database was set up, we calculated similarities between recipes by counting paths between them and assigning weights to the different recipe attributes. To get some training data, we set up a bot on Slack which every so often asked colleagues to rate the similarity between certain recipes (we love our bots at Gousto!). With this data, we then benchmarked results from our algorithm against human responses, which helped us capture people’s different points of view.
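The weighted path-counting idea can be sketched like this: every attribute node two recipes share (an ingredient, a cuisine, a protein) is a short path between them in the graph, and each attribute type carries its own weight. The weights, attribute types and recipe data below are all made up for illustration — the real weights were tuned against the Slack ratings.

```python
# Hypothetical weights per attribute type -- tuned against human
# ratings in the real system.
WEIGHTS = {"ingredient": 1.0, "cuisine": 2.0, "protein": 3.0}

# Each recipe's attribute nodes, as (type, value) pairs.
RECIPE_ATTRS = {
    "chicken katsu": {("protein", "chicken"), ("cuisine", "japanese"),
                      ("ingredient", "panko"), ("ingredient", "rice")},
    "chicken korma": {("protein", "chicken"), ("cuisine", "indian"),
                      ("ingredient", "rice"), ("ingredient", "cream")},
    "veg katsu":     {("cuisine", "japanese"), ("ingredient", "panko"),
                      ("ingredient", "rice")},
}

def path_similarity(a, b):
    # Each shared attribute node is one recipe-attribute-recipe path;
    # sum the weight of each path's attribute type.
    shared = RECIPE_ATTRS[a] & RECIPE_ATTRS[b]
    return sum(WEIGHTS[kind] for kind, _ in shared)
```

For example, "chicken katsu" and "chicken korma" share a protein and an ingredient (weight 3 + 1 = 4), while "chicken korma" and "veg katsu" share only rice (weight 1).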

While we are currently in the process of fully implementing our menu planning algorithm, we are already thinking about how else we can leverage our graph database. The first and most obvious case is that it will be the basis of our recommendation engine, which will require us to add a host of additional data such as customers’ taste scores. The graph database will also help our customers do much more advanced searches on our recipe library, allowing them to find exactly the recipes they want!

All in all, presenting at the meetup was a great experience. As well as receiving interesting feedback and ideas, we got to see more presentations and learn how other companies are using graph databases.


In return for giving the talk, neo4j kindly provided us with tickets for GraphConnect which we attended last Thursday. We had a great day listening to all the different talks (the hardest bit was choosing which ones to attend!) and came back to the office inspired by all the possibilities that graph databases offer. We got some useful tips about recommendation engines and about how to further integrate the graph database into our platform and we can’t wait to get stuck in. Thanks again for the opportunity to be a part of it all. Most importantly, we discovered that we contributed to mankind getting to Mars and if that’s not something to be excited about then I don’t know what is!


If you are interested in having a look at the actual slides, they are attached below!

Data Scientist

Originally published on May 18, 2017.