What is in our food?
Through the power of computers we have the answer! But first, some backstory…
At Foodsaurus we’re building an app to help travellers translate food ingredients on the fly. Our aim: to create a reference guide that can be used easily whilst holding a food packet in your other hand.
So how did we plan to do this? Well, we want people to spend as little time in the app as possible, which means the most likely ingredients in a product should appear at the top of the list.
We began by separating food products into categories similar to supermarket aisles. This helps us filter the ingredients depending on what sort of product you’re holding.
We then set up a process to screen-scrape the website of Ocado (an online food shop) and apply a weighting system. We chose Ocado because it has the best metadata of the online shopping sites: product listings contain all the ingredients and any suitability for special diets.
What followed was a semi-laborious process of data rationalisation to split each list of ingredients into its component parts. Then, with a clever algorithm, we worked out the top 500 ingredients in each category and overall.
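To give a flavour of the splitting and counting step, here is a minimal Python sketch. It is illustrative only and not the actual Foodsaurus pipeline: the function names are hypothetical, the input is assumed to be pairs of (category, ingredients text), and a plain count of how many products contain each ingredient stands in for the weighting system, which isn't described here.

```python
from collections import Counter, defaultdict


def split_ingredients(text):
    """Split on commas, but keep bracketed sub-lists attached to their parent."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    # Trim spaces and trailing full stops, lowercase, and drop empty entries.
    cleaned = [p.strip(" .").lower() for p in parts]
    return [p for p in cleaned if p]


def base_name(ingredient):
    """Drop a bracketed qualifier: 'flavouring (vanilla)' becomes 'flavouring'."""
    return ingredient.split("(", 1)[0].strip()


def top_ingredients(products, n=500):
    """Count each ingredient once per product (an assumed, simplified weighting).

    products: iterable of (category, ingredients_text) pairs.
    Returns (top n per category, top n overall) as lists of (name, count).
    """
    per_category = defaultdict(Counter)
    overall = Counter()
    for category, text in products:
        names = {base_name(i) for i in split_ingredients(text)}
        names.discard("")
        per_category[category].update(names)
        overall.update(names)
    ranked = {c: counts.most_common(n) for c, counts in per_category.items()}
    return ranked, overall.most_common(n)


# Made-up example products, not real scraped data.
products = [
    ("biscuits", "Wheat Flour, Sugar, Flavouring (Vanilla, Lemon), Salt."),
    ("biscuits", "Wheat flour, Rapeseed Oil, Emulsifier (Soya Lecithin)"),
    ("drinks", "Water, Sugar, Citric Acid"),
]
by_category, overall = top_ingredients(products, n=10)
```

Tracking the bracket depth is what stops "Flavouring (Vanilla, Lemon)" being split into two entries at the inner comma.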
With pride I present the top 10 ingredients in processed food…
- Flavouring (an interesting one: it is usually followed by the actual flavourings in brackets, so we may have to rationalise it further)
- Wheat flour
- Emulsifier (again, this is sometimes, but not always, followed by what the emulsifier actually is, so we may need to rationalise this as well)
- Vitamin B
- Citric acid
- Rapeseed oil
There is still some work to do to rationalise further. For example, should sea salt and salt be separate entries or merged? There is also a distinct lack of standards for how an ingredient should be presented, so we will need to comb through and remove any synonyms that we don’t feel are useful.
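One simple way to handle this kind of merging is a hand-maintained synonym table applied to the counts. The Python sketch below assumes that approach. The table entries, the function names and the example numbers are all hypothetical, and whether sea salt really should fold into salt is exactly the open question above.

```python
from collections import Counter

# Hypothetical synonym table: variant spelling or name -> canonical name.
SYNONYMS = {
    "sea salt": "salt",
    "wheatflour": "wheat flour",
}


def canonical(name, synonyms=SYNONYMS):
    """Lowercase, collapse repeated whitespace, then map through the table."""
    key = " ".join(name.lower().split())
    return synonyms.get(key, key)


def merge_counts(counts, synonyms=SYNONYMS):
    """Fold the counts of each synonym into its canonical ingredient."""
    merged = Counter()
    for name, count in counts.items():
        merged[canonical(name, synonyms)] += count
    return merged


# Made-up counts for illustration only.
merged = merge_counts({"Salt": 10, "sea  salt": 3, "Sugar": 5})
```

Keeping the table as plain data means a decision such as "sea salt is its own ingredient" is a one-line change rather than a code change.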
We are then sourcing all the translations, initially from Google, but we will then need people on the ground to confirm that they have been translated correctly.
We also plan to try out the app further and get people testing it to make sure that the design and layout are easy to use on the fly in the shop.
And next next?
This is obviously something that would benefit from as much automation as possible, and that can learn more given more data. That means possibly adding more sites, more products and more rationalisation around ingredients.