Exploring Recipes using Network Analysis

Published in

INST414: Data Science Techniques

May 2, 2024

Introduction:

There are countless recipes to try from regions and cultures around the world, and each one has variations that make it unique. Luckily, most of them are at your fingertips: millions of recipes are available online for people to make on their own. In this post I use network analysis to answer one question: which ingredients are used most often in recipes? Grocery store owners might ask this question, because knowing which ingredients are common across all types of cuisines would help them decide how to stock their stores. People who cook at home might ask it too. Many of them prepare food in bulk at the beginning of the week, and knowing which ingredients can be used in multiple recipes can help them add variety to their diets and save money at the grocery store.

Data:

The data for this network analysis was obtained from the Spoonacular API. Fields in this dataset include recipe title, cuisine, diet, equipment, ingredients, pantry items, various nutritional facts (sugar, sodium, fiber, etc.), and more. The main fields I used are ingredient names and recipe titles, and I collected this subset using the requests library. Although the API includes thousands of recipes, I selected 75 random ones because I did not want the network to be so heavily populated with nodes that it would be hard to visualize.
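To make the collection step concrete, here is a minimal sketch of how the recipe titles and ingredient names could be pulled out of an API response. The endpoint URL, the `apiKey` and `number` parameters, and the `recipes` / `extendedIngredients` / `name` field names are my assumptions about the Spoonacular random-recipes response, and the function names are made up for illustration, so check them against the Spoonacular documentation and the linked notebook before relying on them.

```python
# Assumed endpoint; verify against the Spoonacular documentation.
RANDOM_RECIPES_URL = "https://api.spoonacular.com/recipes/random"


def fetch_random_recipes(api_key, number=75):
    """Request `number` random recipes and return the decoded JSON payload."""
    import requests  # imported here so the parsing helper works without it

    response = requests.get(
        RANDOM_RECIPES_URL,
        params={"apiKey": api_key, "number": number},  # assumed parameter names
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def extract_ingredients(payload):
    """Map each recipe title to a sorted list of its unique ingredient names."""
    recipes = {}
    for recipe in payload.get("recipes", []):  # assumed top-level key
        names = {
            item["name"].strip().lower()
            for item in recipe.get("extendedIngredients", [])  # assumed key
            if item.get("name")
        }
        recipes[recipe["title"]] = sorted(names)
    return recipes
```

Lowercasing the names is a small design choice that keeps "Water" and "water" from becoming two separate nodes later on.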

Nodes and Edges:

The nodes of this network are ingredients. The size of a node depends on the number of recipes that ingredient appears in: the larger the node, the more recipes use it. The edges are recipes: two ingredients are connected if they appear in the same recipe.
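The construction described above can be sketched in plain Python. This is an illustration under my own assumptions, not necessarily how the notebook does it: the function name `build_ingredient_network` and the two toy recipes are invented for the example, and a graph library would work just as well.

```python
from collections import Counter
from itertools import combinations


def build_ingredient_network(recipes):
    """Return (node_sizes, edges) for a dict of recipe title -> ingredient names.

    node_sizes counts how many recipes each ingredient appears in.
    edges maps each ingredient pair to the recipes that contain both.
    """
    node_sizes = Counter()
    edges = {}
    for title, ingredients in recipes.items():
        unique = sorted(set(ingredients))  # count each ingredient once per recipe
        node_sizes.update(unique)
        # Every pair of ingredients in the same recipe gets an edge.
        for a, b in combinations(unique, 2):
            edges.setdefault((a, b), []).append(title)
    return node_sizes, edges


# Toy data, for illustration only.
recipes = {
    "Tomato Soup": ["tomatoes", "water", "garlic"],
    "Garlic Pasta": ["garlic", "olive oil", "water"],
}
node_sizes, edges = build_ingredient_network(recipes)
```

Sorting the ingredients before pairing them means each edge key is always stored in the same order, so ("garlic", "water") and ("water", "garlic") never show up as two different edges.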

Importance is determined by degree centrality. The nodes with the highest degree centrality have the most connections, which makes them the most important. The three most important nodes are water, olive oil, and salt and pepper.
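For readers new to the measure, degree centrality is usually defined as a node's number of neighbors divided by the number of other nodes in the network. Here is a small sketch of that calculation. The edge list is toy data I made up, not the real recipe network, and the function is my own illustration rather than the code from the notebook.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return degree / (n - 1) for every node that appears in an edge."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    # A node linked to every other node scores 1.0.
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Toy edge list, for illustration only.
edges = [
    ("water", "salt"),
    ("water", "olive oil"),
    ("water", "garlic"),
    ("salt", "olive oil"),
]
centrality = degree_centrality(edges)
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy network, water is connected to all three other ingredients, so it ranks first. Ingredients that never share a recipe with anything else do not appear in the edge list and are left out of the result.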

Although these are the three most important nodes, many other nodes (ingredients) appear in many of the recipes, such as tomatoes, garlic, bell peppers, and onion. These are the answers to the question of which ingredients are used most in recipes. Using this network analysis, store owners can understand how to stock their stores. Based on the chart, stocking larger quantities of the most used ingredients than of less common ones can save the store owner money and reduce waste. It also helps people who cook at home, because keeping the most common ingredients in their kitchen can help them diversify their meals and improve their diets.

Limitations:

The limitations lie with the data itself. I picked only 75 random recipes to keep the network legible and easy to visualize. In reality, there are millions of possible recipes across many different cuisines. Because the recipes were chosen at random, I did not make sure that every possible cuisine was included. To fairly represent every culture's recipe variations, all of them would need to be considered for the results to help the stakeholders.

Github:

https://github.com/dhruvitpatel5/Mod2Recipes/blob/main/mod2recipes.ipynb

Written by Dhruvit Patel