Network Science of American Home Foods
The Nodes and Edges of Recipes and Ingredients
Half a century after being born into this world, I am still a pathetic cook. I can barely tell the difference between pastry, pasta, and pâté. Luckily, as a data scientist and the author of Complex Network Analysis in Python, I don’t even need to know how to fry an egg to speculate about cooking. I can let complex networks do the job and gain insight into data science and American home cooking at the same time.
To get started, let’s create a complex network of ingredients frequently used together and explore its structure in the hope of serendipitous insights. Just like any other complex network, ours consists of nodes and edges. A node represents an ingredient, such as active yeast, all-purpose flour, butter, tuna, ostrich eggs, or escarole (whatever that is). Two nodes are connected by an edge if the ingredients are frequently mentioned together in recipes; that is, they are frequently used together. I would expect flour and butter to be connected, but not active yeast and tuna.
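To make the idea of nodes and edges concrete, here is a minimal sketch in plain Python. It is not the construction used later in this article. The five toy recipes are invented for illustration, and the function name build_network and the min_count threshold are my assumptions, since "frequently" has not been given a precise meaning yet. The sketch counts how many recipes mention each pair of ingredients and keeps only the pairs that meet the threshold.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy recipes, invented for illustration only.
recipes = [
    {"all-purpose flour", "butter", "active yeast"},
    {"all-purpose flour", "butter", "sugar"},
    {"all-purpose flour", "butter", "eggs"},
    {"tuna", "mayonnaise", "celery"},
    {"tuna", "mayonnaise", "eggs"},
]


def build_network(recipes, min_count=2):
    """Return {(ingredient_a, ingredient_b): count} for frequent pairs.

    min_count is an assumed cutoff for "frequently used together".
    """
    pair_counts = Counter()
    for ingredients in recipes:
        # Sort so that each unordered pair always has the same key.
        for a, b in combinations(sorted(ingredients), 2):
            pair_counts[(a, b)] += 1
    # Keep only the pairs that co-occur in at least min_count recipes.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


network = build_network(recipes)
# The nodes are the ingredients that take part in at least one edge.
nodes = {ingredient for pair in network for ingredient in pair}

for (a, b), count in sorted(network.items()):
    print(f"{a} <-> {b}: together in {count} recipes")
```

With this toy data, flour and butter end up connected, while active yeast and tuna do not, which matches the intuition above. On real recipe data the threshold would need tuning, and the same edge dictionary could be loaded into a graph library for further analysis.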
Finding Recipe Data Sources
Let’s start by harvesting a list of recipes from AllRecipes.com (SimplyRecipes.com is another good choice).