A Nerd’s Guide to Choosing Meal Recipes
Optimize meal recipe selection based on nutritional factors, with R
This short blog post is intended for readers who have a basic understanding of linear algebra and want to learn how to perform linear optimization tasks in R. While we use meal recipes as a demonstration, the same approach can be applied to other problems, such as media channel selection.
Our goal is to determine the optimal 5-meal plan that maximizes daily protein intake while staying healthy.
In other words, which 5 recipes give us the highest amount of protein while staying within the recommended daily intake of calories, fat, and sodium?
The dataset can be found here on Kaggle. It consists of 20,052 recipes with 680 variables. The variables can be broken down into general (title, rating), nutrition (calories, protein, fat, sodium), and ingredient (beef, bean, crab).
For this blog post we include only recipes with a rating greater than 4.375, as we found that 8,019 recipes (40%) have a rating of 4.375. We have also removed recipes with missing values.
In addition, we only use nutrition variables to perform the optimization task.
We will use the lpSolve package available in R to perform the task.
Here we need to set up an objective and a few constraints so that the package can solve the linear program.
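Before turning to the recipe data, here is a minimal sketch of how `lp()` takes an objective and constraints. The four recipes, their nutrition values, and the limits are made-up numbers chosen only for illustration; they are not taken from the Kaggle dataset.

```r
library(lpSolve)

# Hypothetical toy data: four recipes with made-up nutrition values
toy <- data.frame(
  title    = c("A", "B", "C", "D"),
  calories = c(500, 700, 300, 900),
  protein  = c(30, 50, 10, 60),
  fat      = c(10, 30, 5, 45),
  sodium   = c(400, 900, 200, 1200),
  stringsAsFactors = FALSE
)

# One row per constraint, one column per recipe
con <- rbind(
  t(as.matrix(toy[, c("calories", "fat", "sodium")])),
  meal = rep(1, nrow(toy))
)
dir <- c("<=", "<=", "<=", "=")   # three upper limits, exactly 2 meals
rhs <- c(1300, 45, 1500, 2)       # made-up limits for the toy problem

# all.bin = TRUE makes every decision variable 0 or 1 (skip or pick the recipe)
res <- lp("max", toy$protein, con, dir, rhs, all.bin = TRUE)

res$status                              # 0 means an optimal solution was found
toy$title[round(res$solution) == 1]     # "A" "B"
res$objval                              # 80 grams of protein
```

Recipe D has the most protein, but every pair that contains it breaks the calorie or fat limit, so the solver settles on A and B.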
Let us say our objective is to obtain as much protein as possible in 5 meals.
However, we would also like to stay as healthy as possible. We will use the suggested nutrition intakes below as our recipe constraints:
meal = 5
calories ≤ 2,500 kcal
fat ≤ 80 g
sodium ≤ 2,300 mg
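Written out, the problem above can be sketched as a binary integer program. The symbols are our own notation, not part of the dataset: $x_i$ is 1 if recipe $i$ is chosen and 0 otherwise, and $p_i$, $c_i$, $f_i$, $s_i$ are its protein, calories, fat, and sodium.

```latex
\begin{align*}
\max_{x} \quad & \sum_{i=1}^{n} p_i x_i \\
\text{subject to} \quad
  & \sum_{i=1}^{n} c_i x_i \le 2500, \\
  & \sum_{i=1}^{n} f_i x_i \le 80, \\
  & \sum_{i=1}^{n} s_i x_i \le 2300, \\
  & \sum_{i=1}^{n} x_i = 5, \\
  & x_i \in \{0, 1\}, \quad i = 1, \dots, n.
\end{align*}
```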
Out of the 2,106 recipes, we have identified the optimal meal plan as below, with a total of 391 grams of protein.
That wraps up this short blog post. While we have used only a small portion of the variables in the dataset, we could also set up more realistic constraints, such as including only the ingredients available in our fridge, or more personalized food preferences, such as no beef or chicken only.
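A food preference such as "no beef" can be sketched as one extra constraint row. The example below assumes the ingredient variables are 0/1 indicator columns; the toy recipes and limits are made up for illustration.

```r
library(lpSolve)

# Hypothetical toy data; beef = 1 marks a recipe that contains beef
toy <- data.frame(
  title    = c("A", "B", "C", "D"),
  calories = c(500, 700, 300, 900),
  protein  = c(30, 50, 10, 60),
  fat      = c(10, 30, 5, 45),
  sodium   = c(400, 900, 200, 1200),
  beef     = c(0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Nutrition rows, a meal-count row, and a row that counts beef recipes
con <- rbind(
  t(as.matrix(toy[, c("calories", "fat", "sodium")])),
  meal = rep(1, nrow(toy)),
  beef = toy$beef
)
dir <- c("<=", "<=", "<=", "=", "=")
rhs <- c(1300, 45, 1500, 2, 0)    # last entry: zero beef recipes allowed

no_beef <- lp("max", toy$protein, con, dir, rhs, all.bin = TRUE)

toy$title[round(no_beef$solution) == 1]   # "A" "C"
no_beef$objval                            # 40 grams of protein
```

Dropping the beef recipes from the data frame before solving would give the same plan; the constraint row is handy when the rule is softer, for example "at most one beef recipe" with a right-hand side of 1.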
In terms of broader applications, we can apply this approach to solve business problems. For example, we can obtain performance data on digital advertisements for each channel or medium (impressions, clicks, and conversions) to determine the optimal marketing mix for achieving a set objective.
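As a rough sketch of the marketing case, suppose we split a budget across three channels to maximize conversions. Everything here is an assumption for illustration: the channel names, the conversions per dollar, the total budget, and the per-channel caps. Unlike the recipe problem, spend is continuous, so `all.bin` is not used.

```r
library(lpSolve)

# Hypothetical conversions per dollar for three channels (made-up numbers)
conv_per_dollar <- c(search = 0.05, social = 0.03, display = 0.02)

# Row 1: total spend across channels; rows 2-4: spend on each channel alone
con <- rbind(total = c(1, 1, 1), diag(3))
dir <- rep("<=", 4)
rhs <- c(1000, 500, 500, 500)   # assumed total budget and per-channel caps

mix <- lp("max", conv_per_dollar, con, dir, rhs)

spend <- setNames(mix$solution, names(conv_per_dollar))
spend        # search 500, social 500, display 0
mix$objval   # 40 expected conversions
```

Real channel data rarely scales linearly with spend, so a linear model like this is best read as a first approximation.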
This is also an example of prescriptive analytics, which tells us exactly what to do, as opposed to descriptive analytics, which focuses on the statistical description of the data obtained, as introduced by Florian Teschner in his blog post on fantasy football data.
#Load the lpSolve package, which provides lp()
library(lpSolve)
#Read in the csv file
food <- read.csv("food.csv")
#Check data formats
#Include only recipes with a rating greater than 4.375
food <- food[food$rating>4.375,] #2719x680
#Create dataframe that only contains nutrition-related variables
nutrition <- food[c("title", "calories", "protein", "fat", "sodium")] #2719x5
#Explore new dataframe
#Remove NA values
nutrition <- nutrition[complete.cases(nutrition),] #2106x5
#Set up objective as maximizing protein
obj <- nutrition$protein
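#Set up nutrition-related variables as a transposed matrix (one row per constraint)
limits <- nutrition[,c("calories", "fat", "sodium")]
con <- t(limits)
meal <- rep(1, nrow(limits))
con <- rbind(con, meal)
#Create constraints on nutrition factors and the number of meals
#Note: the meal row uses "<=", so the plan has at most 5 meals; "=" would force exactly 5
dir <- c("<=", "<=", "<=", "<=")
rhs <- c(2500, 80, 2300, 5)
#Solve the binary integer linear program (each recipe is either chosen or not)
recommendation <- lp("max", obj, con, dir, rhs, all.bin=TRUE)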
#Return the recommended meal recipes
nutrition$eat <- recommendation$solution
#Print out the recommended meal recipes
sum(nutrition[nutrition$eat == 1,]$calories)
sum(nutrition[nutrition$eat == 1,]$protein)
sum(nutrition[nutrition$eat == 1,]$fat)
sum(nutrition[nutrition$eat == 1,]$sodium)
paste(nutrition[nutrition$eat == 1,]$title, collapse=", ")