Picking Your Next Meal Like a Data Scientist

Published in Analytics Vidhya · 5 min read · Nov 27, 2019

This post was co-authored by my fellow Data Scientist Peter Garcia. You can check out his Medium account here.

So Peter and I were sitting there, working on our respective capstone projects. It was a Tuesday night, and with just one week left to go in our data science program, we needed every minute we could get. That’s when hunger struck. As usual, with that hunger came the oh-so-lovely process of picking a spot to go for food. As always with these kinds of decisions, Peter and I were indecisive, so like any good data scientists, we “built a model” to remove our own personal bias.

… and by model, we mean using a simple function known as choice() from the numpy sub-package random to determine where to go.

import numpy as np

food = ['pizza',
        'chinese',
        'mcdonalds',
        'chicken place next to the arepa place',
        'dos toros',
        'chickfilla',
        'mexique',
        'smashburger',
        'shake shack',
        'shortys']

# We clearly want THE healthiest of choices
np.random.choice(food)

So after importing numpy in a Jupyter notebook, we created a list of food choices in the area. After that, numpy’s random.choice() randomly selected a place:

`mcdonalds`
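
A small aside on repeatability: every call to np.random.choice() can give a different answer, which is the whole point here, but it makes the result hard to reproduce. The sketch below assumes a shortened, hypothetical list and uses numpy's seeded Generator (np.random.default_rng) rather than the call used in this post, just to show that two generators with the same seed make the same pick.

```python
import numpy as np

# Hypothetical shortened list, for illustration only.
options = ['pizza', 'chinese', 'mcdonalds']

# Two generators built from the same seed produce the same sequence of picks.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

pick_a = rng_a.choice(options)
pick_b = rng_b.choice(options)

print(pick_a == pick_b)  # True: same seed, same pick
```

Whether you would want dinner to be reproducible is another question entirely.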

Of course, we weren’t satisfied with the result, as it was far too easy a conclusion to reach. So we thought about a “first-to” scenario: keep picking at random, and the first place to be picked a set number of times wins. And just like that, we had committed to complicating things, because why not. It’s not like we have a capstone to finish up or any other thing that’s time sensitive.

Since we’re creating a function, first things first. Let’s define our function and its arguments. The food_list is the list of places we want to go, and count is the number of times a place must be picked to win the “first-to” scenario, with the default being “first-to-three”.

def get_food(food_list, count = 3):

We will begin the function by using a dictionary comprehension to keep track of the count for each place we want to go eat. The keys of the dictionary are the locations, and each value is the number of times that location was chosen.

    food_dict= {item:0 for item in food_list}

Why a dictionary? Well, since we want to do a “first-to” scenario, we will eventually have those zeros converted into numbers representing the number of times a particular key was randomly selected. A quick example: if we had to pick the first to three between pizza and chinese, and pizza won 3 to 1, the dictionary would end up looking like:

{'pizza': 3,
 'chinese' : 1}

And since 'pizza' was the first option to be randomly selected three times, our function would return 'pizza' as the place to go.

The next step in our get_food function is a while loop.

    while max(list(food_dict.values())) < count:
        increment = np.random.choice(food_list)
        food_dict[increment] += 1

In the code above, while the max value in our dictionary (initially 0 when we created the dictionary) is less than the “first-to” target (count = 3), we will randomly pick a place from our food_list and increase its corresponding dictionary value by 1. The first food place to reach our desired count will be our winner.
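
To see the loop's stopping condition in action without any randomness, here is a minimal trace. It assumes a hypothetical, hard-coded sequence of picks standing in for np.random.choice(), using the pizza-versus-chinese example from earlier.

```python
# Hypothetical fixed picks, standing in for np.random.choice(food_list).
picks = iter(['pizza', 'chinese', 'pizza', 'pizza', 'chinese'])

tally = {'pizza': 0, 'chinese': 0}
count = 3
rounds = 0

# Keep tallying until some place reaches `count`.
while max(tally.values()) < count:
    tally[next(picks)] += 1
    rounds += 1

print(tally)   # {'pizza': 3, 'chinese': 1}
print(rounds)  # 4
```

The fifth pick is never consumed: the loop exits as soon as pizza reaches three, which is why exactly one place ends up at the target.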

Finally, we need to end the function with a return statement that will give us what we want: where to go to eat! A list comprehension picks out the place whose tally equals count, so the answer comes back as a one-element list.

    return [place for place in food_list if food_dict[place] == count]
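
Because the loop stops the moment one place hits the target, that list always holds a single name. If a bare string is preferred, one possible alternative (not what get_food does) is max() with a key function. The sketch below compares the two on a hypothetical finished tally.

```python
# Hypothetical finished tally from a "first-to-three" run.
tally = {'pizza': 3, 'chinese': 1}
count = 3

# The post's approach: a list of every place whose tally equals `count`.
as_list = [place for place in tally if tally[place] == count]

# Alternative: the single key with the largest tally.
as_name = max(tally, key=tally.get)

print(as_list)  # ['pizza']
print(as_name)  # pizza
```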

So putting it all together, the code looks a little something like this:

import numpy as np

food = ['pizza',
        'chinese',
        'mcd',
        'chicken place next to the arepa place',
        'dos toros',
        'chickfilla',
        'mexique',
        'smashburger',
        'shake shack',
        'phillys']

# Remember - healthy foods only!
def get_food(food_list, count = 3):

    # Start every place with a tally of zero
    food_dict= {item:0 for item in food_list}

    # Keep picking at random until one place reaches the target count
    while max(list(food_dict.values())) < count:
        increment = np.random.choice(food_list)
        food_dict[increment] += 1

    # Return the winner as a one-element list
    return [place for place in food_list if food_dict[place] == count]

And now the moment of truth. We will pass the food list into the function and have it do a “first-to-15” scenario.

get_food(food, count = 15)

Which tells Peter and me that we should get food from…

['shake shack']
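
One run is just one run, of course. As a rough sanity check, the sketch below re-declares get_food so it stands alone, then repeats the contest many times and tallies the winners with collections.Counter. The shortened list and the number of trials are assumptions chosen only to keep it quick.

```python
from collections import Counter

import numpy as np


def get_food(food_list, count=3):
    # Same logic as the function above: first place to reach `count` wins.
    food_dict = {item: 0 for item in food_list}
    while max(food_dict.values()) < count:
        food_dict[np.random.choice(food_list)] += 1
    return [place for place in food_list if food_dict[place] == count]


# Hypothetical shortened list and trial count, for illustration only.
options = ['pizza', 'chinese', 'shake shack']
trials = 300

# Each call returns a one-element list, so [0] pulls out the winner's name.
wins = Counter(get_food(options)[0] for _ in range(trials))
print(wins)
```

Since every place is equally likely on each pick, the win counts should come out roughly even, with the exact numbers changing from run to run.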

For a future version of the function, we were thinking about having get_food start by asking the user for the options, which we can then turn into a dictionary. From a coding perspective, this allows for scalability and a hands-off approach: we would not have to keep going back to the code to amend the list, which would be a headache, given that our location can change, and the food options with it.
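
A minimal sketch of that idea, assuming the options arrive as a single comma-separated string. The helper name parse_options and the clean-up rules (trimming, lowercasing, dropping blanks and repeats) are hypothetical choices, not something we actually built.

```python
def parse_options(raw):
    """Split a comma-separated string into a clean list of unique options."""
    options = []
    for item in raw.split(','):
        name = item.strip().lower()
        # Skip empty entries and anything already on the list.
        if name and name not in options:
            options.append(name)
    return options


# Hypothetical user input with stray spaces, a blank, and a repeat.
cleaned = parse_options(' Pizza, tacos ,pizza,, ramen ')
print(cleaned)  # ['pizza', 'tacos', 'ramen']
```

In a notebook, the string could come from input(), and the cleaned list could then be passed straight to get_food.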

Also, we did not actually remove any bias from our decision, as the food list started as a narrow set of options that came from our own heads. One way to make the list more objective would be to use location and time to populate a list of currently open restaurants, but that’s a tad involved, given that it’s 9:00 p.m. on a Tuesday night.

So what initially started out as two indecisive people coming together to make a decision after coding an np.random.choice() solution turned into a 30-minute troubleshooting session as we were building the get_food function out. Maybe we were bored, maybe we were too hungry to decide, or maybe…

Regardless of the why, all we know is that we want to enjoy our delicious shackburgers now…oh, and work on our capstones too!

Written by Fausto De La Rosa Mañón