Databases on NYC Restaurants

May 17, 2019

Restaurants are practically everywhere in New York City. For many people, they are one of the only sources of food available, whether because they can't cook or because they don't have time to. Restaurants are also seen as a luxury or a place to mingle with friends. With such a huge impact on people's lives, these places should be top notch. For some of them, that might not be the case. New York has a way of grading restaurants, not by how good the chef’s food is, but by how clean the restaurant is. These grades play a huge part in how much business a restaurant does and how well its food is received. Alongside the grades, there are databases that keep track of which restaurants have high grades and where they are located. This particular database not only keeps track of the grades, but also lets us see the number of restaurants in each borough and how well each borough does based on the grades of the restaurants inside it.

We start by importing pandas, the standard Python library for working with tabular data. We also need to make sure that the data file itself is in the same folder as our code. The database we are using is the New York City Restaurant Inspections database. It records each restaurant's location, its name, the type of food it specializes in, and more. Calling df.info() prints a plain-text summary of the table: every column name, how many non-empty values the column has, and its data type.
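As a rough sketch of this first step, the snippet below loads a table with pandas and calls df.info(). It does not use the real inspections file: the three rows and the column names (DBA, BORO, CUISINE DESCRIPTION, GRADE) are made-up stand-ins, and the real file's columns may differ. With the real data, you would pass the file's path to pd.read_csv instead of the in-memory text.

```python
import io

import pandas as pd

# Hypothetical stand-in for the inspections file. The rows and the
# column names are assumptions made for illustration only.
sample_csv = io.StringIO(
    "DBA,BORO,CUISINE DESCRIPTION,GRADE\n"
    "Sample Pizza,BROOKLYN,Pizza,A\n"
    "Sample Diner,QUEENS,American,B\n"
    "Sample Cafe,MANHATTAN,Coffee/Tea,A\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(sample_csv)

# info() prints each column's name, non-null count, and data type.
df.info()
```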

It's messy, huge, and not very inviting. While it's not as disorganized as most databases, it's still just a large chunk of text. Most people wouldn't give it a second glance if it were handed to them. Now it's time to organize it.

Note: Not everything in the database is displayed in this image. There is more to it.

df.head() is a command that displays only the first few rows of the database (five by default). Here you can see the code read the file, organize it, and print out a table with a column for each category that was in the Excel file. You can do the same with df.tail(), which shows the last rows of the table.
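Here is a small sketch of head and tail in action. The eight-row table is invented for the example, and its column names are assumptions rather than the real file's columns.

```python
import pandas as pd

# Made-up sample table; names and values are for illustration only.
df = pd.DataFrame({
    "DBA": [f"Restaurant {i}" for i in range(1, 9)],
    "BORO": ["BROOKLYN", "QUEENS", "MANHATTAN", "BRONX",
             "BROOKLYN", "QUEENS", "MANHATTAN", "STATEN ISLAND"],
    "GRADE": ["A", "B", "A", "A", "C", "A", "B", "A"],
})

# head() returns the first 5 rows unless you pass another number.
first_rows = df.head()

# tail(3) returns the last 3 rows.
last_rows = df.tail(3)

print(first_rows)
print(last_rows)
```

Both commands return a new, smaller table and leave the original df unchanged.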

Now let's say you want a specific group of data. For example, you only want to see how many restaurants are recorded in each borough. There's an easy way to do this: df.groupby()

Groupby tells the code to take a certain column and split the rows into groups based on the values in that column, so you can print out a result, such as a count, for each group. It's an easy way to get specifically what you want out of a large database like this one. You can do this with any column that is available in the database.
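The borough example might look something like the sketch below. The table and its column names (DBA for the restaurant name, BORO, GRADE) are assumptions made up for illustration. It counts distinct names with nunique() so that a restaurant appearing on more than one row would still be counted once.

```python
import pandas as pd

# Made-up sample table; names and values are for illustration only.
df = pd.DataFrame({
    "DBA": ["Sample Pizza", "Sample Diner", "Sample Cafe",
            "Sample Grill", "Sample Bakery"],
    "BORO": ["BROOKLYN", "QUEENS", "BROOKLYN", "MANHATTAN", "QUEENS"],
    "GRADE": ["A", "B", "A", "A", "C"],
})

# Group the rows by borough, then count distinct restaurant names.
per_borough = df.groupby("BORO")["DBA"].nunique()
print(per_borough)

# Grouping by two columns shows how many rows each borough
# has for each grade.
grades_by_borough = df.groupby(["BORO", "GRADE"]).size()
print(grades_by_borough)
```

Swapping "BORO" for another column name, such as the cuisine column, would give the same kind of breakdown for that column.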

Now you are able to organize a database just with code, without opening Excel. It's a new way to navigate data that doesn't take much learning. It’s also much more accessible, considering that Excel can be confusing and stressful sometimes.


Written by Yasmin Salih