The First Week of General Assembly DSI and Some Basic Python

If only it was that easy… :)

Hi everyone! Welcome back.

After the pre-work I did to prepare for my data science immersive class, the program has finally started. The first week was mostly about the basics: solidifying the material from the pre-work, which I discussed a little in my previous blog post (clickable link). We went through working with the terminal (on a Mac; it’s a little different on Windows), working with git and GitHub, and Python basics, the very foundation of the class. Almost everything we do is done in Python, so I thought I’d focus on some basic Python in this post.

According to Wikipedia, Python is a “widely used high-level programming language for general-purpose programming, created by Guido van Rossum and first released in 1991[…] Python has a design philosophy which emphasizes code readability[…] and a syntax which allows programmers to express concepts in fewer lines compared to other languages.” The main idea here is that Python syntax is relatively simple and more or less makes sense when you read it, even if you’re not a computer :)

I say more or less, because this was me most of the time! (still is…)

If you made it here after reading the first post, I assume that you know a little bit about Python and how to work with it. What I’d like to do is go over two of the homework assignments I got in the first week and how I solved them. We got a list of dictionaries and had to write functions to do certain things with it. In the second version of the assignment, we had to solve the same problems, but this time using list comprehensions.
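In case list comprehensions are new to you, here is a minimal sketch of the idea, separate from the homework itself. The names numbers, even_squares_loop and even_squares_comp are made up for illustration only.

```python
# Hypothetical example data, not part of the assignment.
numbers = [1, 2, 3, 4, 5]

# The loop way: start with an empty list and append to it.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

# The list comprehension way: the same logic in a single expression.
even_squares_comp = [n * n for n in numbers if n % 2 == 0]

print(even_squares_loop == even_squares_comp)  # True
```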

This is the list of dictionaries:

# List of movie dictionaries

movies = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "The Choice", "imdb": 6.2, "category": "Romance"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "Love", "imdb": 6.0, "category": "Romance"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
    {"name": "Ringing Crime", "imdb": 4.0, "category": "Crime"},
    {"name": "Joking muck", "imdb": 7.2, "category": "Comedy"},
    {"name": "What is the name", "imdb": 9.2, "category": "Suspense"},
    {"name": "Detective", "imdb": 7.0, "category": "Suspense"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
    {"name": "We Two", "imdb": 7.2, "category": "Romance"},
]

The first question was to write a function that takes a single movie and returns True if its IMDB score is above 5.5. The first function I wrote:

def single_score(movie):
    # Stays None if no movie with that name is found (or its score is exactly 5.5).
    imdb_score = None
    for m in movies:
        if m["name"] == movie and m["imdb"] > 5.5:
            imdb_score = True
        elif m["name"] == movie and m["imdb"] < 5.5:
            imdb_score = "Sorry, score less than 5.5"
    return imdb_score

The function single_score takes one argument (movie, the name of a movie) and iterates through the items (dictionaries) in the list “movies”, calling each one m. If the name in the dictionary equals the argument and the IMDB score in the dictionary is above 5.5, the variable “imdb_score” is set to True. If the score is lower than 5.5, “imdb_score” is set to the string “Sorry, score less than 5.5”. At the end, the function returns imdb_score.

The list comprehension way:

def higher_score(movie_name):
    # Returns a list with one True per matching movie that scores above 5.5, otherwise an empty list.
    return [True for movie in movies if movie["name"] == movie_name and movie["imdb"] > 5.5]
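Note that the list comprehension version returns a list (empty when nothing matches) rather than a plain True or False. If a real boolean is wanted, one possible sketch is to wrap a list comprehension in Python's built-in any(). The function name is_above, the threshold parameter and the two-movie sample list are made up for illustration and are not part of the assignment.

```python
# Small sample in the same shape as the movies list (illustration only).
sample_movies = [
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
]

def is_above(movie_name, movies_list, threshold=5.5):
    # Build a list of True/False values for the movies with a matching name,
    # then let any() collapse that list into a single boolean.
    matches = [m["imdb"] > threshold for m in movies_list if m["name"] == movie_name]
    return any(matches)

print(is_above("Hitman", sample_movies))      # True
print(is_above("Bride Wars", sample_movies))  # False
print(is_above("Unknown", sample_movies))     # False (no such movie)
```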

The second question was to write a function that returns a sublist of movies with an IMDB score above 5.5. The first function I wrote:

def movies_above(movies):
    movies_above = []
    for m in movies:
        if m["imdb"] > 5.5:
            movies_above.append(m["name"])
    return movies_above

movies_above(movies)

The function movies_above takes one argument, movies (which is actually a list), and first of all creates an empty list. After that, it iterates through every item (each dictionary) in movies and checks whether the IMDB score is above 5.5. If it is, the function adds (appends) the name of that movie to the list “movies_above”. Then the function returns movies_above. The list I got:

['Usual Suspects', 'Hitman', 'Dark Knight', 'The Help', 'The Choice', 'Colonia', 'Love', 'Joking muck', 'What is the name', 'Detective', 'We Two']

And in list comprehension:

def MoviesGreaterList(movies):
    movies_greater = [movie["name"] for movie in movies if movie["imdb"] > 5.5]
    return movies_greater

The function takes one argument, movies (a list). The next line builds a list containing the name of every movie whose score is greater than 5.5. In the end, the function returns that list. The list I got:

['Usual Suspects', 'Hitman', 'Dark Knight', 'The Help', 'The Choice', 'Colonia', 'Love', 'Joking muck', 'What is the name', 'Detective', 'We Two']

As you can see, it’s the same list!
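Both versions return just the movie names. If the full dictionaries are wanted instead (the question does ask for a sublist of movies), one possible variation keeps the whole dictionary in the comprehension. This is only a sketch: the name movies_above_full, the threshold parameter and the three-movie sample are hypothetical.

```python
# Small sample in the same shape as the movies list (illustration only).
sample_movies = [
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
]

def movies_above_full(movies_list, threshold=5.5):
    # Keep the whole dictionary instead of only movie["name"].
    return [movie for movie in movies_list if movie["imdb"] > threshold]

for movie in movies_above_full(sample_movies):
    print(movie["name"])  # prints Dark Knight, then Colonia
```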

The third question was to write a function that takes a list of movies and computes the average IMDB score.

def movies_average_score(movies_list):
    movies_scores = []
    for movie in movies_list:
        score = movie["imdb"]
        movies_scores.append(score)
    average_score = sum(movies_scores) / len(movies_scores)
    return average_score

total_movies_average = movies_average_score(movies)
print(total_movies_average)
6.48666666667

The function movies_average_score takes a single argument, a list of movies, and creates an empty list, movies_scores. Then it iterates through every movie (dictionary) in the list, assigns the IMDB score to the variable score and appends score to the list movies_scores. After the loop, it calculates the average score by summing the scores in the list and dividing by the length of the list. The function returns the variable average_score.

In list comprehension:

def AvgScore(movies):
    scores = [movie["imdb"] for movie in movies]
    return sum(scores) / len(scores)

AvgScore(movies)
6.48666666667

The function takes one argument, a list of movies. The list comprehension collects the IMDB score of each movie into the list scores. Then the function returns the average by dividing the sum of the scores list by its length. As you can see, it’s the same average, of course.
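One case neither version handles is an empty list: the length would be 0 and the division would raise a ZeroDivisionError. Here is a minimal sketch of one way to guard against that, assuming it is acceptable to return None when there is nothing to average. The name safe_average and that choice are assumptions, not part of the assignment.

```python
def safe_average(movies_list):
    scores = [movie["imdb"] for movie in movies_list]
    # An empty list has no average; return None instead of dividing by zero.
    if not scores:
        return None
    return sum(scores) / len(scores)

# Small sample in the same shape as the movies list (illustration only).
sample_movies = [
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "Love", "imdb": 6.0, "category": "Romance"},
]

print(safe_average(sample_movies))  # 7.0
print(safe_average([]))             # None
```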

The fourth and last question was to write a function that takes a category name and returns just those movies under that category. The first function I wrote:

def movies_category(category):
    movies_category = []
    for m in movies:
        if m["category"] == category:
            movies_category.append(m["name"])
    return movies_category

romance_movies = movies_category("Romance")
print(romance_movies)
['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']

The function movies_category takes one argument — category. It creates an empty list (movies_category) and then iterates through every item (m) in movies. If the movie belongs to the category that we put in the function (category), that movie name is added (or appended) to the list movies_category. The function then returns that list.

The same function with list comprehension:

def CategoryList(category):
    category_list = [movie["name"] for movie in movies if movie["category"] == category]
    return category_list

CategoryList("Romance")
['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']

By using a list comprehension, we create a list of movie names by going through the movies list and checking whether each movie is in the input category (the argument). The function then returns the list of movies from that category.
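Both category functions read the global movies list directly. As a possible alternative, here is a sketch that takes the list as an argument too, so the same function works on any list of movie dictionaries. The name movies_in_category and the sample list are hypothetical.

```python
def movies_in_category(movies_list, category):
    # The list to search is passed in rather than read from a global variable.
    return [m["name"] for m in movies_list if m["category"] == category]

# Small sample in the same shape as the movies list (illustration only).
sample_movies = [
    {"name": "The Choice", "imdb": 6.2, "category": "Romance"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
    {"name": "We Two", "imdb": 7.2, "category": "Romance"},
]

print(movies_in_category(sample_movies, "Romance"))  # ['The Choice', 'We Two']
print(movies_in_category(sample_movies, "Comedy"))   # []
```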

That’s it for today! I hope you enjoyed this post and learned something. I’d love to hear your comments, notes, questions and suggestions for topics. See you next time!