Reflection on HOL #1
The data project seemed rather simple and brief from the summary we were given, and at the outset that assumption held. Knowing the time restrictions placed on the project, we decided early on to collect data only for the top ten movies of each year rather than the total gross for each year. Collecting the movie data took about thirty minutes, because boxofficemojo.com, the site provided by Professor Hemphill, gave my group all the data we needed to find trends in movies from 2004–2013. This showed me the importance of using online databases rather than spending days trying to collect the data ourselves.
The project became difficult when I began writing the Python scripts to describe the collected movie data. The assigned Codecademy tutorials helped with writing programs to compute the mean (average), standard deviation, and variance, but the script describing the trend in genre popularity had no example to follow and took a few days to code, including entering the data values. We decided to write a function that found the percentage of the total gross of each year's top ten movies contributed by each genre, and this proved strenuous. Writing one reusable function that didn't have to be rewritten again and again was beyond my limited skill in Python programming. Consequently, I had to type the same function out for every year and enter the data each time. I know for sure that I will not be using Python for the upcoming project.
If given the chance to redo the project, I would be more attentive to the credibility of boxofficemojo.com, making sure that the data was valid and legitimate before using it in the data project.
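In hindsight, the function that was retyped for every year could probably have been written once and called in a loop. The following is only a minimal sketch of that idea, not the script we actually wrote: the name `genre_shares`, the layout of `top_ten_by_year`, and every number in it are placeholder assumptions rather than real box-office figures.

```python
from collections import defaultdict


def genre_shares(movies):
    """Return each genre's percentage of one year's total gross.

    `movies` is a list of (genre, gross) pairs, one pair per film.
    """
    total = sum(gross for _, gross in movies)
    if total == 0:
        return {}  # avoid dividing by zero when there is no data

    # Add up the gross of all films that share a genre.
    by_genre = defaultdict(float)
    for genre, gross in movies:
        by_genre[genre] += gross

    return {genre: 100.0 * gross / total for genre, gross in by_genre.items()}


# Placeholder values for illustration only, NOT real box-office data.
top_ten_by_year = {
    2004: [("Action", 300.0), ("Comedy", 100.0), ("Action", 100.0)],
    2005: [("Drama", 50.0), ("Comedy", 150.0)],
}

# One function, called once per year, instead of one copy per year.
shares_by_year = {
    year: genre_shares(movies) for year, movies in top_ten_by_year.items()
}

for year in sorted(shares_by_year):
    print(year, shares_by_year[year])
```

With the data kept in one dictionary keyed by year, adding another year means adding one more entry rather than another copy of the function.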
Overall, the experience prepared us for our upcoming projects, giving us skills in Python should we choose to use the language for our individual projects, and helping us see the importance of online databases.