HOL #1 REFLECTION
The team I am part of is made up of four members: myself, Mike DeAnda, Xi Rao, and Bing Xue. In my opinion, we work very well together. Each of us brings skills that help move our project forward. If my math is correct, we have been working on our project for approximately five weeks. Our initial strategy was posted on Medium, and each week we continue to build upon that report.
For the past two weeks we have worked on parsing our data by writing Python code to separate the tweets tagged #1reasonwhy from those carrying any of the other hashtags in our raw data. This is a necessary first step because the only tweets we are interested in analyzing are the ones tagged #1reasonwhy. Xi has been the project lead during this portion of the project. Her effort and her Python script provided the code to open the files and to filter and curate the data. I found this code inspiring because what we had laid out in our project plan was starting to come to fruition and our team was using Python.
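A minimal sketch of what such a filter could look like is shown here. It is not Xi's actual script. It assumes the raw file holds one JSON tweet per line and that hashtags sit under a "twitter_entities" key, and the function names has_target_hashtag and filter_tweets are hypothetical.

```python
import json

TARGET = "1reasonwhy"  # hashtag text without the leading '#'


def has_target_hashtag(tweet, target=TARGET):
    """Return True if the tweet carries the target hashtag (case-insensitive)."""
    # Assumption: hashtags are listed under twitter_entities -> hashtags -> text.
    hashtags = tweet.get("twitter_entities", {}).get("hashtags", [])
    return any(tag.get("text", "").lower() == target for tag in hashtags)


def filter_tweets(lines, target=TARGET):
    """Parse one JSON tweet per line and keep only those with the target hashtag."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if isinstance(tweet, dict) and has_target_hashtag(tweet, target):
            kept.append(tweet)
    return kept
```

Passing an open file object to filter_tweets works because iterating over a file yields one line at a time.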
The next step was for our team to work together on the remaining code and to make a few modifications to the parsing code. We used Collabedit.com to work on the remaining script together in real time. Xi continued to serve as the lead on our project, and Mike provided additional insight when we really got stuck on writing our code. Bing asked tough questions that made our team consider our code and the steps we were taking very closely. Her questions helped us confirm our decisions and gave us confidence that we had thought seriously about the code we were writing, given our current knowledge of Python.
Now that we had the initial code to filter the data, we needed to rename some of the Twitter API terms to terms that we could understand. For example, we renamed “preferredUsername” to “handle”, and the “id” found in the “actor” portion of the tweet to “authorId.” We did this to make the curated data simple for us to understand, because there was more than one “id” tag within a tweet in the raw data. It would get extremely confusing if we did not identify which “id” tag we were using, which is why we renamed the one we wanted extracted for our analysis to “authorId.” This part of the code is located in the fetchData function, along with the additional tags that we renamed, an empty list that we named usermentionsStr, and an empty dictionary that we named mentionsObject.
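The renaming step can be illustrated with a short sketch. This is not the team's fetchData function. The helper name rename_fields and the "tweetId" key are hypothetical, and the sample identifier values are made up for illustration. Only the "handle" and "authorId" renames come from the description above.

```python
def rename_fields(tweet):
    """Copy selected raw fields into a flat record with clearer names."""
    actor = tweet.get("actor", {})
    return {
        # Hypothetical name for the top-level "id", to keep it apart from the author's.
        "tweetId": tweet.get("id"),
        # The "id" inside "actor" becomes "authorId".
        "authorId": actor.get("id"),
        # "preferredUsername" becomes "handle".
        "handle": actor.get("preferredUsername"),
    }


# Made-up sample showing why the two "id" values need distinct names.
raw = {
    "id": "tweet-123",
    "actor": {"id": "author-42", "preferredUsername": "example_user"},
}
record = rename_fields(raw)
```

Giving each "id" its own key in the curated record means nobody has to remember which level of the raw tweet a value came from.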
The next steps that we took were to write the code to generate the hashtag string, to write the code to generate the usermentions string, and to create the schema to store our data. We all contributed to writing various portions of our script, and there is a lot more code to write before this project is completed. Tonight we laid out a project plan identifying the steps needed to complete the final project by the end of the semester, and those steps mean a lot more Python in the near future for my team. I am looking forward to it because I would like to become better at identifying what specific code needs to be written to get the outcomes I need to conduct research. This will only become natural to me if I continue to practice Python and discuss it with my team so that my understanding becomes stronger.
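One way these three steps could fit together is sketched below. The storage format is not stated above, so the SQLite table is purely an assumption. The column names, the comma separator, the "twitter_entities" layout, and the functions hashtag_string, usermentions_string, and store are all hypothetical.

```python
import sqlite3


def hashtag_string(tweet):
    """Join all hashtag texts of a tweet into one comma-separated string."""
    tags = tweet.get("twitter_entities", {}).get("hashtags", [])
    return ",".join(tag["text"].lower() for tag in tags if tag.get("text"))


def usermentions_string(tweet):
    """Join all mentioned screen names into one comma-separated string."""
    mentions = tweet.get("twitter_entities", {}).get("user_mentions", [])
    return ",".join(m["screen_name"] for m in mentions if m.get("screen_name"))


# Assumed schema: one row per tweet, with the renamed fields as columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    authorId     TEXT,
    handle       TEXT,
    hashtags     TEXT,
    usermentions TEXT
)
"""


def store(conn, tweet):
    """Create the table if needed and insert one curated row."""
    actor = tweet.get("actor", {})
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO tweets VALUES (?, ?, ?, ?)",
        (
            actor.get("id"),
            actor.get("preferredUsername"),
            hashtag_string(tweet),
            usermentions_string(tweet),
        ),
    )
```

Flattening the hashtags and mentions into strings keeps each tweet on a single row, at the cost of having to split the strings again for any per-hashtag or per-mention analysis.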
If I were to do a similar project, I would spend more time practicing Python. I would like to reach the point where writing code feels natural to me. Currently, I really have to think about whether something should be a string, whether I should add a comment, when to use a list versus a dictionary, and how to make sure that I include everything I need within the schema. I am glad that I had my team to work with on this first Python project because it would have taken me a lot longer on my own. When I didn’t understand a step that we were doing, I was able to rely on my team to talk me through it until I did understand it. They are serious about the project and about accomplishing the goals that we have set, but they are also very understanding and fun to work with.
The most challenging part of working with my team is really a problem I have with myself. I usually do not like to ask for help. It is usually me who provides assistance to others. One of the challenging parts of the project as a whole is that I am very much out of my comfort zone. I don’t mind being out of my comfort zone, but this class has pushed me so far outside it that at times I was nervous that I might not be able to accomplish an assignment. Somehow I have accomplished each assignment, but looking back I’m not sure how I did that. I feel good about myself when an assignment has been completed, but I know that not too far in the future a new difficult challenge will be thrown my way. What I am happy about is that each difficult assignment builds to the next challenging assignment. Libby really does have a strategic method that is leading me to a new level of learning within the digital humanities and data world. My curiosity about data and big data has been piqued. I find myself reading and viewing a lot more articles about data. Last night I watched a TED video of Tim Berners-Lee, the creator of the World Wide Web, about what he wants to see happen in the future of the web with making data available to everyone, and his point that “data is relationships!” The various readings that we read for class and the dataset assignment proved to me that “data is relationships.” There is so much analysis that can be conducted on datasets, and it all depends on how the researcher wishes to analyze and extract information from them. Our world would be a better place if data were more transparent and made available for researchers, hobbyists, and others to use to help solve some of the world’s biggest challenges. There is someone out there who could cure a specific disease if they had access to the data to help them develop the solution. I never realized how valuable data was until this class made me break it down so that I could understand it better.
The dataset assignment helped me to realize that it is important to provide a README file so that anyone using my data can get a quick snapshot of what my data is, how it was obtained, who has authorship of it, the terms and conditions of use, the file format, and so on. This is extremely important to my peers because it will save them time in deciding whether they wish to use my data. I also found a lot of value in Codecademy’s Python tutorials and continue to use them to brush up on my Python coding.
I genuinely enjoyed this assignment, but at times it was extremely challenging. I am glad that I have endured these last two weeks, but I would not have been able to get this far without the support of my team. This project would have taken me much longer to complete if it were not for my team, our discussions, and our working together. I know that the next few weeks will be just as challenging, but I will take each week one at a time, and before I know it the end of the semester will be here and our project will have been completed and submitted. I will value this class and semester as the one that introduced me to the world of coding, big data, open source, and digital humanities.