Our analysis of Hackathon Projects: Part 2
In Part 1, we discussed the basic project trends in the hackathon data, how we obtained the raw data, and how we preprocessed it. In this part, we present our findings on the locations of participants across the US.
Data: We had data for around 45,000 users. Of those, about 27,000 shared their location, and around 17,500 of them were from the US. We also looked at where the participants of individual hackathons (PennApps, MHacks, CalHacks, HackGT, etc.) came from. We plotted user locations and the number of participants by state. The CSV for the participants and winners across different states is here. The interactive map for the same can also be seen here.
Observations: Here are some of our observations. Although location wasn’t available for every user, the data showed some interesting patterns.
1. California had the most hackathon winners and participants. On the west coast, however, they were concentrated around the San Francisco, Los Angeles, and Seattle areas. On the east coast, the numbers were lower but much more widely spread. This may be because of the larger concentration of colleges across the east coast.
2. Another observation was lower participation from neighboring states in some major hackathons, even when they offered some form of travel reimbursement. At CalHacks, for example, there was more participation from the east coast than from Arizona, which has big colleges like the University of Arizona and Arizona State University.
We observed the same pattern at HackGT (Georgia Tech), where we found fewer participants from neighboring states.
Participation by state: We looked at trends such as participation by state, winners by state, and winning ratio by state (for states with at least 100 participants).
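To make the winning-ratio calculation concrete, here is a minimal sketch of how it could be computed. It assumes each record is a (state, is_winner) pair and that the ratio is defined as winners divided by participants; the function name, the record layout, and the sample rows are all hypothetical and are not taken from our actual CSV.

```python
from collections import Counter

def winning_ratio_by_state(participants, min_participants=100):
    """Return {state: winners / participants} for states that meet the cutoff.

    `participants` is an iterable of (state, is_winner) pairs. This layout
    is an assumption made for illustration only.
    """
    totals = Counter()
    winners = Counter()
    for state, is_winner in participants:
        totals[state] += 1
        if is_winner:
            winners[state] += 1
    # Skip states with too few participants so small samples don't skew the ratio.
    return {
        state: winners[state] / count
        for state, count in totals.items()
        if count >= min_participants
    }

# Made-up rows for illustration only, not the real dataset.
sample = (
    [("CA", True)] * 30 + [("CA", False)] * 120
    + [("AZ", True)] * 5 + [("AZ", False)] * 20
)
ratios = winning_ratio_by_state(sample)
print(ratios)
```

With these made-up rows, only CA clears the 100-participant cutoff, so AZ is left out of the result. The cutoff keeps states with only a handful of participants from showing misleadingly high or low ratios.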
The complete list of map links can be seen here. In the next part, we’ll discuss the impact of team members and their experience on winning a hackathon. We’ll also discuss discriminating tags and how we used a classifier on the sampled project data.