Overscripted Web: a Mozilla Data Analysis Challenge

Help us explore the unseen JavaScript and what this means for the Web

Photo by Markus Spiske on Unsplash

What happens while you are browsing the Web? Mozilla wants to invite data and computer scientists, students and interested communities to join the “Overscripted Web: a Data Analysis Challenge”, and help explore JavaScript running in browsers and what this means for users. We gathered a rich dataset and we are looking for exciting new observations, patterns and research findings that help to better understand the Web. We want to bring the winners to speak at MozFest, our annual festival for the open Internet held in London.

The Dataset

Two cohorts of Canadian undergraduate interns worked on data collection and subsequent analysis. The Mozilla Systems Research Group is now open-sourcing a dataset of publicly available information that was collected by a Web crawl in November 2017. This dataset is currently being used to help inform product teams at Mozilla. The primary analysis from the students focused on:

  • Session replay analysis: when websites replay your behavior on the site
  • Eval and dynamically created function calls
  • Cryptojacking: websites that use users’ computers to mine cryptocurrencies, which are mainly video streaming sites

Take a look at Mozilla’s Hacks blog for a longer description of the analysis.

The Data Analysis Challenge

We see great potential in this dataset and believe that our analysis has only scratched the surface of the insights it can offer. We want to empower the community to use this data to better understand what is happening on the Web today, which is why Mozilla’s Systems Research Group and Open Innovation team partnered to launch this challenge.

We have looked at how other organizations enable and speed up scientific discoveries by collaboratively analyzing large datasets. We’d love to follow this exploratory path: we want to encourage the crowd to think outside the proverbial box, get creative, and get under the surface. We hope participants get excited to dig into the JavaScript execution data and come up with new observations, patterns, and research findings.

To guide thinking, we’re dividing the Challenge into three categories:

  1. Tracking and Privacy
  2. Web Technologies and the Shape of the Modern Web
  3. Equality, Neutrality, and Law

You will find all of the necessary information to join on the Challenge website. Submissions will close on August 31st and the winners will be announced on September 14th. We will bring the three winners (one per category) to MozFest, the world’s leading festival for the open internet movement, taking place in London from October 26th to 28th, 2018. We will cover their airfare, hotel, admission/registration, and, if necessary, visa fees in accordance with the official rules. We may also invite the winners to give 15-minute presentations of their findings.

We are looking forward to the diverse and innovative approaches from the data science community, and we want to specifically encourage young data scientists and students to take a stab at this dataset. It could be the basis for your final university project, and analyzing it can grow your data science skills and build your resumé (and GitHub profile!). The Web gets more complex by the minute; keeping it safe and open can only happen if we work together. Join us!