Science, Data, and Creativity — The Trio of Modern Problem Solving. Planet OS Datathon

This past Earth Day Weekend, a group of more than one hundred students and young professionals joined forces to apply climate and environmental data towards addressing increasingly pressing climate concerns.

This two-day Datathon was hosted at UC Berkeley’s Haas School of Business and brought together people from across the Bay Area to explore global environmental problems and solutions through data-driven visual storytelling.

Organized jointly by Intertrust and The AMENA Center for Entrepreneurship and Development of UC Berkeley, the Datathon focused on using Earth data to create data-driven stories that address climate vulnerabilities and propose innovations in the AMENA region (Asia, Middle East, and North Africa).

Introductory Remarks by Dr. Samir Raouf (Iraq’s Former Deputy Minister of Science and Technology)

From the Datahub to the Datathon

The information we gather about the planet is growing exponentially, from smart homes gauging local temperatures to super-resolution satellites quantifying the physical environment. With so much information flowing through different sources and industries, how much of the collected data becomes actionable, especially once you consider scale and cross-domain synergy between data sources? My guess is less than 1%.

One of the main issues we face when trying to apply environmental data is that it often becomes “siloed” within niche communities and data sources.

Planet OS has been at the forefront of Earth data aggregation and distribution, and, as our team members agree, we have barely scratched the surface of the opportunities that data can provide. Our goal is to make data actionable by making it accessible.

By bringing together young creative professionals, experienced mentors, and domain experts, we aimed to generate a synergy that could multiply efforts and lead to new, original solutions and insights for the AMENA region.

Considering the limited time and the technical challenges of working with the data, our expectations were quite humble. But in the end, the judges were amazed by the quality of the delivered content and by how thoroughly the teams followed the goals and guidelines of the Datathon.

Every team carried out original data analysis and fused it with existing research and analytics. The presentations were very well executed in terms of both accessibility and aesthetics.

While teaching and mentoring at the Datathon, I came to see that geospatial data takes tremendous effort to analyze and compile, even with powerful Python-based toolkits like NumPy, xarray, and pandas. I think everybody found “blank spots” in their knowledge and overcame them through collaboration, which made this event a productive learning experience.

First place winner:

Team #1: Decreasing Household Water Scarcity in Jordan with Solar Water Pumps and Storage Tanks

Team #1 Presentation

Fusing six variables from various datasets to find the most suitable places to build water infrastructure is an impressive feat.

Team #1 found geospatial data on aqueducts, water consumption, and population density, which they combined and analyzed with satellite observations from the GRACE dataset. With this, they built a model of groundwater potential and integrated it with World Bank solar potential information to identify the best locations for solar-powered water pumps.
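For readers curious how several layers can be combined into one map, here is a minimal sketch of a weighted overlay. It is not Team #1's actual model: the layer names, weights, and 2x2 grid values are all hypothetical, and a real analysis would use a toolkit such as NumPy or xarray on full rasters aligned to a common grid.

```python
from typing import Dict, List

Grid = List[List[float]]


def normalize(grid: Grid) -> Grid:
    """Rescale a grid to the 0..1 range (min-max normalization)."""
    values = [v for row in grid for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant layer carries no ranking information.
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def suitability(layers: Dict[str, Grid], weights: Dict[str, float]) -> Grid:
    """Weighted sum of normalized layers; higher means more suitable."""
    total = sum(weights.values())
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    score = [[0.0] * cols for _ in range(rows)]
    for name in names:
        norm = normalize(layers[name])
        w = weights[name] / total  # weights are rescaled to sum to 1
        for r in range(rows):
            for c in range(cols):
                score[r][c] += w * norm[r][c]
    return score


# Hypothetical 2x2 grids; the values are made up for illustration only.
layers = {
    "groundwater_potential": [[0.2, 0.8], [0.5, 0.9]],
    "solar_potential": [[5.0, 6.0], [5.5, 6.5]],
    "population_density": [[100.0, 300.0], [50.0, 400.0]],
}
# Hypothetical weights; choosing them is a modeling decision.
weights = {
    "groundwater_potential": 0.4,
    "solar_potential": 0.3,
    "population_density": 0.3,
}

score = suitability(layers, weights)
# Grid cell (row, column) with the highest combined score.
best = max(
    ((r, c) for r in range(2) for c in range(2)),
    key=lambda rc: score[rc[0]][rc[1]],
)
print(best)
```

Normalizing each layer first keeps a variable with large raw numbers, such as population density, from dominating the sum simply because of its units.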

Additionally, the team recognized that pumping groundwater efficiently is not a complete solution to drought, so they went further and analyzed regional historical precipitation records to identify the best locations for capturing rainwater.
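One simple way to compare locations from historical precipitation records is to rank them by mean annual rainfall. The sketch below assumes monthly totals in millimeters covering whole years. The site names and numbers are hypothetical, and this is not necessarily the method the team used.

```python
from statistics import mean
from typing import Dict, List


def mean_annual_precip(monthly_mm: List[float]) -> float:
    """Average yearly total from a monthly series covering whole years."""
    if not monthly_mm or len(monthly_mm) % 12 != 0:
        raise ValueError("expected whole years of monthly values")
    yearly = [sum(monthly_mm[i:i + 12]) for i in range(0, len(monthly_mm), 12)]
    return mean(yearly)


def rank_sites(records: Dict[str, List[float]]) -> List[str]:
    """Site names ordered from wettest to driest."""
    return sorted(
        records,
        key=lambda site: mean_annual_precip(records[site]),
        reverse=True,
    )


# Hypothetical two-year monthly records (mm); values are illustrative only.
records = {
    "site_a": [10.0] * 12 + [20.0] * 12,
    "site_b": [30.0] * 24,
    "site_c": [5.0] * 24,
}

ranking = rank_sites(records)
print(ranking)
```

A fuller analysis would also look at seasonality and year-to-year variability, since a storage tank is only useful if rain arrives reliably enough to fill it.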

Slide 11. Solar Water Pump Locations Recommendations
Slide 14. Water Storage Tanks Location Recommendations

As you can see in the slides, there are certain locations where the main metrics align, highlighting areas of interest for future research. The results are preliminary so far, and advancing this analysis to a decision-making stage would require further effort: validating assumptions and including more local observations and other auxiliary data. However, this creative data-driven approach and cross-dataset synergy are extremely useful in providing hints and insights for the next generation of resilient infrastructure and policy.

Lessons learned

  • There is no such thing as being overprepared. We provided some technical documentation and examples ahead of the event, but not every participant had the time to try them. Next time we plan to provide more video content (demos and screencasts), so it is easier to learn while participating in the event.
  • Cater to less technical participants. Many people have domain expertise rather than computational skills, and while some teams were balanced between technical and non-technical members, others were a bit understaffed. For the next event, we plan to generate some static data in the form of CSV files and pre-rendered maps, so people who are not fully comfortable working with the data can use the Datahub with more ease.
  • Next time, we will also consider allowing more time to complete the project, so teams can take on more ambitious problems. It still amazes me how deep participants went in just 12–15 hours.

Special Thanks To:

  • Guest Speaker: Dr. Samir Raouf
  • Team Berkeley: Ayushi Gupta, Imanne Chaudhry, Simran Regmi, Eleanor Sobottka, and Saeed Nassef
  • Team Intertrust: Ambriel Pouncy, Zaki Alattar, Chase Walz, Carolina McClanahan, Brandi Firestine, Anthony Trejo, and Jean Michel Bosch
  • Mentors: Kristian Paljasma, Samiyeh Mahmoudian, Beto De Almeida
  • Technical support: Eneli Toodu, Andres Luhamaa

The next blog post will cover the 2nd and 3rd place winners of the Planet OS Datathon. Stay tuned!

If you’d like to be notified when new content becomes available, follow Planet OS on Facebook and Twitter or subscribe to the Planet OS newsletter.