Data Science Master’s Students Build Tool to Protect the Ocean by Quickly Detecting Oil Spills

Ranadip Acharya, Snithika Reddy Kalakoti, Meng-Kang Kao, KT Norton, and Sameer Shinde are the team behind the Spring 2024 Hal R. Varian Award-winning project “OceanWatch”

Berkeley I School
BerkeleyISchool
7 min read · Jun 6, 2024


OceanWatch, the Varian Award-winning project by Master of Information and Data Science students Ranadip Acharya, Snithika Reddy Kalakoti, Meng-Kang (Scott) Kao, KT Norton, and Sameer (Sam) Shinde, aims to use radar technology to quickly and accurately detect oil spills in the ocean. We interviewed the team to learn more.

Photo courtesy of Scott (Meng-Kang) Kao

What inspired your project?

Meng-Kang: I grew up in Taiwan, an island with gorgeous coastal scenery and marine ecosystems. At the same time, its geographic location puts it on a very busy shipping route. Sadly, this proximity to marine traffic occasionally results in devastating oil spills that tarnish the coastline and strain cleanup efforts.

In December 2023, just before my capstone project started, yet another oil spill occurred. I was shocked to find that, while the local government did its best to respond quickly, no monitoring or alert mechanism was in use.

During my DATASCI 221: Modern Data Application class, I came across a paper utilizing satellite imagery and machine learning. I started to wonder, can such technology be used for oil spill detection?

As our team conducted market research, we all reached the same conclusion that current oil spill detection solutions are costly and manual. KT worked her magic to connect us with a UK government agency to discuss this matter. This affirmed our belief that our machine learning approach can make a positive impact.

What was the timeline or process like from concept to final project?

Snithika: Our journey with OceanWatch was a testament to our team’s cohesion and commitment. Assembling quickly, we wasted no time diving into research, meticulously reviewing literature and exploring available datasets right away.

By week 5, our vision for the project crystallized, and the name and the purpose of Project OceanWatch became clear: to combat oil spills and protect our oceans.

Week 6 brought clarity as we refined our approach to ensure feasibility within our timeline. We selected a proof-of-concept area for oil spill detection (the English Channel), defined our target audience (smaller government agencies without robust monitoring services), conducted subject matter expert (SME) interviews with agencies that had some form of monitoring, and scoped our minimum viable product.

Once our minimum viable product (MVP) design was scoped, Week 8 marked the consolidation of our tech stack. We dedicated our efforts to data engineering, meticulously evaluating and selecting models, and embarking on live website and feature development. Additionally, we conducted ongoing market research, integrating additional features like real-time email/text notifications for oil spill detection. These enhancements aimed to distinguish our project and position it as a smart solution for swift and efficient oil spill detection.

OceanWatch was operational by week 11, but our work was far from over: we continued seeking feedback from SMEs and government agencies in this space.

Our journey wasn’t just about data; it was about real-world problem-solving and making a difference in ocean conservation. In 12 weeks, we emerged with a practical tool for environmental protection.

How did you work as a team? How did you work together as members of an online degree program?

KT: Our team is composed of individuals from diverse backgrounds, making us well-equipped to tackle our capstone project effectively. Sam and Scott, both experienced in project management, devised a comprehensive project plan for the semester and kept our team on track with deadlines and deliverables. Ranadip and Snithika, with their strong engineering backgrounds, played crucial roles in building our data pipelines, ensuring the data was properly collected, processed, and prepared for analysis. As a machine learning engineer, I focused on model training and inference, applying advanced techniques to extract valuable insights from our data.

Despite our individual strengths, we fostered a collaborative environment where each team member was willing and able to contribute to all parts of the project. This teamwork ensured that we could support each other whenever challenges arose, leveraging our collective expertise to find solutions.

Participating in an online degree program, where many students balance full-time jobs, presents a unique challenge in executing a successful capstone project. Each MIDS class promoted collaboration through group projects, in-class breakout rooms, and interactive office hours. These opportunities allowed us to build strong working relationships and maintain effective communication, which were essential for our project’s success.

How did your I School curriculum help prepare you for this project?

Sameer: The MIDS curriculum was instrumental in preparing us and instilling the confidence needed to execute our ambitious capstone project. Each course was relevant and essential in building a strong foundation and specialized knowledge. While all courses contributed significantly, the following courses were particularly vital to our success:

  • Research Design and Application for Data and Analysis (201): This course was fundamental in honing our research skills. It provided valuable insights into how to effectively design research projects, formulate relevant questions, plan strategically, mitigate cognitive biases, address ethical considerations, and deliver findings through compelling narratives.
  • Applied Machine Learning (207): The course provided a comprehensive understanding of machine learning principles and practices. Learnings around experimental design, feature engineering, network analysis, and developing and evaluating machine learning models were indispensable.
  • Computer Vision (281): A critical component of our project was understanding and implementing convolutional neural networks (CNNs), which are pivotal in image processing tasks. This course thoroughly covered CNN mechanisms, allowing us to select and evaluate models that were best suited for analyzing satellite imagery. The hands-on experience with CNNs facilitated our ability to build and refine models that could accurately detect oil spills from satellite images.
  • Modern Data Applications (290): The diverse case studies presented in this course broadened our perspective on the various applications of data science. Exposure to numerous real-world scenarios sparked innovative ideas, including the use of synthetic aperture radar (SAR) images for our project. Learning from different case studies helped us think creatively about solving problems and inspired us to apply unconventional data sources and methods in our project.
OceanWatch demo video

Do you have any future plans for the project?

Sameer: The team remains committed to operating this product as a non-profit endeavor while actively seeking funding opportunities to address several areas for improvement. These include enhancing website usability and refining the model to increase accuracy and reduce false positives in oil spill detection.

In our ongoing pursuit of advancing the oil spill detection product (https://oceanwatch.live), we have outlined specific strategies for future development aimed at enhancing accuracy. First, we plan to address false positives by incorporating additional training data, implementing reward-based reinforcement learning techniques, and exploring the integration of contextual data. Second, we are keen to leverage the recently released SARDet-100K dataset to fine-tune our models using pre-trained weights, and to investigate the multi-stage with filter augmentation (MSFA) pre-training framework. Lastly, we intend to tackle the challenge of quantifying oil spill volume by exploring techniques such as SAR interferometry and passive microwave remote sensing data. Through these initiatives, we aim to continually enhance the effectiveness and precision of our oil spill detection system.
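As a rough illustration of what "reducing false positives" means in practice, the sketch below computes precision and recall for a detector at two alert thresholds. It is not the team's evaluation code: the function name, the scores, and the labels are made up for the example.

```python
from typing import Sequence, Tuple


def precision_recall(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall) when every score >= threshold is flagged as a spill."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))        # spills correctly flagged
    fp = sum(p and not y for p, y in zip(predicted, labels))    # false alarms
    fn = sum((not p) and y for p, y in zip(predicted, labels))  # missed spills
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical model scores and ground-truth labels (True = real spill).
scores = [0.9, 0.8, 0.6, 0.4, 0.3]
labels = [True, True, False, True, False]

low = precision_recall(scores, labels, threshold=0.5)
high = precision_recall(scores, labels, threshold=0.7)
```

In this toy data, raising the threshold removes the one false alarm without missing any additional spills. On real imagery the trade-off is rarely that clean, which is why additional training data and contextual signals matter.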

How could this project make an impact, or, who will it serve?

Ranadip: With over 10,000 billion tonnes of crude oil transported annually, the threat to marine ecosystems is significant. About 5,000 tons of oil enter our oceans each year, harming marine life such as whales, dolphins, and seabirds. Incidents often occur in vital marine protected areas. For instance, the Gulf of Mexico oil spill on April 20, 2010, released 134 million gallons of oil, impacting 2,100 km (1,300 miles) of the U.S. Gulf Coast.

While countries like the UK and USA have satellite imagery solutions for monitoring oil spills, smaller nations and agencies often lack these resources. Recent incidents, like the February 2024 oil spill off Trinidad and Tobago, highlight the urgent need for early detection and response to mitigate impacts on marine ecosystems and coastal communities. Even in wealthier countries, preventive measures often rely on reports from ships or oil rigs, showing gaps in marine protection.

OceanWatch offers a promising solution: a rapid, automated system for detecting oil spills using satellite imagery and machine learning. We retrieve synthetic-aperture radar (SAR) imagery from the Sentinel-1 satellite in near real-time. Because radar works day and night and through cloud cover, monitoring can run 24/7 regardless of time of day or weather. Our prediction engine analyzes these images, and alerts for potential oil spills are sent to subscribers via email or text. OceanWatch’s proactive approach supports timely response efforts.
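The flow described above (take in a SAR scene, score it, notify subscribers) can be sketched roughly as follows. This is a minimal illustration and not the team's implementation: the `Scene` type, the `dark_fraction` scorer, the `process_scenes` function, and both thresholds are hypothetical. The scorer stands in for a trained model and relies only on the general fact that oil slicks tend to appear as dark, low-backscatter patches in SAR images.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass
class Scene:
    """Hypothetical SAR scene: an ID plus a 2D grid of backscatter values in dB."""
    scene_id: str
    backscatter_db: Sequence[Sequence[float]]


def dark_fraction(scene: Scene, dark_threshold_db: float = -22.0) -> float:
    """Toy stand-in for a trained model: share of pixels darker than an arbitrary cutoff."""
    pixels = [value for row in scene.backscatter_db for value in row]
    if not pixels:
        return 0.0
    return sum(value < dark_threshold_db for value in pixels) / len(pixels)


def process_scenes(
    scenes: Iterable[Scene],
    score: Callable[[Scene], float] = dark_fraction,
    alert_threshold: float = 0.25,          # arbitrary value chosen for the example
    notify: Callable[[str], None] = print,  # a real system would send email or text
) -> List[str]:
    """Score each scene and notify subscribers about the ones that look suspicious."""
    flagged = []
    for scene in scenes:
        s = score(scene)
        if s >= alert_threshold:
            notify(f"Possible oil spill in scene {scene.scene_id} (score {s:.2f})")
            flagged.append(scene.scene_id)
    return flagged


# Two made-up scenes: open water, and one with a large dark patch.
calm = Scene("calm", [[-10.0, -12.0], [-11.0, -9.0]])
slick = Scene("slick", [[-25.0, -26.0], [-24.0, -10.0]])

messages: List[str] = []
flagged = process_scenes([calm, slick], notify=messages.append)
```

Passing `notify` as a parameter keeps the detection logic separate from the delivery channel, so the same loop could feed email, text, or a dashboard.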

Additional info to share?

Ranadip: What sets OceanWatch apart is its commitment to inclusivity. Recognizing the resource constraints faced by smaller countries, OceanWatch offers a cost-effective solution that relies on publicly available satellite data and intends to operate as a non-profit. The cost of the service is limited to the cloud cost of running it, and thanks to the fast inference speed of our model, we were able to use the cheapest AWS EC2 hardware. We hope this will facilitate precise response efforts, something previously unheard of, and ultimately contribute to the preservation and protection of invaluable marine ecosystems and the livelihoods of coastal communities.

Our team is revolutionizing oil spill detection with real-time monitoring that allows for early detection and rapid response to safeguard marine ecosystems and coastal communities. Join us in creating a cleaner, safer ocean for generations to come.

Meng-Kang: I am really grateful that we have a devoted team. Sam and I decided to work together during our August 2023 immersion. Although we didn’t know exactly what we wanted to work on, we both planned to take only the capstone class in the final semester. When the team was finalized in week 2, all five of us shared the same commitment to push ourselves to the limit and see how far we could go. This level of motivation made our capstone experience extremely enjoyable.

Photo by Noah Berger for the School of Information

The UC Berkeley School of Information is a multi-disciplinary program devoted to enhancing the accessibility, usability, credibility & security of information.