Looking at data flows to reduce resource consumption

by Camille Maumet | A spotlight on floWeaver, a 2018 Global Sprint project

Mozilla Open Leaders
Read, Write, Participate
May 8, 2018

Rick Lupton (@ricklupton) is a research associate with the Use Less group at the University of Cambridge where he looks at how resources and materials are used throughout society. By understanding how resources are consumed, he wants to help us use less and reduce our impact on the environment. Rick was selected to join the current round of Mozilla Open Leaders with his project floWeaver, which creates visualizations for flow data.

I interviewed Rick Lupton to learn more about floWeaver and how you can help at Mozilla’s Global Sprint 2018.

What are “flow data”?

Many kinds of data can be thought of as “flows”: energy and materials moving through industry, money flowing through the economy, telephone lines moving between providers, voters moving between parties, even the target of tweets. Most of my research is concerned with actual physical flows of energy and carbon-intensive materials such as steel and cement, but floWeaver can be used to visualise anything that can be thought of as flowing from one place to another.
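To make the idea concrete, here is a minimal sketch of flow data as a list of (source, target, value) records, with a helper that totals what enters and leaves each place. The record layout, the `flow_totals` name, and all the numbers are assumptions invented for illustration; this is not floWeaver's own data format or API.

```python
from collections import defaultdict

# Hypothetical flow records: (source, target, value).
# Names and numbers are made up for illustration only.
flows = [
    ("steel mill", "car doors", 60.0),
    ("steel mill", "offcuts", 40.0),
    ("offcuts", "small parts", 15.0),
    ("offcuts", "recycling", 25.0),
]


def flow_totals(records):
    """Return (inflow, outflow) dicts giving the total value per place."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in records:
        outflow[source] += value  # value leaving the source
        inflow[target] += value   # value arriving at the target
    return dict(inflow), dict(outflow)


inflow, outflow = flow_totals(flows)
```

In this toy dataset everything that arrives at "offcuts" also leaves it, which is the kind of balance a flow diagram makes easy to see at a glance.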

What is floWeaver?

floWeaver’s aim is to provide open tools for working with and visualising flow data. It consists of a Python library for data processing, a d3 JavaScript library for creating Sankey diagram visualisations, and Jupyter widgets that let you use it all together for interactive data analysis in Jupyter notebooks. We’re working on defining an open data format for flow data to make it easy to exchange data between different software and grow the ecosystem of open tools.

Why is it useful to visualize flow data? What is a Sankey diagram?

A Sankey diagram is a kind of flow diagram with thick arrows, where the width of the arrow represents the magnitude of the flow. They have been used for over 100 years to identify inefficiencies and find opportunities to use resources more efficiently. I find them useful because they give a great overview of a system to show which potential changes are big enough to matter in the big picture. For example, this diagram shows global greenhouse gas emissions broken down to show which sectors they are associated with, the technical device that was involved, and so on.
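The width rule can be sketched in a few lines: each arrow's width is its flow value multiplied by one shared scale factor, so the largest flow gets the widest arrow. The `link_widths` function, the `max_width_px` parameter, and the numbers below are hypothetical and only illustrate the proportionality; they are not how floWeaver or its d3 library computes a layout.

```python
# Hypothetical flows: (source, target, value). Numbers are invented.
flows = [
    ("energy", "industry", 80.0),
    ("energy", "transport", 20.0),
]


def link_widths(records, max_width_px=40.0):
    """Scale every flow so the largest one is drawn max_width_px wide."""
    largest = max(value for _, _, value in records)
    scale = max_width_px / largest  # pixels per unit of flow
    return {(source, target): value * scale for source, target, value in records}


widths = link_widths(flows)
```

Because every arrow uses the same scale, a flow that is four times larger is drawn four times wider, which is what lets a reader compare magnitudes across the whole diagram.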

Why did you start floWeaver?

In my research group we draw a lot of Sankey diagrams to visualise our work, so it made sense to build some tools to make this easier. It started with a research project working with a steelmaking company to understand how scrap could be avoided during the manufacture of steel parts, such as car doors. When you cut out a steel part from a roll, you get offcuts round the edges, just like when you cut shapes out of cookie dough or pastry, and we wanted to find out how much of the offcuts could be reused to make smaller parts. To do this, the steel type, thickness, coating, and so on, have to be compatible, but we didn’t have any way of visualising these attributes within the context of the steel production chain. To solve this I developed what I called a “hybrid” Sankey diagram to show these details, and floWeaver grew out of that.

What challenges have you faced working on this project?

There are technical challenges, such as doing automatic graph layout of the diagrams, and calculation speed for large datasets. We’re defining a data format for exporting the initial automatic-layout Sankey diagrams into other tools for tweaking the layout, but keeping them all working together as the format has evolved has been a nightmare. I’m looking forward to when we’ve got a draft we’re happy with and everything will just work!

As the leader of the project, there have been challenges in supporting more users as the project grows, and making sure to communicate effectively with and involve the wider range of people who could benefit from it, beyond my research group. The Mozilla Open Leaders programme has been very helpful in this process.

What kind of skills do I need to help you?

There are lots of ways to contribute to floWeaver, needing a range of skills:

  • We are curating a gallery of examples of how people are applying the project to all kinds of data: bring your own data and give it a go!
  • Gathering feedback on what’s easy to use and what needs improving & documenting.
  • Specifying a JSON-based open data format for interoperability between Sankey diagram tools, writing converter scripts and prototyping a web-based Sankey editor.
  • Improving the graph layout algorithms and visualisation.
  • Profiling and optimising the Python code to work faster on larger datasets.

I did have “creating a logo for the project” on the list, but someone has kindly already volunteered to do that!

How can others join your project at #mozsprint 2018?

Have a look at the “what shall I do?!” board for ideas, and see CONTRIBUTING.md on GitHub for more details. If you have any questions about how to join in, please open a GitHub issue or ask on Gitter chat.

Is there anything else you’d like to add?

I’m looking forward to taking part in my first Mozilla Global Sprint and seeing what comes out of it!

Join us wherever you are on May 10–11 at Mozilla’s Global Sprint to work on many amazing open projects! Join a diverse network of scientists, educators, artists, engineers and others in person and online to hack and build projects for a healthy Internet. Register today!

This post by Camille Maumet is licensed under a Creative Commons Attribution 4.0 International License.

