In the UK, 20% of all drinkable water is lost and wasted during distribution.

How can we leverage modern technology and big data to help reduce this unnecessary water loss? Or, how to go to a hackathon, lose team members as well as your mind, and still be a #Winner.

Jamie Heuze
Aug 2, 2017 · 10 min read

To figure this out, Dootrix joined the Northumbrian Water Group (NWG) along with some of the brightest teams from all over the world for an Innovation Festival.

Located at the beautiful Newcastle Racecourse, the festival took place over five days and followed a Google Ventures (GV) design sprint. That was perfect for us, as Dootrix has used GV sprints for several years now to answer critical business questions through design, prototyping, and testing ideas with customers.

Until 10 to 15 years ago, corporations tried to deal with innovation internally. They had huge research and development departments and budgets. But in the modern innovation context, the process of bringing ideas into prototyping and into the market is so quick that you don’t have time to spend years and years with your internal department, waiting for something to happen.

Hans Moller - North East Enterprise Partnership’s Innovation Director.

The festival was split into six sprints, each posing a separate challenge for teams to solve. The challenge we decided to tackle was water leakage. Sprints were filled with data scientists, developers, engineers and designers from some of the biggest organisations, such as IBM, Microsoft, CGI, Ordnance Survey, BT and Accenture.

Our team from Dootrix consisted of Paul Glenister, Mobile Systems Manager at Clancy Docwra, who offered us invaluable insight into the water industry; Charlie Allen, our Delivery Director; Adam Hill, our software engineer; and yours truly, Jamie Heuze, product designer and Head of Design at Dootrix.

The team was, however, short by one: our data scientist couldn’t make it… Oh, did I mention the water leakage challenge was mostly about data science? 😱 Not to be defeated so easily, we hopped on our plane with a copy of Data Science for Dummies and headed to Newcastle.

On our arrival, the festival was buzzing. We didn’t expect it to be so “creative”, but it was, and there was an incredible energy in the air! Illustrators were creating 10ft posters mapping out customer stories, and bands were playing, providing an awesome soundtrack to the ideas being shared within the crowds. Drinks (Yup! 🍻) were flowing generously, and there was even a poetry stage for those who wanted to indulge in a little spoken word.

At the end of each day, there were campfire sessions to meet like-minded people, and the mornings offered yoga, dance and a rather exhilarating lemon & ginger boost juice to shake out the cobwebs.

OK, ginger shots done, it was down to business.

We made our way to the water leakage hackathon, ready to dive into the data and solve everything… Turns out, the datasets were huge and varied (read: obscure), so making sense of them was actually going to be our first challenge.

Data can be moulded into something that humans can comprehend and act upon. Adam and Charlie waded into the depths of the data, hammering it into some form of structure. I began to analyse the usefulness of the datasets as they appeared and explored their relationships. After some time we started to see patterns emerging, so Paul and I worked closely together to apply these theories to real-world scenarios.

We conducted some lightweight user research, interviewing Dennis Dellow (Network Manager at NWG) and Michael Hull (Performance & Information Team Leader at NWG) to get a better understanding of their daily responsibilities. By using the Jobs to Be Done (JTBD) framework, we were able to define how to deliver the most valuable insights to the right people in the most effective way. After a few failed attempts, pivots and questionable ideas, we settled on the one we believed offered the most scope to prototype.

Time to intervene

Our idea was to utilise historical data and other factors to predict when to intervene and check a water network’s ageing infrastructure. The initial factors we felt had a major impact on network stability were:

- Age of Pipe
- Soil Type
- Shrink & Swell
- Soil Corrosivity
- Population
- Pipe Type
- Pipe Dimensions
- Materials

Time to bring out the big guns and use modern technology to our advantage! Adam fired up Microsoft’s ML Studio and set out to train a predictive model to output an estimated time to intervene based on our data. It failed, and it failed hard. According to its outputs, the model was so unsure of its prediction that it couldn’t confidently say if it was wrong, right or even if it was meant to be predicting anything at all.

So, Adam rolled up his sleeves and got to work…on convincing the Microsoft team to help us figure out what was going on. To be fair, they had a team of evangelists at the festival to teach people how to use ML Studio, and this, being a new thing for us, was a perfect excuse to extract all that wonderful knowledge from the experts themselves. Thanks to Olivia Klose (Legend!) from Microsoft, we found that the data needed to be completely cleansed and “shaped” better in order to give the training model explicit instructions.

We cleaned the data but were still not happy with the results. The problem was that only 10% of our records represented burst pipes and so had a burst year; the pipes that had not yet burst of course had no value for that year. These null values were not helping. The next step was to include only records of actual bursts. This approach produced results, but all the predictions were in the past because, as far as the ML algorithm could see from the selection of records, “all pipes always burst between 2000 & 2015”, which is clearly not true.

Microsoft recommended taking a predictive maintenance approach, so rather than predicting the year in which the pipe would burst, predict its eventual lifespan. Initially, this had the same problem, but we then had our crucial light bulb moment and thought of a way to utilise the missing 90% of records!

We included the manufacturer’s expected lifespan for each pipe material and made a column which featured the actual life of known burst pipes, or the expected life of a pipe where it hasn’t yet burst. Although still slightly manufactured, given that the expected lifespan is not entirely accurate, the predictive model was outputting believable results and was factoring in the features such as corrosivity of the soil.
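The lifespan column described above can be sketched in a few lines of Python. This is a minimal illustration rather than the ML Studio setup we actually used: the `PipeRecord` fields, the material names and the lifespan figures in `EXPECTED_LIFESPAN_YEARS` are all made-up assumptions for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical manufacturer lifespans in years; illustrative numbers only.
EXPECTED_LIFESPAN_YEARS = {
    "cast_iron": 100,
    "ductile_iron": 80,
    "pvc": 70,
}


@dataclass
class PipeRecord:
    pipe_id: str
    material: str
    install_year: int
    burst_year: Optional[int] = None  # None when the pipe has not burst yet


def lifespan_target(pipe: PipeRecord) -> int:
    """Return the value for the training column.

    Burst pipes get their actual life (burst year minus install year).
    Pipes that have not burst fall back to the manufacturer's expected
    lifespan for their material, so no record is left with a null target.
    """
    if pipe.burst_year is not None:
        return pipe.burst_year - pipe.install_year
    return EXPECTED_LIFESPAN_YEARS[pipe.material]


pipes = [
    PipeRecord("A1", "cast_iron", 1950, burst_year=2008),  # known burst
    PipeRecord("B2", "pvc", 1985),                         # still in service
]
targets = {pipe.pipe_id: lifespan_target(pipe) for pipe in pipes}
```

The point of the fallback is that every record now carries a usable target, so the roughly 90% of pipes without a burst year can take part in training instead of being thrown away.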

We had done it! We wrestled with the data and now had a machine effectively predicting the future (of water mains at least). Now, all we had to do was look at our original user research and figure out how the hell we could present these magic numbers and algorithms to the people that needed to act on them.

We decided the best approach was to surface the data in 4 different ways. This allowed us to focus on the specific JTBD for each one of our use cases.

The initial landing page for the prototype offered the user the next ten most crucial mains that needed to be monitored. We used our machine learning algorithm to automatically assess the network, then we aggregated the effective factors to give each mains a life line, risk rating and an estimated time until failure.
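As a rough idea of how such a landing page list could be derived from the model's output, here is a small Python sketch. It assumes the model returns a predicted lifespan in years for each mains; the `MainsEstimate` fields, the risk thresholds and the function names are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class MainsEstimate(NamedTuple):
    mains_id: str
    install_year: int
    predicted_lifespan: float  # years, as output by a trained model


def years_until_failure(mains: MainsEstimate, current_year: int) -> float:
    """Estimated failure year minus the current year."""
    return mains.install_year + mains.predicted_lifespan - current_year


def risk_rating(years_left: float) -> str:
    """Map remaining years to a coarse rating (thresholds are assumptions)."""
    if years_left <= 5:
        return "high"
    if years_left <= 15:
        return "medium"
    return "low"


def most_urgent(
    network: List[MainsEstimate], current_year: int, limit: int = 10
) -> List[Tuple[str, float, str]]:
    """Return up to `limit` mains, soonest estimated failure first."""
    ranked = sorted(network, key=lambda m: years_until_failure(m, current_year))
    return [
        (
            m.mains_id,
            years_until_failure(m, current_year),
            risk_rating(years_until_failure(m, current_year)),
        )
        for m in ranked[:limit]
    ]


network = [
    MainsEstimate("M1", 1950, 70),
    MainsEstimate("M2", 1990, 60),
    MainsEstimate("M3", 1960, 65),
]
top = most_urgent(network, current_year=2017)
```

Sorting by the estimated time until failure keeps the list focused on what to check next, while the rating gives a quick at-a-glance signal.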

To allow users to manually assess a known water mains, we offered them a path to enter the Mains ID or reference for any mains. This lets them access all of the data and aggregated information provided within insights.

During the testing phase, we discovered something pretty incredible about our data modelling concept. We could actually get most of the data for our effective factors without laying a real pipe. This meant the network team could do a detailed cost analysis for a specific location and scenario, tweaking materials and seeing the expected lifespan of the pipe. The new projection tool would let the user plan and apply the correct infrastructure, potentially saving a huge amount of money, time and maintenance.

This feature was included to allow users to look at how a mains at risk can affect others around it over time. We felt it also gave execs and managers the opportunity to look at the network overall and assess areas for improvement at a high level, essentially looking at the bigger picture.


We were around the twelfth team to present our findings to the judges and we had witnessed some seriously clever data science being delivered!
The breadth of knowledge being presented had us a little shaken and we knew that without our data guy to hand, it was pointless trying to pretend.

Our plan was to focus on the user and offer actionable insights that helped the business, not just the scientists, gain a better understanding of what was happening across the water network.

With only a 4 minute slot, Charlie kicked off our presentation with a rapid explanation of our journey and how we came to present this specific idea.
I then proceeded to demo the prototype as one of our identified users, running through the JTBD scenarios we created.

The 4 minutes went fast! We covered as much as possible and believed that we could leave the stage with our heads held high, having done the best we could considering the circumstances. There was no chance of us actually winning, presenting data science solutions without any data science! 😂 Applause, stage exit left.

30 minutes went by and we had just about finished packing up. It had been a long day, so I decided to start chowing down on an apple that I’d forgotten about earlier. Suddenly:

Teams, could you gather round! The judges have come to a decision and are ready to announce our winners!

We would like to award our NWG innovation prize to a team that showed great creativity and for giving us some ideas to really consider moving forward. That team is…

What we learnt

  • Don’t start eating an apple just before they announce the winners. You never know, you might just be surprised!
  • On-the-ground experience should never be overlooked.
    Having Paul on the team really enabled us to pinpoint what was truly useful in the industry. We couldn’t have done it without you 🙌
  • Data is what you make it.
    We didn’t have all of the clever science, but we got the most out of what we had. By wrapping up our work and showing its effectiveness in managing time, money and effort, we made it something every business owner could relate to.
  • Talk to everyone, even strangers.
    I know, this seems daunting, just don’t get in their car or accept any candy. At Dootrix we try to speak to users as often as possible and we did exactly that in the sprint. Speaking to people from NWG and the other teams helped us create some really great user profiles and work out our important JTBD.
  • Ask for help, it’s a great path to being better at what you do.
    Machine learning was a completely new venture for us and it was a daunting task to undertake in just a few days. When we got stuck, we stepped back and reached out to our peers at Microsoft, who happily and skillfully showed us the ropes. Now we know a little more. Thank you!

What’s next?

We have had a lot of interest from the event and our team is continuing to work out how we can help other companies solve their business problems.
If you would like to know how we could help you or your organisation then simply let us know.

  • For the ‘Time To Intervene’ product
    The next step we see is adding some more valuable data to the algorithm. Adding factors such as pressure, flow and temperature would really help our model become more accurate. One of the benefits of machine learning is that it exposes holes in your data sets. We now have a better understanding of what types of data we need to collect in the future. This, in turn, can be fed back into training our model to increase our prediction accuracy.
  • For water loss in general
    We strongly believe that all natural resources are precious and should be respected. Dootrix HQ is based along the sunny south coast, many of us are surfers and we all love the water.
    In our research, we were saddened to find out that over one-third of that 20% of unnecessary water loss takes place in our homes and on our properties. As a society lucky enough to have clean, accessible water, let’s try and do what we can to not waste it. Little changes to how you think about and use water can save a lot!
    Here are some helpful tips on saving water and money >
    Also if you love water as much as us, show some love and support for Surfers Against Sewage >
  • Plastic Is Killing our Oceans —
    The Issues, Facts, and Possible Solutions

    Approximately 40% of the world’s 7.6 billion people live within 62 miles (100km) of an ocean coast. For the other 60%, some of whom may never have even seen an ocean, the seas still play a vital role in their lives…Read More of this Great Article by Wendy Lipscomb

Thanks for reading,

Jamie. ✌️❤️
