Machine Learning for Road Condition Analysis Part 2: “When Data Attacks”

James Goulding
Frontier Tech Hub
Jun 18, 2020 · 12 min read

In the first blog of this series I talked about the importance of local partnerships on development projects. It’s all too easy, especially in AI and tech-heavy programmes, to arrive with a “we know best” attitude, only to make errors by not knowing the lie of the land. On the Z-Roads research project, investigating whether road conditions can be automatically surveyed using AI and drone imagery, the N/LAB team had arrived prepared: a well-planned schedule, well-formulated deliverable dates, a range of quality checks. Sat in the offices of our East African partners, they looked at the plan, laughed and pointed out the window. “Survey now and survey quick” was the message. The rains were coming.

A flood of data challenges

With the ball firmly in our court, the N/LAB survey team reacted admirably, stepping up to the plate and performing a data-collection slam-dunk. In rapid time they gave us 700km of “ground-truth” road sensor data, and me a shameful excuse to crowbar three sporting metaphors into a single sentence. Confidence was clearly running high.

And at that point it seemed justified — we had our survey data; we’d prepped our AI models in parallel (a “transfer learning” approach, using pre-trained deep learning architectures); drone imagery was on its way. Everything was on track.
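For the curious, “transfer learning” here means something like the minimal sketch below: reuse a network pre-trained on a huge generic image dataset, and retrain only a small new head for the task at hand. This is a PyTorch-flavoured illustration under assumed details (a ResNet-18 backbone, a four-class condition schema), not the project’s actual code.

```python
# Minimal transfer-learning sketch (illustrative, not the project's actual
# code): reuse an ImageNet-pre-trained backbone, retrain only a new head.
import torch
import torch.nn as nn
from torchvision import models

NUM_CONDITION_CLASSES = 4  # assumed schema, e.g. good / fair / poor / very poor

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained convolutional backbone...
for param in model.parameters():
    param.requires_grad = False

# ...and swap in a fresh classification head for road conditions.
model.fc = nn.Linear(model.fc.in_features, NUM_CONDITION_CLASSES)

# Only the new head's weights are updated during training.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```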

Little did we know what was ahead. The Z-Roads project, carefully designed as it was, was about to hit the realities of working with real data in a real development context.

A seemingly endless flood of data challenges lay ahead: missing data, vanishing hard drives, warped drone imagery, misaligned road networks, inappropriate road condition schema…

You always expect data wrangling. But perhaps not the “open warfare” that ensued on the Z-Roads project. We survived, and the data battle was eventually won, but the digital scars remain. Below I retell some of this saga, and some of the lessons that developing AI in East Africa brings. One mantra underpins many of these lessons:

In development projects, assume nothing about your datasets. Ever.

Lesson 1: Data often goes missing in action

The first battle faced by the Z-Roads project was, to be honest, a simple one. The project data didn’t turn up.

A simple yet delicate problem! You see, all “supervised learning” AI projects hinge on their “training” data. This is the data used to teach the AI model exactly what it should be looking for. For Z-Roads, this was drone imagery covering the whole extent of Zanzibar — vast tiles from which we would extract smaller road-segment images and attach road condition scores to them (measured on our survey). These pairs are fed into machine learning algorithms, which learn patterns from them.
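To make that pairing concrete, here is a toy sketch of the assembly step. The file names and column names are hypothetical; the point is simply that each segment image gets joined to its ground-truth survey score.

```python
# Toy sketch of assembling supervised training pairs (file and column
# names are hypothetical): each road-segment image cut from a drone tile
# is matched to the condition score measured on the ground survey.
import pandas as pd

segments = pd.read_csv("road_segments.csv")      # segment_id, image_path
survey = pd.read_csv("ground_truth_survey.csv")  # segment_id, condition_score

training_pairs = segments.merge(survey, on="segment_id")

# Each row now holds an (image_path, condition_score) pair: exactly the
# (X, y) examples a supervised learner needs.
print(training_pairs[["image_path", "condition_score"]].head())
```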

But this key project data was apparently missing in action.

No one seemed to have any idea where this data was, or how long it would take to turn up. This is not normal for an AI project, where transfer of the core “training” dataset is traditionally quite straightforward. Often you can just grab it in raw CSV form from one of the many online data repositories; or perhaps you get sent a carefully formatted Hadoop dump via SFTP from your affluent project partners in London; or maybe you just generate it yourself from simulations or reinforcement learning strategies. Easy.

That’s the stuff of dreams for a real-world development project. In developing contexts, datasets are likely to be volatile, noisy, and sparse: pieced together from different time points, perhaps thanks to the hard efforts of many different local surveyors. A jigsaw nonetheless. And this was the situation here.

We learnt, about one month in, that the raw drone data that was supposed to feed into the project had some serious technical issues. Oversights made during its collection meant the key process of “stitching” images from individual flights together was far more challenging than normal. And the supplier tasked with this job had completely underestimated how hard it would be. This wasn’t the neat data they were used to — it was gritty, messy, and collected by dedicated but less-experienced enthusiasts, working in challenging conditions.

Worse still, the challenges of communication across continents, so common on international development projects, meant everyone was left in the dark as to what was wrong. Soon, three months of frustrating endeavour trying to track the problems down had gone by, only for us to find out that the supplier had now gone under. AlphaGo never had this problem.

Frantic triage. And then finally, after months and months of to-and-fro (and much hard work on the part of the World Bank, DFID and FTL trying to do damage limitation), a “stitched” dataset of image tiles thankfully emerged. After seemingly endless email chains, we could at last get back on track.

Our imagery dataset was ready and waiting in Dar es Salaam... Now we just had to get our hands on it.

Lesson 2: Until you see the data, it doesn’t exist.

Immediately we set about organising an SFTP transfer. Despite the best efforts of local IT folks, this failed repeatedly, wasting further days. No problem — it’s understandable given the context (the infrastructure to transfer datasets of this scale in East Africa simply isn’t mature enough yet). The solution was to just courier the data to our lab. Not ideal in terms of carbon footprint, but needs must given the delays. And soon a hard drive was, thankfully, whisking its way half-way across the world!

If only the right address had been put on the package by the local team.

To be fair, the hard drive did admirably well for the first 7000 or so miles of its transit! It was only in the last few miles, in the UK, that it stumbled when, lacking a final destination, some hard-working post-office employee had to send it right back again, like some data-laden Tanzanian homing pigeon.

After the months of delay we’d already encountered, there was simply no longer any choice. A Z-Roads team member got on a plane, left the UK, landed in Tanzania the next day, and copied the data onto our media by hand. Perhaps we should have done it at the outset. Perhaps we should have demanded to do the raw imagery processing ourselves. Either way, the lesson is clear — never underestimate the logistical challenges of working in international development, and get control of the data you need for your AI project yourself, as early as humanly possible.

Lesson 3: Drone Imagery data is a dangerous foe

Okay, so after months of frustration, and Gantt charts being ritually burnt as they went out of date, we finally had the imagery data in our possession. Champagne all round. We were off.

Except we weren’t.

Within hours of getting it back to N/LAB it became clear that the newly “processed” drone imagery data had major problems. Before I recount these technical challenges, I think it’s important to set the scene: first, the Zanzibar Mapping Initiative that collected the raw drone data is incredible. Run by the State University of Zanzibar, and supported by the World Bank and COSTECH, the project has produced some of the highest-resolution imagery data available anywhere in the world. It is used to make advances in mapping, navigation, building counts — all sorts in fact (and is publicly available here, so you should check it out). Simply put, the data looks great and it’s a super programme.

But the data was not aimed at road condition AI. In fact it was far from what was needed. The problem for us was fidelity. Drones are not satellites. Satellites sail serenely through the near-vacuum of space, glamorously taking pin-point snapshots of the earth, gloriously unimpeded and generally spending their time making drones jealous. Drones are instead involved in something more akin to guerrilla warfare. Constantly battling with the wind, they tilt and yaw, rise and fall. Every image is skewed from a slightly different angle and height, subject to different light and different GPS accuracy. The result is a dataset that faces a plethora of AI-impeding issues:

Misaligned Data

Most egregious for data preparation is the fact that the raw images recorded in one individual flight will not naturally align with those taken in another (perhaps even an hour later, and covering the same area). This makes stitching images from each flight together to make an overall “tile” extremely hard (and often more an ‘art’ than a science).

The problem is, without a professionally orchestrated set of control reference points to work from (defined on the ground for each flight), you will never exactly match the true lie of the land. Without control points, yes, images can be made to look beautiful — but no stitching efforts will make them match reality. Image areas will be geometrically skewed and tiles will misalign, each claiming to be the ‘true’ rendering of the land, as shown in Figure 1 below:

Figure 1. Example of inconsistencies across tiles, which mean no road network dataset can align perfectly with all sections of the data without intensive post-processing. LEFT — misalignment of tiles due to different flight nadir (angle of camera). RIGHT — overlapping tiles with pixel “disagreement” (in white).

The data Z-Roads received suffered from this problem significantly. Misalignment was cropping up everywhere, rendering roads disconnected and shifted from the real ground truth. Tiles that themselves overlapped turned out to have completely different opinions of what pixels should be where — the white areas in the above image signify the regions where pixel content was “completely up for debate” due to the lack of georeferencing. Establishing which tile was right in these areas was impossible (though, as the sketch below shows, measuring the disagreement itself is straightforward). And quite frankly it didn’t matter, as the issue of “warping” was even more serious…
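As an aside, here is a minimal sketch of how that pixel “disagreement” between two overlapping tiles could be quantified. It assumes the overlapping region has already been cropped out of each tile as a NumPy array; the threshold is illustrative.

```python
# Minimal sketch: flag pixels where two tiles covering the same ground
# disagree beyond a tolerance (the overlap crops and threshold are assumed).
import numpy as np

def disagreement_mask(tile_a: np.ndarray, tile_b: np.ndarray,
                      threshold: int = 40) -> np.ndarray:
    """Return a boolean mask: True where co-located pixels disagree."""
    diff = np.abs(tile_a.astype(np.int16) - tile_b.astype(np.int16))
    return diff.max(axis=-1) > threshold  # max over colour channels

# e.g. disagreement_mask(overlap_a, overlap_b).mean() gives the fraction
# of the overlap that is "up for debate".
```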

Warped Image Data

For many applications slight misalignment isn’t a problem. For Z-Roads analyses, when you’re trying to extract the pixels along a 5m-wide road, it is a serious issue. But it’s even worse when these shifts in alignment are happening all over tiles, and you can’t even detect them. You see, when imagery is taken from different flight angles (different ‘nadir’), it is really hard to pinpoint actual locations without georeferencing. The two images in Figure 2 below illustrate how bad this can be, showing two DroneDeploy images of the same area, taken at different times. Features in the landscape that could be used as anchor points for stitches… appear to be in different places.

Figure 2: DroneDeploy images showing three identifiable visual features that help image-processing software stitch one image to an adjacent one. The same area, recorded on a different flight run, shows just how much the nadir, height and angle of flight can shift perception of the underlying land. When images from different angles are “stitched” together, you are left with a seemingly “warped” image tile.

The issue is simply due to flight vectors, wind, GPS vagaries and nadir differing for each flight, but it is incredibly problematic. As images are stitched into tiles, unless you can lock onto “true” georeferenced points, you can’t tell where any pixel should exactly be. The result is a “warping” effect in the processed tile, with shifts occurring in different directions, inconsistently, across the tile.

There is no way a single translation can fix this problem. And once this warping is embedded in your tiles, it is not only really hard to even see, it lays waste to analysis. Even a 10m shift of a section of an image tile from reality means an algorithm trying to extract roads can miss them completely. Without unfeasibly time-intensive human supervision, your AI can end up assessing the road condition of a row of hedges. The sketch below makes the arithmetic plain.
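To spell it out: at 30cm per pixel, a 5m-wide road is only about 17 pixels across, while a 10m warp is a shift of roughly 33 pixels. Here is a hedged sketch of the extraction step that warping breaks, using rasterio and shapely; the file path and road coordinates are placeholders, not project data.

```python
# Sketch of road-pixel extraction (paths and coordinates are placeholders).
# At 30 cm/pixel a 5 m road is ~17 pixels wide; a 10 m (~33 pixel) warp
# therefore moves this corridor completely off the road surface.
import rasterio
import rasterio.mask
from shapely.geometry import LineString

road = LineString([(530000, 9300000), (530200, 9300150)])  # hypothetical UTM coords
corridor = road.buffer(2.5)  # 2.5 m either side of the centreline = 5 m corridor

with rasterio.open("zanzibar_tile.tif") as src:  # placeholder tile
    pixels, transform = rasterio.mask.mask(src, [corridor], crop=True)

# If the tile is warped relative to the vector, `pixels` may now contain
# hedgerow rather than road surface.
```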

Blurred and Overexposed Data

Once we’d recognised the warping problem, in the weeks of AI-model piloting that followed we also started to notice just how badly blurred and over-exposed many roads were. These are artefacts that are hard to avoid in image capture without a good deal of piloting experience, and that experience is often less available in development projects.

The first image in Figure 3 below highlights how easily blurring occurs when a drone is flown too quickly. Almost all data useful to an AI is lost. The middle image shows that things can get even worse if the pilot uses inappropriate camera settings, with features such as roads becoming completely overexposed. Both situations degrade images to the point that all information relevant to road conditions is lost (contrast this with the final image from the data, which was correctly captured and has clear road conditions to be analysed).

Figure 3. Two examples of the highly common problems occurring within the drone imagery: LEFT — heavy blurring. MIDDLE — over-exposure. RIGHT — a high-quality, usable comparison with clearly visible road conditions.
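Both failure modes are at least cheap to screen for automatically. A minimal sketch, assuming OpenCV and illustrative thresholds that would need tuning on real tiles: the variance of the Laplacian as a blur score, and mean brightness as an overexposure check.

```python
# Cheap screening checks for the two failure modes above (thresholds are
# illustrative and would need tuning against real imagery).
import cv2

def is_blurred(image_path: str, threshold: float = 100.0) -> bool:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Low variance of the Laplacian = few sharp edges = likely blurred.
    return cv2.Laplacian(gray, cv2.CV_64F).var() < threshold

def is_overexposed(image_path: str, threshold: float = 230.0) -> bool:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # A very high mean intensity suggests washed-out, overexposed pixels.
    return gray.mean() > threshold
```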

The result is a dataset that is so far from “AI” ready that it is almost impossible to work with. Garbage in. Garbage out.

An old-fashioned solution

Unfortunately, these distinct imagery problems were only identified over several months. We’d worked with drone imagery before, but not with these pixel-level challenges. It was a frustrating battle, made worse by the fact that the problems we encountered seemed obvious in hindsight. But we had 70,000 image segments in our system, and the lack of fidelity was hitherto unknown. These things are not obvious when your data is so large you can’t just ‘look’ at it.

We eventually bit the bullet and set up a manual quality control process. Over the next two weeks a sub-sample of 15,000 tiles (from the project’s original total of 70k) was filtered for fidelity — painstakingly, by hand. The task was laborious even with the dedicated interface the team built, and it was powered pretty much by a significant supply of mini-cheddars and halves of porter. But in the end the Z-Roads team crafted approximately 5000 usable image segments for the project (and, ironically, a labelled dataset that would allow an AI layer to take over this task in the future!). While not ideal, we could at least make headway towards our AI goals, albeit with a diminished dataset.
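Our actual interface was more involved, but the core of such a QC pass can be as simple as the following keep/discard loop (the file layout is hypothetical). Usefully, the decisions themselves become training labels for a future automated filter.

```python
# Minimal keep/discard review loop (file layout is hypothetical); the
# decisions double as labels for training an automated quality filter.
import glob
import matplotlib.pyplot as plt

kept = []
for path in sorted(glob.glob("segments/*.png")):
    plt.imshow(plt.imread(path))
    plt.title(path)
    plt.show(block=False)
    if input("Usable? [y/n] ").strip().lower() == "y":
        kept.append(path)
    plt.close()

with open("usable_segments.txt", "w") as f:
    f.write("\n".join(kept))
```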

In all development projects you’re probably going to be facing secondary data of this type. Our drone imagery inputs were generated by volunteers doing the best they could, and somehow cobbled together without geo-reference points. The fact that the data exists at all is incredible. However, if we’d been able to demand access earlier, perhaps even getting involved in its processing, we could have saved months of delays (perhaps even having the chance to interpolate with satellite data to fix the warping).

Really do get control of your data as early as you can in development research.

Lesson 4: Road Network data is just as bad

Of course it didn’t end there. The next problem hit quickly. When you’ve spent months fighting drone data because it is ‘warped’ to the extent that it only vaguely matches the actual ground truth, it’s somewhat disheartening to then find out that the road network vector data, which you will use to ‘cut out’ road segments from images, is also wrong. We had not one but two datasets that were misaligned with reality.

But this just reiterates the point — data in development projects will be noisy and volatile. In the UK, the road vector data generated by Ordnance Survey is pin-point accurate (a consequence, I assume, of the organisation being set up to pinpoint French artillery over two centuries ago). OpenStreetMap data in most western countries echoes this sort of accuracy, verified and constantly iterated by an army of smartphone-bearing surveyors. This is not the case in East Africa. The problem with our (again ‘secondary’) road network data was that the mappers who created it used too few points, at insufficient GPS accuracy. The result was road vectors which (again) shift inconsistently away from reality, and many a bend chopped off completely due to insufficient sampling points:

Figure 4: LEFT: OSM road network data (dotted yellow) often consistently ‘misses’ the ground truth of where a road physically is. RIGHT: even when the network data is aligned with roads, low sampling in the OSM digitisation (yellow) means that corners are often lost.

Once again, we had no choice but to fix things by hand. Two different datasets, both mismatched with reality… but in different ways.

We ended up hand-tracing all roads on the drone images. All 700km, at 30cm resolution. This required the development of yet another interface, yet more mini-cheddars, and the application of skills honed in a youth wasted hunched over an Etch A Sketch. But once traced, the resulting vectors could be computationally aligned with the road network vectors (a sketch of that step follows below), guaranteeing accurate road-segment images.
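A minimal sketch of that alignment step, assuming shapely (version 2 or later) and toy coordinates: densify the sparse OSM vector so it has enough vertices, then snap those vertices onto the hand-traced centreline, which we treat as ground truth.

```python
# Toy sketch of vector alignment (coordinates are made up): densify the
# sparse OSM line, then snap its vertices onto the hand-traced centreline.
from shapely.geometry import LineString
from shapely.ops import snap

osm_road = LineString([(0, 0), (50, 2), (100, 0)])  # sparse, shifted
traced_road = LineString([(0, 3), (25, 6), (50, 5), (75, 6), (100, 3)])

densified = osm_road.segmentize(5.0)  # insert a vertex at least every 5 m
aligned = snap(densified, traced_road, tolerance=10.0)  # pull onto the trace
```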

We had our data. Diminished in size, yes. Painfully generated, true. But after half a year of delays, we could at least start development of our AI.

In conclusion:

There were in fact many more data challenges that Z-Roads faced, but this (already long) blog is too short to cover them. Rampant foliage obfuscating roads is a major challenge. Rainy seasons generate mud and transient visual artefacts that you have to account for. A single large rock in the road that the survey car hit made road segments appear far bumpier than they actually were. And don’t even ask about the many images we found out, late in the day, had been down-sampled without our knowledge to protect “sensitive sites”. The challenges are seemingly endless.

But in the end you do get there. And it’s important to persevere, as otherwise all the advances AI can make just benefit the people who probably need them the least — us.

And next time it won’t be so bad. Because next time we’ll assume nothing about the datasets incoming to an AI project. Ever.

The final blog in this series can be found here: Machine Learning for Road Condition Analysis Part 3: “No-surrender Deep Learning”


James Goulding
Frontier Tech Hub

Associate Professor in Data Science at the University of Nottingham, and co-director of the N/LAB research centre. He is also a founder of DAMSL consulting.