How we combined some machine learning methods for traffic prediction.

Published in

NeshanMaps

6 min readApr 3, 2020

More than 2 years ago we started a traffic estimation service at Rajman Information Structures` R&D team. After a while Neshan Navigator born. We needed a traffic prediction to base our routing and navigation service on it. This story published 2 years ago on hackernoon. Now I want to add some extra work that we did on that service during these 2 years.

As a data scientist, traffic prediction has always been a great challenge for our team. We use a machine learning algorithm for traffic estimation and a navigation system based on our live traffic estimated data. We have built a simple traffic estimation prediction that is used to predict navigation travel time. The output of our services is surprisingly accurate. On top of that, our travel time prediction has shown brilliant results, except in anomalies like vacation days or unpredictable changes in traffic behaviors. These anomalies happen less than 20 percent of the time but our users need more accurate prediction especially in anomalies moments, so here I will talk about the approach we took to solve the problem.

There are many kinds of solutions for traffic prediction like ARIMA, SARIMA, Kalman filter, particle filter, and theoretical traffic propagation methods. But in the real world, when you must respond to your users in less than half a second and you have a huge amount of dirty/missed/imbalanced data, employing these methods does not seem to be working.

So, here is what we did to predict traffic:

As an initial idea, I proposed to use a simple linear algebra-based method to find the most similar traffic state to live traffic. Suppose we found the most similar state to the current state. We assume that 20 minutes later from now, the traffic should be similar to 20 minutes after the state we found as similar. This method had a bunch of errors. For example, when you use a dot product, you can not avoid the effect of an accident in the north-west of a city on the traffic state of the south-east in predicting similar states.

Then we used a method to do a local search in order to find the most similar traffic state in our traffic history. The whole idea is very simple. Cut the city map into some pieces. For example, we used square cells with 200m * 200m dimensions. Then we defined a mesh that consists of 5*5 cells, so its area was 1Km². Each mesh had 9 collections of 3*3 cells as illustrated in the following figure.

The basic idea of building mesh is borrowed from convolution neural networks.

The traffic in this local area will be predicted by its own data. The whole amount of data is huge, on the other hand, we face totally missed and imbalanced data in smaller pieces. What we needed here was feature engineering on our data. I defined some features for all traffic data in each cell and did some matrix operations on each mesh and evaluated the live state of the mesh with its historical states. Finally, I found the best method to find the most similar state is to consider a 25-dimensional space so that each mesh is a 25-element vector. Now we can calculate the dot product between each historical state with live traffic state and find the most similar state very fast by the use of cosine similarity. After finding a similar state, in the next step, we tested our initial idea. We examined if we can consider 20 minutes after that state as a prediction of 20 minutes later traffic from now on. The answer is yes. The predictions are amazingly fast and accurate. Another advantage of this method is the fact that it can predict traffic in a database layer via SQL queries.

How to get rid of the network`s issues that cause an error in traffic predictions? Here we go to the last part of our traffic prediction journey. When we are working on an open street map network graph, it always has issues like mistaken dead ends and some reverse directions. So when you are predicting traffic on this network the results may be incorrect. Assume each state of traffic as an image. We can take an image from each mesh by opening our lens for 10 minutes and then take a shot. Next idea is to create a HOG from this images. HOG is a very simple representation that captures the basic structure of a face in a simple way. For more information on HOG please read this post.

The original image is turned into a HOG representation that captures the major features of the image regardless of image brightness.

We have some traffic movements in our mesh. Each of them has a location and a velocity. This velocity has a direction and a magnitude. So we can define our HOG by using their location and normalizing the magnitude of the velocity for the length of that line on the image. Then we divide the length of each line into 2 equal parts and will use intense white color to represent the vector direction. For instance, if we have a movement from north to south, the top north part of the line in the image will have 2 times more intense than the bottom south part.

Now we converted traffic prediction into a near duplicate image search. If we want to find the most similar image very fast we can create a hash from the image of each state. As I have tested some methods, it seems Locality Sensitive Hashing has very good results in this case. For more information, I recommend reading this post about finding near-duplicate images very fast.

This method can be applied in a variety of situations and can easily predict traffic on any graph network. Even if our information about the network has some errors or enough knowledge does not exist, we can use the last part to predict traffic. Applications of this method can help us to predict website traffic in a black Friday by collecting social media data about the site. Furthermore, we can use this approach in case of fraud detection, finding anomaly behavior in network traffic, aerial traffic prediction, and maybe in many other situations which you may know.

After a while of running this service, We found a critical issue. In the real world application, even after applying lots of ideas, we can not find a good similar image to the image that describes the current state of traffic. Why? Because we have not enough data for that mapping.

So we started to look at the problem as a standard machine learning problem. But nothing works well enough when we have lots of missing data.

But we were eager to find a solution to our problem. So we started to research how to use graph structure as a feature of a predictive model. At the starting point, We used some bayesian methods as a baseline. Then we used deep learning seq2seq RNN models to solve the problem. This solution had a very good benefit in comparison with our older solution and that was we could predict traffic of some edges, even without any live data of that edge. Because traffic is like a current in our graph, And we want to know how this graph is evolving.

How we combined some machine learning methods for traffic prediction.

Written by Matt