Taking on border jam with YOLOv3 — part 1

Vanja Tesin
Sep 11, 2018

Returning from vacation through a jam-packed border crossing, while two others have barely any cars going through, can be good motivation for a pet project. You get 3–4 hours to think about why you chose this one and not the other two.

Border camera feed with detections

Maybe there are systems like this out there, but this seems like a nice, well-rounded project for building a lot of interesting things from the ground up.
The goal of this endeavor is to analyze any camera feed with cars, track them, and calculate their rate of movement to infer how quickly the border crossing handles traffic and how much traffic there is. This should be exposed as a service for any and all interested parties.

In this installment, I’ll focus on the proof of concept built to get images from the stream, feed them to the detector, and obtain the predictions and tracking IDs. Where possible, we should use open-source solutions to speed up the whole process. Oh, and of course, C#.

If you’re interested in the details of how YOLO works, there are very good articles out there.

StreamCapture
To capture the video stream from the camera, I used Vlc.DotNet.Core 3.0.0-develop296, as the older version seems to be missing some of the events needed to hook into individual frames. Sadly, this didn’t help much, as frame capture seems to work for some streams and not for others. I tried M3U8 feeds from official sites, and these worked fine, but for the life of me I could not get a working image capture from the border crossing feed.
We need each frame to feed into the detector, so I took the long road: capture a couple of seconds of the stream as MPEG, and then use another component to extract the video frames.

Enter Thumbnailer
Getting the thumbnails from the MPEG video stream was a breeze with MediaToolkit 1.1.0.1: set up the engine with your MPEG stream, get the metadata, and call GetThumbnail for each second of the stream.

We seem to be good to go. Now for the fun part: the detector.

Detector
Since I’m using C#, I needed a wrapper and, possibly (I know, .NET Core), a Windows build of Darknet with YOLOv3. Luckily, there is a fork of the official repository on GitHub, thanks to AlexeyAB. The readme file contains a very detailed explanation of YOLOv3, build instructions for both Linux and Windows, and instructions for building YOLO as a DLL, which is what I needed.

The example on the site is C++ only, but there is a C# wrapper in the darknet\build\darknet folder. The wrapper supports initialization and detection using YOLO, but not tracking (as of now).

Adding tracking is easy: follow the instructions for building yolo_cpp_dll.sln and change the following files:

  • src\yolo_v2_class.hpp
  • src\yolo_v2_class.cpp

Change the header to also expose:

extern "C" YOLODLL_API int track_boxes(bbox_t_container &container, bbox_t_container &newContainer);

And in the .cpp file, add the following implementation:

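int track_boxes(bbox_t_container &container, bbox_t_container &newContainer) {
#ifdef OPENCV
    // Copy every slot of the fixed-size container, unused ones included.
    std::vector<bbox_t> currBox;
    currBox.reserve(C_SHARP_MAX_OBJECTS);
    for (int i = 0; i < C_SHARP_MAX_OBJECTS; ++i)
        currBox.push_back(container.candidates[i]);

    // Ask the detector to assign tracking IDs to the boxes.
    std::vector<bbox_t> newBox = detector->tracking_id(currBox);

    // Copy the tracked boxes back without overrunning the output array.
    for (size_t i = 0; i < newBox.size() && i < C_SHARP_MAX_OBJECTS; ++i)
        newContainer.candidates[i] = newBox[i];

    return static_cast<int>(newBox.size());
#else
    return -1; // tracking is unavailable when OPENCV is not defined
#endif // OPENCV
}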
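
To make it a bit clearer what "maintaining tracking IDs across frames" means, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `Box` and `BoxContainer` are cut-down stand-ins for `bbox_t` and `bbox_t_container`, `SimpleTracker` is a hypothetical greedy nearest-centre matcher and not the code behind `Detector::tracking_id`, and `track_boxes_n` is a hypothetical variant of `track_boxes` that takes the number of filled slots so the unused ones are not copied.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mirrors C_SHARP_MAX_OBJECTS; the value is an assumption for this sketch.
constexpr std::size_t kMaxObjects = 1000;

// Cut-down stand-in for bbox_t: only the fields this sketch needs.
struct Box {
    unsigned x, y, w, h;   // top-left corner and size, in pixels
    unsigned track_id;     // 0 means "no ID assigned yet"
};

// Cut-down stand-in for bbox_t_container.
struct BoxContainer {
    Box candidates[kMaxObjects];
};

// Hypothetical tracker: each box in the current frame takes the ID of the
// closest unclaimed box from the previous frame, if it is within max_dist
// pixels; otherwise it gets a fresh ID. Greedy, so not globally optimal.
class SimpleTracker {
public:
    explicit SimpleTracker(float max_dist) : max_dist_(max_dist) {}

    std::vector<Box> update(std::vector<Box> current) {
        std::vector<bool> taken(previous_.size(), false);
        for (Box& cur : current) {
            bool found = false;
            std::size_t best = 0;
            float best_dist = max_dist_;
            for (std::size_t j = 0; j < previous_.size(); ++j) {
                if (taken[j]) continue;
                const float d = centre_distance(cur, previous_[j]);
                if (d <= best_dist) { best_dist = d; best = j; found = true; }
            }
            if (found) {
                taken[best] = true;
                cur.track_id = previous_[best].track_id;
            } else {
                cur.track_id = next_id_++;
            }
        }
        previous_ = current;  // remember this frame for the next call
        return current;
    }

private:
    static float centre_distance(const Box& a, const Box& b) {
        const float dx = (a.x + a.w / 2.0f) - (b.x + b.w / 2.0f);
        const float dy = (a.y + a.h / 2.0f) - (b.y + b.h / 2.0f);
        return std::hypot(dx, dy);
    }

    float max_dist_;
    unsigned next_id_ = 1;
    std::vector<Box> previous_;
};

// Hypothetical variant of track_boxes: copies only the first `count` slots.
int track_boxes_n(SimpleTracker& tracker, const BoxContainer& in,
                  std::size_t count, BoxContainer& out) {
    assert(count <= kMaxObjects);
    std::vector<Box> current(in.candidates, in.candidates + count);
    const std::vector<Box> tracked = tracker.update(current);
    for (std::size_t i = 0; i < tracked.size(); ++i)
        out.candidates[i] = tracked[i];
    return static_cast<int>(tracked.size());
}
```

Calling `track_boxes_n` once per frame with the same `SimpleTracker` instance keeps an ID attached to a car as long as its box centre moves less than `max_dist` pixels between frames. Passing the count would also mean changing the C# signature, which is why the version above sticks to the two-container form.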

The last piece of the puzzle is in the C# wrapper:

// extern "C" YOLODLL_API int track_boxes(bbox_t_container &container, bbox_t_container &newContainer);
[DllImport(YoloLibraryName, EntryPoint = "track_boxes")]
private static extern int TrackId(ref BboxContainer container, ref BboxContainer newContainer);

And the usage of the above DllImport:

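public bbox_t[] Track(bbox_t[] candidates)
{
    var container = new BboxContainer() { candidates = candidates };
    var newContainer = new BboxContainer();

    // track_boxes returns -1 when the DLL was built without OpenCV.
    var count = TrackId(ref container, ref newContainer);
    if (count < 0)
        throw new System.NotSupportedException("track_boxes requires a YOLO DLL built with OpenCV.");

    return newContainer.candidates;
}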

Once I’ve tried all of this under a more “production” workload, I’ll submit a pull request to Alexey; maybe it will be useful to other C# folks out there.

With all of this in place, you should get tracking IDs that the YOLO DLL maintains correctly across frames. See the GitHub source of the PoC that goes with this post. Mind you, restoring some of the packages (VLC for starters) will take a lot of time. I’ve included the yolog_cpp.dll with tracking support in the repo just in case.

Tracking IDs across different frames
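
With IDs that survive from frame to frame, the "rate of movement" from the goal above can be approximated in a few lines. The following is only a rough sketch under my own assumptions: `TrackedBox` and `mean_displacement` are hypothetical names, boxes are reduced to their centre points, and the result is in pixels per frame interval, so turning it into real-world throughput would still need some calibration for the camera.

```cpp
#include <cmath>
#include <map>
#include <vector>

// Hypothetical reduced form of a tracked detection: ID plus box centre.
struct TrackedBox {
    unsigned track_id;
    float cx, cy;  // centre of the bounding box, in pixels
};

// Mean distance (in pixels) moved by the tracks that appear in both frames.
// Tracks present in only one of the frames are ignored.
// Returns 0.0 when the two frames share no track ID.
double mean_displacement(const std::vector<TrackedBox>& prev,
                         const std::vector<TrackedBox>& curr) {
    // Index the previous frame by track ID for quick lookup.
    std::map<unsigned, const TrackedBox*> by_id;
    for (const TrackedBox& b : prev)
        by_id[b.track_id] = &b;

    double total = 0.0;
    int matched = 0;
    for (const TrackedBox& b : curr) {
        const auto it = by_id.find(b.track_id);
        if (it == by_id.end()) continue;  // new track, nothing to compare
        total += std::hypot(b.cx - it->second->cx, b.cy - it->second->cy);
        ++matched;
    }
    return matched > 0 ? total / matched : 0.0;
}
```

Since the thumbnails here are taken once per second, the value would roughly be pixels per second. A value that stays near zero while the number of tracked cars stays high would be a hint that the queue is not moving.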

If you have any issues with the GitHub repository, please let me know in the comments below. Note that you need to download one of the YOLO weights files, and depending on the GPU RAM you have available, some will work better than others; check Alexey’s GitHub repository page for more information.

What’s next
In the next installment, I’d like to put this into a more production-like architecture, possibly with a containerized environment and a cloud-ready future.
