In early 2017 I stumbled across one of the documented Google Cloud and TensorFlow use cases, titled “How a Japanese cucumber farmer is using deep learning and TensorFlow”. It stuck in my mind and sparked my curiosity.
It was around December 2017 that I decided to do something about it and started thinking hard about how to build something similar myself. I brought it up in conversation with friends and got a lot of good guidance and advice. Two pieces of advice were instrumental in achieving the goal:
“…the software is easy, but getting to a reliable mechanical design will be the hard part…”
“…get a 3D printer, it will make your life easier…”
I started from zero, with no knowledge of any of the technologies I would use besides Python, and it took me roughly 200 hours spread across 6 months.
I defined my objective as: Build a machine that can reliably sort 10–20 types of Lego bricks without manual feeding, using an image-based neural network classification model.
I was confident I could take the Google Cloud use case and replicate it. As is true of any technology endeavor, along the way I identified a lean path to the objective and made some trade-offs to achieve a reasonable ‘time to market’.
5 Parts to This Blog
This is part of a 5-blog series covering the mechanical and software design of the Lego Sorter, as well as sharing the training set and some evaluation sets.
The Machine in Action
Overview of the Design
I will provide more details in the Mechanical and Software blogs, but at a high level, this is how I designed the separator:
I’m using a Motor and Servo HAT, as well as a custom board to control the IR Beam Sensors and backlight LEDs. I use GPIO and PWM signals from Python to control the movement of the entire machine, and image recognition with OpenCV to detect any shortcomings in the mechanical separation (e.g. two Lego pieces in a single image).
Training Set and Classifier Approach
I used a retrained Inception V3 model to classify the 11 brick classes. I ran the training with the GPU build of TensorFlow, which leverages my desktop’s CUDA-enabled NVIDIA GPU.
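For readers who have not retrained a model before, the idea can be sketched as follows: keep the Inception V3 feature extractor frozen and train only a new final layer with one output per brick class. The sketch below uses the Keras API that ships with TensorFlow, which is an assumption on my part for illustration and not necessarily the tooling used for this project. The function name `build_brick_classifier` and the optimizer and loss choices are hypothetical.

```python
import tensorflow as tf


def build_brick_classifier(num_classes=11, weights="imagenet"):
    """Inception V3 with a frozen base and a new softmax layer (illustrative sketch)."""
    # Load Inception V3 without its original 1000-class output layer.
    base = tf.keras.applications.InceptionV3(
        include_top=False,
        weights=weights,  # "imagenet" downloads pretrained weights; None skips the download
        input_shape=(299, 299, 3),
        pooling="avg",
    )
    base.trainable = False  # only the new final layer is trained

    inputs = tf.keras.Input(shape=(299, 299, 3))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Because only the small final layer is trained, retraining is far cheaper than training Inception V3 from scratch, which is what makes it practical on a single desktop GPU.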
Results for First Run
Disclaimer: Below are the results of the first run, and they are quite exceptional. I believe there will be significant variation across runs, and I expect the yield to fluctuate in the 75–85% range.
How does this first run compare to the Cucumber Farmer?
My initial run was highly accurate in terms of both mechanical separation and classification, but I did see the same drop mentioned in the article when going from the accuracy reported in training to the real-world implementation.
I came very close to replicating the case, with the key differences being:
Automatic Feeder and Separation: Having an automatic feeder and separation mechanism automated the capture of the training set, which provided a material time saving.
Training Set and Camera: My setup has a single camera and a training set 3 times smaller.
Overall, this is how the scorecard came out: