How I Built a Robot That Solves the Rubik's Cube Faster Than Anyone Else in the World

Aleksandr Krotov
Yandex

Hey there! Today, I'd like to share my journey from a complete Rubik's Cube novice to… well, still a novice, but with a robot that can solve it faster than anyone else.

First, a few words about my background. I have extensive experience in software engineering and currently work on runtime infrastructure for large language models at Yandex. But robotics? Not so much: the most I'd done was play with LEGO MINDSTORMS.

It all changed when I saw a video of an MIT robot solving a Rubik's Cube in 0.38 seconds. I re-watched the footage in slow motion and realized there was room for improvement. "Wow, I could set a new record here," I thought. As luck would have it, I was working with a team of robotics experts who had all the gear I needed and were eager to support me in this challenge.

Read on to learn how I transformed a raw concept into a new world record despite lacking the necessary expertise and making mistakes at every turn. This journey illustrates the saying, "Where there's a will, there's a way."

Planning

Solving a Rubik's Cube involves three simple steps:

  1. Capture the cube's state: Using a dual-camera setup where each camera sees three faces of the cube, I could capture the complete state in a single shot.
  2. Find the solution: I used the popular two-phase Kociemba algorithm. It's fast and good enough for my needs.
  3. Execute the solution: Probably the most difficult stage. This is the part where I make robotics happen, and it’s the primary focus of my story today.
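
To give a feel for what the solver consumes: popular two-phase (Kociemba) implementations take the cube state as a 54-character facelet string, faces in U, R, F, D, L, B order. Here's a minimal Rust sketch of that representation; the helper is illustrative, not the project's actual code:

```rust
/// Build the 54-character facelet string that popular two-phase (Kociemba)
/// solver implementations accept: faces in U, R, F, D, L, B order, each face
/// read left to right, top to bottom. Illustrative helper, not project code.
fn facelet_string(faces: &[[char; 9]; 6]) -> String {
    faces.iter().flat_map(|face| face.iter()).collect()
}

fn main() {
    // A solved cube: each face filled with its own letter.
    let solved = [['U'; 9], ['R'; 9], ['F'; 9], ['D'; 9], ['L'; 9], ['B'; 9]];
    assert_eq!(
        facelet_string(&solved),
        "UUUUUUUUURRRRRRRRRFFFFFFFFFDDDDDDDDDLLLLLLLLLBBBBBBBBB"
    );
}
```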

I decided to start with the hardware component. After all, what's the point of all the fancy algorithms if I don't have a physical platform to manipulate the cube? Around the same time, I tried rewriting the original solver from Python in compiled languages like C++ and Rust for performance reasons, before eventually finding a good out-of-the-box solution. As for the computer vision part, I soon realized that manually adjusting color parameters wasn't a scalable approach, so I decided to put that task off until later, once it dawned on me that I could first build the robot and then use it to collect a dataset for training a more robust CV model.

Finding the Right Motor

The task was straightforward: rotate one face of a Rubik's Cube by 90 degrees. Ideally, rotate all six faces (five would be enough, though that would create a slightly longer solution sequence).

I experimented with various motors, including stepper motors, Chinese GYEMS servos, and two other more promising options, which I'll explore in the following sections.

Option One: Maxon Motor

Here's a breakdown of the components I used: a driver, a motor, an encoder, and a gearbox.

These motors were great to work with but came with a built-in gearhead (a 299:14 reduction) that limited their output speed to 8040 / (299/14) / 60 ≈ 6.27 rev/s.

I needed something much faster: about 15 ms per quarter turn, which works out to roughly 16.7 rev/s, almost three times the gearhead's output, and that's without accounting for acceleration. I think these motors could handle the torque, but I can't provide the exact numbers.
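
Here's the back-of-the-envelope math behind that requirement, using the numbers above:

```rust
// Quick sanity check of the speed budget, using the numbers from the text.
fn main() {
    let motor_rpm = 8040.0; // motor speed before the gearhead, rpm
    let reduction = 299.0 / 14.0; // built-in gearhead ratio, ~21.4:1
    let output_rps = motor_rpm / reduction / 60.0;
    println!("gearhead output: {output_rps:.2} rev/s"); // ~6.27

    // Target: a quarter turn (0.25 rev) in 15 ms, ignoring acceleration.
    let target_rps = 0.25 / 0.015;
    let ratio = target_rps / output_rps;
    println!("target: {target_rps:.2} rev/s ({ratio:.1}x the gearhead)"); // ~16.7, ~2.7x
}
```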

Since the stock gearbox was gearing the motor down, I figured I could design my own gearing to speed things back up. I went with that idea and spent some time tinkering with different gear ratios.

After experimenting for a while (I can't recall the exact ratios, but you can check the tooth counts in the photos below), I stumbled upon the concept of planetary gears, which seemed like the perfect solution, and got to work. Conveniently, my roommate had two 3D printers, an FDM and an SLA, so I used those.

I designed the gearbox in OpenSCAD and printed it:

But there was a problem — the motor kept slipping inside the aluminum part that transferred the rotation to the “planets” via another 3D-printed part:

To fix this, I used a threadlocker to glue the motor shaft directly onto the 3D-printed part. It worked, though I’d have to break the part whenever I wanted to remove the motor. Still, that was good enough for me — I now had a working prototype:

It was at this moment I realized my master plan was finally starting to take shape. About time! I'd begun to doubt whether I could actually build this thing. However, the custom gearbox seemed to slow things down rather than speed them up, so that was a bit of a flop.

Option Two: A Servo Drive From Improvised Materials

I was still tinkering with the gearbox when I got my hands on an ODrive v3.6. It really helped me understand how these drivers work and why field-oriented control (FOC) and brushless motors (BLDC/PMSM) are so popular in high-performance applications.

To test my setup, I used these parts:

  1. 1x ODrive v3.6: A two-channel driver for controlling the motors.
  2. 2x AS5048 encoders: These provide feedback to the driver, ensuring precise current delivery to the motor windings.
  3. 2x T-MOTOR U8 Lite KV85 motors: Brushless DC motors from China, commonly used in large drones.
  4. 2x magnets: Attached to the motor rotors; the encoders need them to function.

I assembled the components with Dupont connectors and connected the motors to the cube with SLA-printed shafts (designed in Fusion 360 this time). Once powered up, the thing moved incredibly fast. No wonder they use those motors for drones:

The test sequence I went with was R F' R' F R F' R' F. The speed was much faster than in the previous iteration, and this setup stayed with me almost until the end of the story.

Solving the Cube

I tripled the scale and went with solder instead of Dupont connectors. Check out the results:

Fine-Tuning the Setup

Building a setup with six motors, I realized those Dupont wires just didn't cut it for a serious project, and soldering everything by hand wasn't a nice solution either, so I decided to design a printed circuit board. It would neatly connect the encoders, a camera flash (my computer vision experiments had shown that steady lighting was very much needed), and even a CAN bus to the controller (I opted for an ESP32 microcontroller to minimize the delay). The result was an expansion board that mounted neatly on the ODrive:

Boy, was I wrong to think this board would do just fine! The flash was controlled by a MOSFET placed between the LEDs and the voltage regulator, which gave me PWM dimming, and to top it off, I used an optocoupler for isolation, keeping the control electronics safe from the 15-volt supply. Sounds perfect, right?

I connected the encoders using RJ45 connectors and twisted-pair cable. I figured, "It's just a wire; what could possibly go wrong?" So I sent SPI signals down these long cables without a second thought about which signal went down which pair; I may well have been sending two unrelated signals through the same twisted pair. And somehow, it worked! Looking back, I'm amazed it functioned at all, given the crosstalk between the wires and the fact that it only worked when the cables were in a specific position.

A Little Bit of Computer Vision

Now that we’ve got a robot, cameras, and a flash, it’s time to make things more autonomous. Imagine having to tell the robot what the cube looks like every single time — that would take forever.

So I let the robot run for a few hours and collected a dataset of pictures using PlayStation Eye cameras and the flash:

I didn't want to set HSV thresholds manually. I'd have to do that for each element because, as it turned out, identical pixel values can represent different colors depending on their position in the image: those cameras can't capture colors accurately, especially when the lighting is uneven.

No worries! With N images in our dataset and the color at each position known, we can easily get masks for each element using simple Boolean operations with threshold values. Averaging the colors within these masks gives well-defined clusters, and classifying a new sticker is just a matter of finding its nearest cluster. When you hear the words "machine learning," you'd typically think of fancier techniques, but that's exactly what's going on here.
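
Here's a minimal Rust sketch of that classifier, assuming each facelet's pixels have already been averaged into a single RGB value; the sample numbers are made up:

```rust
/// One cluster per sticker color at a given facelet position: the mean RGB
/// of that color's samples across the calibration dataset.
struct Cluster {
    label: char,    // 'U', 'R', 'F', 'D', 'L', or 'B'
    mean: [f32; 3], // average RGB of the training samples
}

/// Mean color of a set of labeled samples.
fn mean_color(samples: &[[f32; 3]]) -> [f32; 3] {
    let mut sum = [0.0f32; 3];
    for s in samples {
        for c in 0..3 {
            sum[c] += s[c];
        }
    }
    sum.map(|v| v / samples.len() as f32)
}

/// Squared distance between two RGB values.
fn dist2(a: [f32; 3], b: [f32; 3]) -> f32 {
    (0..3).map(|i| (a[i] - b[i]).powi(2)).sum()
}

/// Classify a new facelet by its nearest cluster mean.
fn classify(color: [f32; 3], clusters: &[Cluster]) -> char {
    clusters
        .iter()
        .min_by(|a, b| {
            dist2(color, a.mean)
                .partial_cmp(&dist2(color, b.mean))
                .unwrap()
        })
        .unwrap()
        .label
}

fn main() {
    // Two toy clusters; the real ones are averaged over ~200 images.
    let clusters = [
        Cluster { label: 'U', mean: mean_color(&[[200.0, 200.0, 190.0], [210.0, 205.0, 200.0]]) },
        Cluster { label: 'R', mean: mean_color(&[[180.0, 40.0, 30.0], [170.0, 50.0, 35.0]]) },
    ];
    assert_eq!(classify([195.0, 198.0, 192.0], &clusters), 'U');
}
```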

That wraps up the object recognition part. A quick note: I optimized the image processing by switching from Python to Rust, which gave it a 100x speed boost and reduced processing time to 0.5 milliseconds (there’s still a chance I didn’t make the most out of both languages). I also wrote a custom camera driver in Rust to eliminate unnecessary delays and get image data as quickly as possible.

This algorithm is remarkably simple and can be calibrated in a new environment within five minutes (by collecting a dataset of 200 images), recognizing the cube with 100% accuracy under stable lighting conditions. I later replaced the LEDs and found that it worked well, even under worse conditions with heavy shadows and glare.

Back to the Drawing Board

At that point, I wasn't happy with how sensitive the encoder wires were to their position, and the acrylic panel was starting to crack. Another limitation was the ESP32's lack of native USB: it had to rely on a slower UART bridge.

I migrated the code from ESP32 to Teensy 4.0 — once again using Rust because I had already converted to crab by that time, and besides, the project demanded blazingly fast performance.

Let’s start with the encoders. This time, I knew better than to send different signals over a single twisted pair. I was familiar with differential signaling and knew that adding a series resistor at the signal source could reduce noise. I decided against differential signaling to save space on the board but made other drastic changes.

Here’s what I did:

  • Instead of using off-the-shelf Chinese encoder boards, I designed my own to ensure proper mounting and have the freedom to choose any connector I wanted.
  • I simplified the ODrive expansion board by removing unnecessary electronics (keeping the old board as a backup).
  • I replaced the Ethernet cable with a full-featured USB-C cable with enough wires to support high-speed data transmission.

I went with USB-C because it's just great. There are enough twisted pairs to pair each signal with ground (or even with an opposite-phase signal if I wanted to get fancy), and Type-C cables presumably have decent shielding to reduce noise.

USB-C connectors were a pain. Since I needed almost all the wires in the cable, I also needed ports with the full set of USB-C pins. My soldering wasn't all that fantastic, and each connector took forever; plus, I messed up a couple of times, leaving solder bridges under the port shell. I finally gave up and took the rest to a phone repair shop for soldering.

The wiring proved to be even more challenging. I needed a one-meter cable with all the contacts wired through, and it had to be passive; that was tough to find. Most long cables on the market carry only one twisted pair plus power. If you want anything else, you have to fork out and hope you don't end up with an active cable that would mangle the signal (I wasn't using differential signaling, and the voltage levels were different, too). Luckily, after some searching on a marketplace, I finally found cables that worked.

The robot’s frame was upgraded to a steel structure. It turned out that steel was even cheaper to cut than acrylic (but maybe I was just lucky again), giving the robot a much more robust appearance:

Now, there was a handle, making the machine perfectly portable so I could carry it around like a boss. I planned to make a video of me carrying it and waving it around while it solved the Rubik’s Cube, but I never got to it.

After tweaking the motors, I managed to cut the robot's solving time to below 300 ms, making it the fastest in the world (according to the standing record, at least).

Each frame represents one millisecond. Recorded on Sony RX100 V.

After all this time, the robot had already started to rust, which is why we named it RustyCuber. I used regular steel, so it was bound to happen.

What’s more, as I was tweaking the motors and putting a lot of strain on the parts, one of the SLA shafts shattered.

Software Component

But enough about the hardware — let’s look under the hood! Though I didn’t do anything revolutionary on the software side, I do want to give it a brief mention just because it brings the whole project together.

Early prototypes were developed using Python for the host and C++ for the ESP32 embedded system, leveraging ESP-IDF and FreeRTOS. A complete rewrite in Rust ensued, except for one Python notebook where I played around with the cube recognition algorithm.

The embedded system was implemented on Teensy 4.0, running a lightweight async framework called Embassy. Communication with the host was via the controller’s native USB 2.0, which was much faster and more reliable than using a UART-USB converter. For the protocol, I used a simple RPC over postcard, a neat binary format that is fast and doesn’t waste space. Previously, I used serde_json, but it was too bloated for an embedded system — it took up nearly half my binary, and memory is tight on microcontrollers.
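
To give a flavor of it, here's a minimal host-side sketch using postcard with serde; the message types are illustrative stand-ins, not my actual protocol (Cargo dependencies assumed: serde with the derive feature, postcard with the alloc feature):

```rust
use serde::{Deserialize, Serialize};

// Illustrative message types, not the robot's actual protocol.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
enum Request {
    Ping,
    SetFlash { on: bool },
}

fn main() -> Result<(), postcard::Error> {
    // Host side: serialize the request into a compact binary frame...
    let frame = postcard::to_allocvec(&Request::Ping)?;
    assert_eq!(frame.len(), 1); // a unit variant costs a single varint byte

    // ...and on the controller side, deserialize straight from the buffer.
    let request: Request = postcard::from_bytes(&frame)?;
    assert_eq!(request, Request::Ping);
    Ok(())
}
```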

Ultimately, a request-response cycle for an empty method call on the controller took 90 µs, including all the host-side overhead. A complete cube solve needs only two requests, so I decided that was sufficient optimization. I'm unsure how much more time could be shaved off, but to achieve this result, I had to disable Turbo Core because of its random delays of 0.1–0.5 ms; those were quite annoying.

I developed a number of host-side applications to support the system, including a camera visualization tool and a PID controller calibration utility (disregard the charts; this was before I fixed the issues with the current controller and the overshoot).

I made a special mode for expos: the robot continuously scrambles the puzzle, and then when you press a button, it scans the image, solves the cube, and shows you the time. It was great for putting on a show. However, I did have to cheat a bit: the lighting at expos isn’t always great (for starters, it changes throughout the day), resulting in occasionally inaccurate color recognition. To mitigate this, I simply taught the robot to maintain a record of the cube’s current state.
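
The tracking itself is cheap once every move is expressed as a fixed permutation of the 54 facelets. A sketch of the idea (the per-move permutation tables are omitted; they're just constant data):

```rust
/// The cube as 54 sticker colors. Every move the robot executes is a fixed
/// permutation of these indices, so keeping the state current means applying
/// one small permutation per move instead of rescanning with the cameras.
type State = [u8; 54];

/// Apply a facelet permutation: the sticker at perm[i] moves to slot i.
fn apply(state: &State, perm: &[usize; 54]) -> State {
    let mut next = [0u8; 54];
    for (i, &src) in perm.iter().enumerate() {
        next[i] = state[src];
    }
    next
}

/// Fold an executed move sequence into the tracked state.
fn track(mut state: State, moves: &[&[usize; 54]]) -> State {
    for perm in moves {
        state = apply(&state, perm);
    }
    state
}

fn main() {
    // Solved cube: face index as the color of each of its nine stickers.
    let solved: State = std::array::from_fn(|i| (i / 9) as u8);
    // The identity permutation stands in for a real move table here.
    let identity: [usize; 54] = std::array::from_fn(|i| i);
    assert_eq!(track(solved, &[&identity]), solved);
}
```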

Open Sauce 2024

Since my robot was the fastest in the world and this awesome expo was accepting submissions, I thought, why not apply? As an exhibitor, I got two free tickets, but I ended up using just one. Still, with such a cool event coming up, I wanted to give my robot a makeover.

I ordered a new galvanized steel frame to prevent rust and had aluminum shafts custom made. I also redesigned the Teensy 4.0 expansion board to have it power the ODrive coolers and installed special LED drivers for more precise, current-based control instead of voltage-based (they also had built-in PWM dimming that worked better than the previous setup):

I arrived in California, checked into a hotel, and… barely ever left it, preoccupied with soldering, coding, and testing. So much for a vacation, eh?

A few days before the expo, I was tuning the motors (while I was traveling to the US, Mitsubishi Electric had challenged my claim to the world's fastest robot by setting a new record, so I had some catching up to do) when I suddenly discovered that one of them was malfunctioning. Running short on time, I had to make do with a broken motor. Luckily, it turned out five working motors were enough, and the configuration was still fast, so no one suspected a thing. Well, almost no one: one sharp-eyed kid noticed that one side of the cube wasn't moving, so kudos to him for being observant.

At Open Sauce, I met Oskar from ODrive Robotics. He offered a collaboration: they would provide me with their latest drivers and offer internal tools and expertise for tuning them, which would further enhance my robot’s speed. There seemed to be no obligations on my part other than registering a new record, which I was planning to do anyway. I also met a guy with a cool slow-motion camera that produced better footage than mine:

Toward the end of the expo, the cube started to show signs of wear. The lubricant had degraded, causing the pieces to stick, and the synchronization logic between adjacent sides got slightly off. This resulted in some interesting footage demonstrating the limits of its reliability:

When I returned to the hotel, I figured out what was wrong with the motor: it was the poorly positioned encoder magnet. I’d just slapped these magnets on without thinking, and it worked fine until it didn’t. It turns out that those magnets have to be in the exact right spot. Then the sensors started acting up, too — they were really sensitive to how the wires were routed. Most likely, the cheap wires I used had deteriorated and were messing things up.

I also found out why they give you two tickets: sitting there alone for two days showcasing my project with no breaks for food or drinks was pretty fun in its own weird way, but it left me no time to take a stroll around the venue myself. I only dared to leave my stand once, and that was after I was told that CubeStormer 3, one of the previous record holders, was there too, on the opposite side of the expo. Lucky for me, one of its creators was also attending, so I asked him about the process of registering a record. He shared his experience and told me I was the first person in the entire venue to ask these questions.

World Record

Guinness World Records has several requirements for the evidence provided:

  • The Rubik’s Cube and the scramble must comply with the World Cube Association rules
  • Cameras cannot track more than one side of the cube before the timer starts
  • The time includes everything from recognizing the scramble to fully solving the cube
  • Two independent witnesses and two experienced timekeepers must be present

I spent the next few weeks in the hotel, tuning, coding, and designing. I had to upgrade the drivers from ODrive v3.6 to ODrive Pro and replace the custom encoders with AMT212B encoders, which connect to the ODrive via RS485 and provide a proper differential signal. These encoders mount directly onto the shaft, so I had to assemble a new shaft from available materials.

I discovered that the amount of tension on the cube was critical. I knew it mattered before, but now I knew how much: here’s what happens when it’s tight but not tight enough:

To shave off as many milliseconds as possible, I optimized across the board: upgraded the host computer for faster calculations, tweaked the communication protocol with Teensy, overclocked the processor to reduce USB latency (which fluctuated around 0.5 ms with AMD Core Performance Boost on, but was consistently below 0.1 ms when off), fine-tuned angle thresholds, and optimized CAN bus performance.

After tuning, the robot achieved a solving time of approximately 160 ms, with an additional 20 ms for computer vision and running the solver, resulting in a total record time of 180 ms. Even slowed down 40 times, the process looks remarkably swift.

I could've tweaked it a bit more to make things faster, but with the July 5 deadline, I was running out of time, so I had to stop there. One thing I could have done was make the solver aware of the robot's kinematic constraints. For example, the algorithm could account for a 180° rotation taking approximately 1.5 times longer than a 90° rotation, or occasionally rotate a face by −180° instead of 180° for better corner cutting; see the sketch below.
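
Here's what that scoring could look like: estimate each candidate solution's execution time instead of counting moves, then keep the cheapest candidate. The 1.5x half-turn factor is from above; everything else is illustrative:

```rust
/// One move: the face to turn and the signed angle in degrees.
type Move = (char, i32);

/// Estimated execution cost in quarter-turn units: on this robot, a half
/// turn takes roughly 1.5x as long as a quarter turn.
fn exec_cost(solution: &[Move]) -> f32 {
    solution
        .iter()
        .map(|&(_face, angle)| if angle.abs() == 180 { 1.5 } else { 1.0 })
        .sum()
}

fn main() {
    // Equal move counts, but the second candidate avoids a half turn,
    // so the robot would execute it faster.
    let a = [('R', 90), ('F', 180), ('U', -90)];
    let b = [('R', 90), ('F', -90), ('U', -90)];
    assert!(exec_cost(&b) < exec_cost(&a));
}
```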

The record requirements were strict: the robot's cameras had to stay blind until the flash turned on, and the flash turned off as soon as the cube was solved. In effect, the measurement is simply how long the lights stay on.

We gathered at Noisebridge, set everything up, recruited the necessary team members (including independent witnesses conveniently found on-site), calibrated the cameras, and ran the record twice. To ensure reliability, we operated the robot in a slightly slower mode:

There were actually two attempts, both captured in wide shot:

Sadly, I forgot to turn on slow-mo on the first attempt, so we had to take another one. For a moment, I really thought we’d managed to get below the 0.2 mark there.

The evidence was then submitted to Guinness World Records. Whether I’ll make it into the book remains to be seen, but at least I have official verification of my robot’s speed.

Once I caught my breath from all the engineering, I realized a couple of things:

  • I had run the record attempt with a slower code setup and could have easily shaved off 1–2 ms.
  • I had been dealt one of the worst possible cube scrambles. Afterwards, using the robot’s statistics, I calculated the distribution of solve times for the current configuration and realized that I was pretty unlucky:

There are also a few reasons to believe the time could come down further:

  • The robot is able to solve the cube even faster, although I don’t have enough data to generate relevant statistics.
  • I have some new ideas about how to speed up the process significantly, and I’ve found faster cameras that I think will save me a total of 5–10 ms, but I haven’t tested the setup yet.
  • I’ve made some tweaks to the configuration for getting the solution, and the results are as follows:

I believe it’s possible to shave the time down to 0.16 or even 0.15 seconds with the same robot.

So maybe one day I’ll get my hands on it and set a new record time. But right now, that’s just me theorizing.
