Watch Those Toes!
CAN vs SPI: Because reliable data transmission matters when you live with a robot (part 1)
Q: What do a modern car and an advanced robot have in common?
A: Both are sending a LOT of data between systems, “under the hood”.
Hard to believe that a little robot has all that much going on? Well, it may not be the crazy amount of data that cars have to send around these days, but the more computing power in a robot, the more complexity comes with it.
Want computer vision? You need that cell-phone level processor. Want tight motor control? You need a microcontroller capable of real-time operation. Want a half-dozen obstacle detection sensors wrapped around the robot, some edge sensors so the robot can’t drive off the stairs, an IMU to help the robot drive straight, and bump sensors so that it knows when it has hit something? You need a microcontroller with lots of pins and communication capability.
Especially given the real-time constraints of motor control, no single processor is going to meet all of these requirements. And, in addition to multiple processors, if you want a robot larger than a deck of cards, your sensors are spaced far enough apart that you need to consider how wires are run. Poor wiring layout can vastly increase signal corruption due to the length of the wires and the electrical noise caused by other wires nearby. This in turn can dictate the location of the processors, and it may mean they’re not necessarily located on the same board.
Add all these separate data input and processing system requirements together, and you can see that transmitting significant amounts of digital data over relatively large distances both quickly and reliably can be a real challenge. So… what’s a robot to do?
Well, a variety of communication protocols have been created over the years to standardize how these issues of data transmission are addressed. These protocols define architectures and rules for sending data between different parts of a system. And each protocol has strengths and weaknesses that make it more or less suitable for use in any given environment.
You may be surprised that something as basic as a communication protocol can matter this much. But for a robot that relies on sensor input to interact with humans, the wrong protocol can be a nightmare of dropped or corrupted data, with faces unrecognized and feet driven over.
To help ensure that our Misty robots got off on the right foot with their human interactions, we had to dive deep into the world of communication protocols. Read on to find out what we learned.
55 seconds (or less!) to understand communication protocols
Overall, the topic of communication protocols is huge. Protocols range from models for the physical network structures hidden deep inside devices to specifications like the Hypertext Transfer Protocol (yes, HTTP) that you use every day.
We’re really only going to focus on the lowest level here, because when you’re designing a new hardware product from the ground up, you need to think about your communication protocol choices from the bottom up, too. This lowest level (layer one in the OSI Model) is called the physical layer.
The physical layer is where choices are made and rules are set regarding data in the purest sense. At this level, you don’t care about what’s in the data being transmitted. You’re only concerned with the bits getting where they need to go, accurately and quickly. USB, Bluetooth, and IEEE 1394 (“FireWire”) are all examples of standards that include specifications for physical layer communications.
A specific physical layer communication protocol specifies whether data is sent
- serially or in parallel
- synchronously or asynchronously
- via a given network topology
- and a whole lot more
Why does all this matter? Because what seem like abstract and very basic choices about the form and transmission of data can have quite different effects when put to use on actual systems.
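To make one of those choices concrete, here is a minimal Python sketch of asynchronous serial framing. It assumes the common UART-style 8N1 format (one start bit, eight data bits sent least significant bit first, one stop bit). That format is an illustrative assumption, not something the protocols discussed here are required to use, and the function names are made up for this example.

```python
from typing import List


def frame_byte_8n1(value: int) -> List[int]:
    """Return the line levels for one byte: start bit, 8 data bits (LSB first), stop bit."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    data_bits = [(value >> i) & 1 for i in range(8)]
    # The start bit (0) tells the receiver where the byte begins, because
    # there is no shared clock line. The stop bit (1) returns the line to idle.
    return [0] + data_bits + [1]


def unframe_byte_8n1(bits: List[int]) -> int:
    """Recover the byte from a 10-bit frame, rejecting frames with bad start/stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


print(frame_byte_8n1(0x41))  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
```

Note that two of every ten bits on the wire carry no data. That is the price an asynchronous link pays for not having a clock line, and it is the kind of overhead to include when estimating usable throughput.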
Communication protocols in practice — what do you need?
It may sound obvious, but physical layer communication protocols vary in the extreme. Given that, the selection process is all-important. The following are the basic criteria that you’ll need to evaluate, because whichever protocol you choose must…
- Handle your data load. Start by figuring out how many messages a specific system must handle in a given timeframe. Then add enough overhead to that number to absorb any problems that might occur and to leave room for sensors or systems that might be added in the future. Select a protocol that is capable of sending that amount of data, and choose the lowest data rate that still allows that much data to be sent. It may be counter-intuitive, but it’s not always a good idea to maximize the data rate of the chosen protocol. The faster the message is pushed down the wires, the more the signal will be degraded. And because signal degradation and propagation delay grow with wire length, longer transmission distances require lower data rates.
- Deal with the physical limits of the device. Not all protocols are suitable for distances greater than a few centimeters, for one thing. And physical construction can limit the number of wires that can fit in your device, so the protocol had better not require a bundle of wires as thick as your thumb to carry the necessary load. Protocols also differ in the maximum number of nodes they can easily handle on one system.
- Be immune to the electrical noise that comes with working motors. For robots or other moving objects, signal interference from motors can be a big issue. Some protocols are more susceptible to data corruption from noise, so their wiring must be kept away from motors and moving parts, which is not always possible on a robot.
- Make firmware easy, or at least not too hard. Some protocols are by nature easier to work with than others. A simpler protocol is easier to get running, but it generally leaves more work to the firmware, specifically synchronization, error handling, and parsing. Some protocols are native on most systems, whereas some require additional hardware. Choosing a protocol that is native on the chip lets the hardware do more of the work, which can simplify some of the firmware.
- Not cost a (robotic) arm and a leg. Cost is a factor in any engineering project. More wires and more supporting peripherals all add to the cost. Natively supported protocols are less expensive, because they remove the need for additional peripherals.
- Reliably get messages to where they are going. This is the bottom line. Choosing a protocol that can handle the environment into which it is being placed is vital. The poorer the reliability of a protocol is on a given system, the less likely it is that you can make up the difference in software.
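The data load criterion above can be sketched as a small budget calculation in Python. Everything specific here is an assumption for illustration: the message streams, their sizes and rates, the 50% framing overhead, the 100% growth margin, and the list of selectable bit rates are hypothetical and are not Misty’s actual numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MessageStream:
    name: str
    payload_bytes: int  # data bytes per message
    rate_hz: float      # messages per second


def required_bps(streams: List[MessageStream],
                 framing_overhead: float = 0.5,
                 growth_margin: float = 1.0) -> float:
    """Payload bits per second, padded for protocol framing and for future growth."""
    payload_bps = sum(s.payload_bytes * 8 * s.rate_hz for s in streams)
    return payload_bps * (1 + framing_overhead) * (1 + growth_margin)


def lowest_sufficient_rate(required: float, supported_rates: List[int]) -> Optional[int]:
    """Pick the slowest supported bit rate that still carries the load, or None."""
    for rate in sorted(supported_rates):
        if rate >= required:
            return rate
    return None


# Hypothetical streams for a small robot.
streams = [
    MessageStream("imu", 12, 100.0),
    MessageStream("range_sensors", 16, 50.0),
    MessageStream("motor_commands", 8, 200.0),
    MessageStream("bump_and_edge", 2, 50.0),
]
need = required_bps(streams)
rate = lowest_sufficient_rate(need, [125_000, 250_000, 500_000, 1_000_000])
print(need, rate)  # 88800.0 125000
```

Choosing the slowest rate that still fits, and not the fastest one available, follows the advice above: lower rates leave more margin against signal degradation on longer wires.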
Comparing the candidates
Out of the dozens of possible physical layer communication protocols, only a few were likely candidates for building our robot. Note that we looked at both synchronous and asynchronous protocols. While data synchronization methodologies differ dramatically between the two types, in the end it’s still a matter of overall data rate and message reliability, and you can achieve good results for both with either protocol type.
LIN (Local Interconnect Network): This asynchronous serial protocol has a low data rate (up to about 20 kbps), but it is designed for the longer distances and electrically noisy environments of automobiles, which means it would be good in robots as well.
CAN (Controller Area Network): Even though this is an asynchronous protocol, it’s designed so that the receiver can re-sync to the incoming bits throughout the packet transmission, and thus data rates up to 1 Mbps (older standard, classical CAN) and 8 Mbps (newer standard, CAN FD) are possible. Like LIN, this serial protocol is designed for noisy automobile environments, so it’s also a good choice for robot communications.
RS-485: This asynchronous serial protocol is often considered less a protocol than a standard for signal levels in an electrical interface. RS-485 is only defined on the physical layer of the OSI model, so it’s up to the firmware engineer to determine the higher layer aspects of the protocol, which means more work. But, it has similar bit rates and hardware requirements to CAN and is also good for noisy environments. These characteristics make RS-485 good for uses ranging from industrial control systems to hand-held controllers for model railways to video surveillance systems and theatrical lighting networks. And maybe robots.
I2C (Inter-Integrated Circuit): This synchronous serial protocol includes a clock line and has a moderate data rate. It’s also designed to allow multiple nodes to communicate on the bus. But this protocol is much more suitable for chip-to-chip communication on the same board, and it’s not generally recommended for communication over long wire runs. It’s also fairly noise sensitive, so that’s a red flag.
SPI (Serial Peripheral Interface): This synchronous serial protocol has the highest data rate of the low-level protocols that we considered. It’s often used for communication between microcontrollers and off-chip peripherals and is frequently found in embedded systems.
Other synchronous protocols: There are also Ethernet and USB protocols that offer much higher data rates, but because they also operate well above the physical layer (even above the data link layer), they are much more complex and have much more overhead.
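One mechanism behind the CAN entry’s ability to re-sync mid-packet is bit stuffing: after five consecutive bits of the same level, the transmitter inserts one bit of the opposite level, so the receiver is never left long without a signal edge to align to. Here is a minimal Python sketch of that rule. It is a simplification that treats the input as a bare list of bits and ignores which fields of a real CAN frame are stuffed, and the function names are made up for this example.

```python
from typing import List


def stuff_bits(bits: List[int]) -> List[int]:
    """Insert a complementary bit after every run of five identical bits."""
    stuffed = []
    run_value, run_length = None, 0
    for bit in bits:
        stuffed.append(bit)
        if bit == run_value:
            run_length += 1
        else:
            run_value, run_length = bit, 1
        if run_length == 5:
            stuff = 1 - bit
            stuffed.append(stuff)
            # The stuff bit itself starts the next run.
            run_value, run_length = stuff, 1
    return stuffed


def unstuff_bits(stuffed: List[int]) -> List[int]:
    """Remove stuff bits; raise if six identical bits appear in a row."""
    bits = []
    run_value, run_length = None, 0
    expect_stuff = False
    for bit in stuffed:
        if expect_stuff:
            if bit == run_value:
                raise ValueError("stuff error: six identical bits in a row")
            run_value, run_length = bit, 1
            expect_stuff = False
            continue
        bits.append(bit)
        if bit == run_value:
            run_length += 1
        else:
            run_value, run_length = bit, 1
        if run_length == 5:
            expect_stuff = True
    return bits


print(stuff_bits([0, 0, 0, 0, 0, 1, 1, 1, 1]))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
```

In the printed example the first inserted bit (a 1) counts toward the following run of ones, which is why a second stuff bit appears at the end. The inserted bits are also overhead on the wire, so they belong in any data budget.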
Bottom line: what about Misty?
Our initial list of protocols above was still too lengthy for deep comparative performance and reliability testing, so we picked two top contenders, based on the following:
- Data Load/Rate. Working up a basic data budget, we determined that a protocol that could handle about 300 kbps or more was sufficient for Misty’s needs, which included a large overhead for user-added hardware on the backpack. The protocols that work for this are I2C, CAN, SPI, RS-485, Ethernet, and USB. This puts LIN out of the running.
- Physical and Noise Constraints. Data moves throughout the Misty robot on long wires (greater than 300 mm) in a noisy environment. This means I2C won’t work, and SPI is questionable. RS-485 and CAN are built for that type of environment.
- Realistic Firmware Requirements. The protocol chosen should be relatively light. The processors need to do many other things beyond communication and should spend as little time as possible communicating. Because they operate on upper layers in the OSI model, Ethernet and USB are overkill for this system and are removed from consideration. Reducing the amount of effort required by firmware is desirable, and the CAN data-link layer eases the amount of work the firmware needs to do, so that makes it one of the front-runners.
- Cost. RS-485 isn’t supported natively by any of the nodes in the robot, whereas CAN is supported by the sensor and motor control nodes. SPI is supported by all three nodes. Even though SPI isn’t meant for such long wires, it might still be worth considering simply because it saves hardware, and thus reduces cost. RS-485 is just going to be too costly, though, so it’s no longer a contender.
- Reliability. For this constraint, we will test the top two contenders in the robot while it runs a real-world skill that stresses the communication channel, to see which is more reliable in practice and not just in theory.
With LIN, Ethernet, USB, I2C, and RS-485 eliminated, that leaves SPI and CAN as the two protocols worthy of deeper investigation.
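The elimination above can be written down as a small filter, which makes it easy to re-run if a requirement changes. In this Python sketch the pass/fail entries restate the judgments made in this post. The criterion labels and the function name are made up for illustration, and a real comparison would use measured values, not simple pass/fail flags.

```python
from typing import Dict, List, Set, Tuple

# Criteria in the order they are applied in this post.
CRITERIA = ["data_rate", "wiring_and_noise", "firmware_weight", "cost"]

# For each candidate, the criteria it fails according to this post.
# An empty set means it was not ruled out (SPI was only "questionable"
# on wiring and noise, so it stays in).
FAILED_CRITERIA: Dict[str, Set[str]] = {
    "LIN": {"data_rate"},
    "I2C": {"wiring_and_noise"},
    "Ethernet": {"firmware_weight"},
    "USB": {"firmware_weight"},
    "RS-485": {"cost"},
    "SPI": set(),
    "CAN": set(),
}


def shortlist(failed: Dict[str, Set[str]],
              criteria: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Apply the criteria in order; return the survivors and why each loser was cut."""
    remaining = list(failed)
    eliminated_by: Dict[str, str] = {}
    for criterion in criteria:
        still_in = []
        for name in remaining:
            if criterion in failed[name]:
                eliminated_by[name] = criterion
            else:
                still_in.append(name)
        remaining = still_in
    return remaining, eliminated_by


survivors, reasons = shortlist(FAILED_CRITERIA, CRITERIA)
print(survivors)  # ['SPI', 'CAN']
```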
SPI is low cost and one of the easier protocols to write firmware for. That said, it’s not intended for long wires, and it requires more wires, which means tighter squeezes through certain points of the robot. But none of these problems are absolute show-stoppers and SPI definitely will get data around the robot.
CAN may require an additional handful of parts added to the electronics, but it’s meant for noisy environments, has few wires, can handle the data load, and is easy to work with in firmware. CAN also has the benefit of having proven itself reliable in industrial applications, which is great for the rest of us, now that this state-of-the-art protocol is becoming more practical for general use.
In Part 2, I’ll go into a bit more depth on CAN vs SPI, before describing the test procedure and results. Teaser: We’ve implemented and shipped the winning solution with Misty I, and we’re pretty happy with the results.