A MIDI-Like Protocol for Talking with an Embedded System

an extension to the Arduino Firmata protocol

To exchange data between an embedded system (e.g. Arduino, mbed, Raspberry Pi, or any bare-metal MCU setup) and a computer, the application layer usually ends up sending data in units of bytes, whether the underlying communication stack is USB, serial, SPI, BLE, or some other network protocol.

What is a useful data protocol at the application level?

We will show an extension to a MIDI-like protocol derived from Arduino Firmata.

The sample code uses Arduino on the embedded side and Python on the computer side. Its purpose is to illustrate the evolution of the protocol step by step. To follow along, you will need an Arduino board, the Arduino IDE, and Python 3 with pyserial installed.

The protocol itself is easy to implement, or to extend for your needs, in any other programming language.

1. A simple character-based (ASCII) approach

Suppose you want to send a command from the PC to an Arduino to turn its LED on or off. The simplest approach is to send strings delimited by the “new line” character from the PC. The Arduino reads each string up to the “new line” and then parses it.

Arduino side

const int ledPin = LED_BUILTIN; // the number of the LED pin
void setup() {
  pinMode(ledPin, OUTPUT);
  // initialize serial:
  Serial.begin(9600);
}
void loop() {
  // if there's any serial data available, read it:
  while (Serial.available() > 0) {
    // new line delimiter; readStringUntil() takes a char, so use '\n', not "\n"
    String st = Serial.readStringUntil('\n');
    st.trim(); // drop a trailing carriage return, if any
    // parse incoming command
    if (st == "on") {
      digitalWrite(ledPin, HIGH);
    } else if (st == "off") {
      digitalWrite(ledPin, LOW);
    }
    // echo back
    Serial.println("LED: " + st);
  }
}

PC side

You can open the “Serial Monitor” inside Arduino IDE to send characters “on” and “off” to see the LED respond accordingly.

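Be careful with the “newline” versus “carriage return” setting when passing strings back and forth: if the sender appends a carriage return as well as a newline, the received string carries a trailing "\r" and no longer equals "on" unless it is trimmed first.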
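The same line-ending pitfall applies on the PC side when reading the Arduino's reply, since Serial.println() ends each line with "\r\n". Here is a minimal sketch of one way to handle it; the helper name normalize is made up for this illustration.

```python
def normalize(line: bytes) -> str:
    """Decode one received line and drop any trailing CR/LF characters.

    Hypothetical helper for illustration, not part of pyserial.
    """
    return line.decode("ascii", errors="replace").rstrip("\r\n")


# The three common line endings all reduce to the same command string.
for raw in (b"on\n", b"on\r\n", b"on\r"):
    print(repr(raw), "->", repr(normalize(raw)))
```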

Here is Python code that does the same thing. Notice that you need to wait a bit after opening the serial connection so that the Arduino can “boot up” (get past the bootloader check).

import serial, time
ser = serial.Serial("COM33", 9600)  # adjust the port name for your system
time.sleep(2)
# after Arduino bootup
ser.write(b"on\n")
time.sleep(1)
print(ser.read(ser.in_waiting))
ser.close()

2. A MIDI-like protocol

Sometimes raw binary bytes, rather than characters, seem to be a better fit for the application.

The MIDI-like protocol sets the highest bit (MSB) of a byte to 1 to mark it as a “command” byte, distinguishing it from the “data” bytes (MSB = 0) that follow. You can think of the “command” byte as a “tag”, just like “<p>” is an HTML tag.

<p> data </p>

Here is a little state machine diagram of the protocol.

For example, the 3-byte sequence {0x81, 0x03, 0xFF} is a valid command. The leading byte has MSB = 1, the data byte is 3, and the last byte (also MSB = 1) ends the command. Notice that 128 byte values have MSB = 1, so after reserving END_CMD you can use any of the remaining 127 (= 128 - 1) “tag” bytes to define your own custom commands. The data bytes can be of arbitrary length (limited by the input buffer size), as long as they end with another “tag” byte (e.g. END_CMD).
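The framing rule can be sketched on the PC side as a small helper that builds one command and rejects bytes that would break the MSB convention. The function name build_frame is an assumption made for this example. The value 0xFF for END_CMD matches the sketches in this article.

```python
END_CMD = 0xFF  # terminating tag byte, as in the Arduino sketches


def build_frame(cmd: int, data: bytes = b"") -> bytes:
    """Build one command: tag byte, 7-bit data bytes, END_CMD (hypothetical helper)."""
    if not (0x80 <= cmd < END_CMD):
        raise ValueError("command byte must have MSB = 1 and differ from END_CMD")
    if any(b & 0x80 for b in data):
        raise ValueError("data bytes must be in the range 0..127")
    return bytes([cmd]) + bytes(data) + bytes([END_CMD])


frame = build_frame(0x81, bytes([0x03]))
print(frame.hex(" "))  # 81 03 ff
```

Validating on the sending side keeps an out-of-range data byte from being mistaken for a new command by the receiver.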

Arduino side

const int ledPin = LED_BUILTIN; // the number of the LED pin
// internal FSM
enum State {
  WAIT,
  DATA
} state;
unsigned char cmd_byte;
#define BUF_SIZE 200
unsigned char data_byte[BUF_SIZE];
int data_ptr = 0;
// define cmd const
#define END_CMD 0xFF
#define LED 0x81
void setup() {
  pinMode(ledPin, OUTPUT);
  // initialize serial:
  Serial.begin(9600);
  state = WAIT;
}
void loop() {
  // if there's any serial data available, read it:
  unsigned char inByte = 0;
  while (Serial.available() > 0) {
    inByte = Serial.read();
    process_input(inByte);
  }
}
void process_input(int input) {
  if (input != -1) {
    switch (state) {
      case WAIT:
        if (input & 0x80) { // MSB == 1
          cmd_byte = input; // store valid command
          state = DATA;
          data_ptr = 0; // prepare for data to follow
        }
        break;
      case DATA:
        if (input & 0x80) {
          // 1. execute
          exec_cmd(cmd_byte);
          // 2. new or end_of command
          if (input == END_CMD) {
            state = WAIT; // wait for next command
          } else {
            cmd_byte = input; // store the new command
            data_ptr = 0; // prepare for data to follow
          }
        } else { // data bytes
          data_byte[data_ptr] = input;
          data_ptr = (data_ptr + 1) % BUF_SIZE;
        }
        break;
    }
  }
}
void exec_cmd(unsigned char cmd) {
  switch (cmd) {
    case LED:
      if (data_byte[0] > 0) { // take one data byte
        digitalWrite(ledPin, HIGH);
      } else {
        digitalWrite(ledPin, LOW);
      }
      break;
  }
}

PC side

Because these bytes are not printable ASCII characters, you cannot simply “type” them into a terminal window to send forward commands.

Here is the Python code on the PC side to send forward commands.

import serial, time
ser = serial.Serial("COM33", 9600)
time.sleep(2)
# after Arduino bootup
ser.write(b"\x81\x01\xFF") # turn on LED
time.sleep(5)
ser.write(b"\x81\x00\xFF") # turn off LED
ser.close()

This protocol involves a tiny 2-state finite state machine (FSM). It certainly takes more code than the earlier character-based protocol, but one big advantage is that a forward command can now carry data bytes as its parameters.

For example, you can imagine bit-banging a GPIO pin with commands like this:

ser.write(b"\x81\x00\x01\x01\x00\x01\x00\x01\x81\x01\x00\x01\x00\xFF")
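To see how the receiver would split that byte string into commands, here is a simplified Python model of the WAIT/DATA state machine. It is a sketch for experimenting without hardware, not the firmware itself. It ignores the buffer size limit, and it ignores a stray END_CMD received while waiting.

```python
END_CMD = 0xFF


def parse(stream: bytes):
    """Return the (command, data) pairs a WAIT/DATA parser would execute."""
    commands = []
    cmd = None           # None stands for the WAIT state
    data = bytearray()
    for byte in stream:
        if byte & 0x80:                  # tag byte (MSB == 1)
            if cmd is not None:          # an open command is now complete
                commands.append((cmd, bytes(data)))
            cmd = None if byte == END_CMD else byte
            data = bytearray()
        elif cmd is not None:            # data byte inside a command
            data.append(byte)
        # data bytes received while waiting are dropped
    return commands


msg = b"\x81\x00\x01\x01\x00\x01\x00\x01\x81\x01\x00\x01\x00\xFF"
for cmd, data in parse(msg):
    print(hex(cmd), list(data))
# 0x81 [0, 1, 1, 0, 1, 0, 1]
# 0x81 [1, 0, 1, 0]
```

The second 0x81 both closes the first command and opens the next one, so a single END_CMD at the end is enough.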

The problem with this protocol is that only 7 bits are left to encode data. It works fine for ASCII characters, or for integers in the range 0 to 127. For 8-bit or 16-bit values such as ADC readings, one has to split each value across multiple 7-bit bytes.
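As an illustration of that splitting, here is a minimal sketch that packs an integer into 7-bit groups and reassembles it. The function names and the least-significant-group-first order are assumptions made for this example. Sender and receiver only need to agree on one order.

```python
def to_7bit(value: int, n_bytes: int) -> bytes:
    """Split a non-negative integer into n_bytes 7-bit groups, least significant first."""
    if value < 0 or value >= 1 << (7 * n_bytes):
        raise ValueError("value does not fit in the requested number of bytes")
    return bytes((value >> (7 * i)) & 0x7F for i in range(n_bytes))


def from_7bit(chunks: bytes) -> int:
    """Reassemble an integer from 7-bit groups, least significant first."""
    return sum((b & 0x7F) << (7 * i) for i, b in enumerate(chunks))


adc = 1023                       # e.g. a full-scale 10-bit ADC reading
packed = to_7bit(adc, 2)
print(packed.hex(" "))           # 7f 07
assert from_7bit(packed) == adc  # round trip recovers the value
```

A 16-bit value needs three such bytes, because two 7-bit groups only cover 14 bits.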

3. Extension to the previous protocol

To remove the 7-bit data limitation, we add a data “stream” state to the MIDI-like protocol.

The idea is to put the “tag” byte 0xFE around the 8-bit data. Of course, a data byte that happens to equal 0xFE then needs to be “escaped” with the 2-byte sequence {0xFD, 0xDE}. Finally, the “escape” byte 0xFD itself is represented by the 2-byte sequence {0xFD, 0xDD}. In both cases the receiver recovers the original byte by OR-ing the second byte with 0x20.
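The escaping rule can be sketched on the PC side as an encoder and a matching decoder. The function names are assumptions made for this example. The byte values follow the rule described above.

```python
TAG = 0xFE  # stream delimiter
ESC = 0xFD  # escape byte


def encode_stream(payload: bytes) -> bytes:
    """Wrap 8-bit data in TAG bytes, escaping any TAG or ESC inside it."""
    out = bytearray([TAG])
    for b in payload:
        if b in (TAG, ESC):
            out += bytes([ESC, b & 0xDF])  # 0xFE -> FD DE, 0xFD -> FD DD
        else:
            out.append(b)
    out.append(TAG)
    return bytes(out)


def decode_stream(frame: bytes) -> bytes:
    """Inverse of encode_stream for one complete TAG ... TAG frame."""
    if len(frame) < 2 or frame[0] != TAG or frame[-1] != TAG:
        raise ValueError("frame must start and end with TAG")
    out = bytearray()
    escaped = False
    for b in frame[1:-1]:
        if escaped:
            out.append(b | 0x20)           # restore the escaped byte
            escaped = False
        elif b == ESC:
            escaped = True
        else:
            out.append(b)
    return bytes(out)


payload = bytes([0x10, 0x80, 0x9F, 0xAF, 0xFE])
frame = encode_stream(payload)
print(frame.hex(" "))  # fe 10 80 9f af fd de fe
assert decode_stream(frame) == payload
```

The encoded frame is the same byte sequence as the stream part of the PC-side message used in this section.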

Here is the state diagram again with the added “stream” state.

Arduino side

As a demonstration, we add a command that outputs PWM on one Arduino pin. Because a PWM value ranges from 0 to 255, the stream state is handy for receiving these values as full 8-bit bytes.

const int ledPin = LED_BUILTIN; // the number of the LED pin
// internal FSM
enum State {
  WAIT,
  DATA,
  STREAM,
  ESCAPE
} state;
unsigned char cmd_byte;
#define BUF_SIZE 200
unsigned char data_byte[BUF_SIZE];
int data_ptr = 0;
// define cmd const
#define END_CMD 0xFF
#define LED 0x81
#define PWM_OUT 0x82
#define TAG 0xFE
#define ESC 0xFD
void setup() {
  pinMode(ledPin, OUTPUT);
  // initialize serial:
  Serial.begin(9600);
  state = WAIT;
}
void loop() {
  // if there's any serial data available, read it:
  unsigned char inByte = 0;
  while (Serial.available() > 0) {
    inByte = Serial.read();
    process_input(inByte);
  }
}
void process_input(int input) {
  if (input != -1) {
    switch (state) {
      case WAIT:
        if (input & 0x80) { // MSB == 1
          if (input == TAG) { // start of an 8-bit data stream
            state = STREAM;
            data_ptr = 0;
          } else {
            cmd_byte = input; // store valid command
            state = DATA;
            data_ptr = 0; // prepare for data to follow
          }
        }
        break;
      case DATA:
        if (input & 0x80) {
          // 1. execute
          exec_cmd(cmd_byte);
          // 2. new or end_of command
          if (input == END_CMD) {
            state = WAIT; // wait for next command
          } else {
            cmd_byte = input; // store the new command
            data_ptr = 0; // prepare for data to follow
          }
        } else { // data bytes
          data_byte[data_ptr] = input;
          data_ptr = (data_ptr + 1) % BUF_SIZE;
        }
        break;
      case STREAM:
        if (input == TAG) { // closing tag ends the stream
          state = WAIT;
        } else if (input == ESC) { // next byte is an escaped value
          state = ESCAPE;
        } else { // plain 8-bit data byte
          data_byte[data_ptr] = input;
          data_ptr = (data_ptr + 1) % BUF_SIZE;
        }
        break;
      case ESCAPE:
        // restore the escaped byte: 0xDE -> 0xFE, 0xDD -> 0xFD
        data_byte[data_ptr] = (input | 0x20);
        data_ptr = (data_ptr + 1) % BUF_SIZE;
        state = STREAM;
        break;
    }
  }
}
void exec_cmd(unsigned char cmd) {
  switch (cmd) {
    case LED:
      if (data_byte[0] > 0) {
        digitalWrite(ledPin, HIGH);
      } else {
        digitalWrite(ledPin, LOW);
      }
      break;
    case PWM_OUT:
      // play back the first 5 bytes of the buffer, one per second
      for (int i = 0; i < 5; i++) {
        analogWrite(3, data_byte[i]); // PWM Pin = 3
        delay(1000);
        Serial.println(data_byte[i], HEX); // echo back
      }
      break;
  }
}

PC side

On the computer side, we first send a bunch of data bytes that determine PWM duty cycles, then we start PWM with the command byte.

import serial, time
ser = serial.Serial("COM33", 9600)
time.sleep(2)
# after Arduino bootup
ser.write(b"\xFE\x10\x80\x9F\xAF\xFD\xDE\xFE\x82\xFF")
time.sleep(7)
print(ser.read(ser.in_waiting))
ser.close()
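To check what the firmware would do with this message without any hardware attached, the four-state machine can be modelled in Python. This is a sketch for experimenting, written to follow the Arduino code above state by state. The function name run and the choice to record the first 5 buffer bytes at each execution are assumptions made for this example.

```python
END_CMD, TAG, ESC = 0xFF, 0xFE, 0xFD
BUF_SIZE = 200


def run(message: bytes):
    """Feed bytes through WAIT/DATA/STREAM/ESCAPE; return executed (cmd, first 5 bytes)."""
    state, cmd, ptr = "WAIT", 0, 0
    buf = bytearray(BUF_SIZE)      # shared buffer, never cleared (as in the sketch)
    executed = []

    def store(value):
        nonlocal ptr
        buf[ptr] = value
        ptr = (ptr + 1) % BUF_SIZE

    for b in message:
        if state == "WAIT":
            if b == TAG:
                state, ptr = "STREAM", 0
            elif b & 0x80:
                state, cmd, ptr = "DATA", b, 0
        elif state == "DATA":
            if b & 0x80:
                executed.append((cmd, bytes(buf[:5])))  # what exec_cmd would see
                if b == END_CMD:
                    state = "WAIT"
                else:
                    cmd, ptr = b, 0
            else:
                store(b)
        elif state == "STREAM":
            if b == TAG:
                state = "WAIT"
            elif b == ESC:
                state = "ESCAPE"
            else:
                store(b)
        else:                       # ESCAPE: restore the escaped byte
            store(b | 0x20)
            state = "STREAM"
    return executed


for cmd, data in run(b"\xFE\x10\x80\x9F\xAF\xFD\xDE\xFE\x82\xFF"):
    print(hex(cmd), data.hex(" "))  # 0x82 10 80 9f af fe
```

The model shows why the order matters: the stream fills the shared buffer first, and the PWM_OUT command that follows carries no data of its own, so it reads the bytes the stream left behind.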

You can use such PWM output, followed by a low-pass filter, as a simple DAC to generate a sinusoidal waveform.
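As a rough sketch of that idea, the duty-cycle bytes for one period of a sine wave could be computed on the PC like this. The function name is made up for this example, and 5 samples are used only because the demo firmware plays back 5 bytes. A usable waveform would need more samples and a much shorter delay per sample than the 1 second used above.

```python
import math


def sine_duty_cycles(n: int) -> bytes:
    """Return n PWM duty values (0..255) covering one period of a sine wave."""
    return bytes(
        round(127.5 + 127.5 * math.sin(2 * math.pi * i / n)) for i in range(n)
    )


table = sine_duty_cycles(5)
print(list(table))
```

Any value in the table equal to 0xFE or 0xFD would still need to be escaped before being placed in the stream.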

4. Conclusion

To send and receive data between your embedded system and the PC, it helps to have a versatile data communication protocol, because the underlying communication stack may have various idiosyncrasies, such as packet size limits, timeouts, and time delays.

The tag-based protocol introduced here can be applied in all of these situations. The Arduino code serves only as a demonstration; you can define your own protocol once you understand the key ideas here.

Some references

Firmata protocol

HDLC
