USB for Microcontrollers — Part 4: Handling Large Amounts of Data

Manuel Bl.
8 min read · Oct 1, 2020


To demonstrate how to handle large amounts of data, an image is sent from the computer to the device via USB and shown on a display.

This is part 4 of a 4-part series.

The display and the MCU are connected via SPI (and additional control lines):

Display Project

The challenges are:

  1. The display’s SPI interface is slower than the USB connection.
  2. The image does not fit into the device’s main memory.

To address challenge 2, the image data is sent to the display as soon as an entire pixel row is ready.

For challenge 1, flow control is needed, i.e. the USB device must be able to tell the computer to stop sending data until it can accept more. A buffer holding 4 pixel rows is used; this is the maximum amount of data the MCU buffers at any time.
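
To put numbers on this, here is a quick calculation using the display parameters given in the Display Setup section (128 by 160 pixels, 16 bits per pixel). The constant names are illustrative only and are not taken from the firmware.

```python
# Display parameters as stated in the "Display Setup" section
WIDTH_PX = 128
HEIGHT_PX = 160
BYTES_PER_PIXEL = 2   # 16-bit color
BUFFERED_ROWS = 4     # rows the MCU buffers at most

row_bytes = WIDTH_PX * BYTES_PER_PIXEL      # bytes in one pixel row
buffer_bytes = row_bytes * BUFFERED_ROWS    # size of the row buffer
image_bytes = row_bytes * HEIGHT_PX         # size of the full image

print(f"one row: {row_bytes} bytes")        # one row: 256 bytes
print(f"buffer:  {buffer_bytes} bytes")     # buffer:  1024 bytes
print(f"image:   {image_bytes} bytes")      # image:   40960 bytes
```

The full image is 40 times larger than the buffer, which is why rows have to be forwarded to the display while the transfer is still in progress.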

USB Flow Control

To transfer large amounts of data via a USB connection, bulk endpoints are used. They get the bandwidth not used by other types of endpoints, i.e. they have low priority but usually get a lot of bandwidth. Best of all: they have built-in error handling (retransmission of corrupted data) and they have flow control.

If the computer (in its role as USB host) has a large amount of data to send, it will split it into packets of 64 bytes (for USB full speed) and send each packet separately. The USB device will immediately respond with a short message to each packet:

  • ACK: An acknowledge message confirms the successful receipt of a packet. The device will process the received data, and the computer will send the next packet.
  • NAK: A negative-acknowledge message indicates that the packet arrived intact but was not accepted because the endpoint buffer is still occupied while the device processes the previous packet. The computer will resend the packet later and keep trying until the packet is accepted or the operation times out.

There are further responses for error cases that are described in more detailed USB introductions. For flow control, ACK and NAK are relevant. They are the way to inform the USB host whether data can be accepted or not and thereby control the speed of transmission.
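
The ACK/NAK mechanism can be sketched as a toy simulation. This is not real USB code: `split_into_packets`, `send_all` and `SlowDevice` are hypothetical names, and the device's "accept every third attempt" behavior is an arbitrary assumption chosen to make the retries visible.

```python
PACKET_SIZE = 64  # maximum bulk packet size for USB full speed


def split_into_packets(data, size=PACKET_SIZE):
    """Split a byte string into packets of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def send_all(packets, device_accepts, max_attempts=1000):
    """Toy host loop: resend each packet until the device answers ACK.

    `device_accepts` returns True for ACK and False for NAK.
    Returns the number of NAK responses seen.
    """
    naks = 0
    for packet in packets:
        for _ in range(max_attempts):
            if device_accepts(packet):
                break        # ACK: continue with the next packet
            naks += 1        # NAK: the same packet is retried
        else:
            raise TimeoutError("device kept responding with NAK")
    return naks


class SlowDevice:
    """Toy device that accepts a packet only on every third attempt."""

    def __init__(self):
        self.attempts = 0
        self.received = bytearray()

    def __call__(self, packet):
        self.attempts += 1
        if self.attempts % 3 != 0:
            return False     # still busy: NAK
        self.received += packet
        return True          # ACK


data = bytes(range(256)) * 2             # 512 bytes, i.e. 8 packets
device = SlowDevice()
packets = split_into_packets(data)
naks = send_all(packets, device)
print(len(packets), "packets,", naks, "NAKs")   # 8 packets, 16 NAKs
```

Despite the NAKs, the device ends up with exactly the bytes that were sent; the NAKs only slow the transfer down.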

USB Peripheral Design

In order to understand how ACK and NAK responses can be controlled, a basic understanding of the MCU architecture and in particular the USB peripheral is required.

USB Peripheral

An MCU consists of a CPU, flash and RAM memory and several peripherals like UART, SPI and USB. The USB peripheral takes care of many parts of the USB implementation: the electrical encoding and decoding of signals, the lowest level of the USB packet format, USB transactions etc. It is aware of the currently active endpoints and the difference between the endpoint types.

The CPU configures the USB peripheral by writing to USB registers, and it can read the current status from the USB registers. Like many other peripherals, the USB peripheral can trigger interrupts.

USB transmission is so fast that the CPU cannot deliver the data byte by byte, as is possible with I2C, SPI or UART. Instead, the CPU writes an entire packet to a special memory area, from where the USB peripheral transmits it without further interaction with the CPU. Similarly, when a packet is received, the USB peripheral writes the packet data to this memory area. Within the USB memory area, there are separate sections for each configured endpoint.

In the above illustration, the memory area is labelled as USB memory. In an STM32, the memory area is called PMA (for Packet Memory Area). Other MCUs might use a different name, but they work in a very similar fashion.

Receiving Data on a Bulk Endpoint

In addition to transmitting or receiving an entire packet without CPU interaction, the USB peripheral will also immediately respond to a received packet. For error cases, the possible responses are given by the USB standard. For flow control, the CPU needs to tell the USB peripheral ahead of time how to respond to the next received packet. The response can be configured separately for each endpoint.

The typical flow for receiving data on bulk endpoints is:

  1. The CPU configures a bulk endpoint and sets it into the state ready. The USB peripheral can now receive a packet and will immediately respond with ACK.
  2. Once a packet has been received, the USB peripheral switches the endpoint into the state not ready. If a further packet arrives in this state, the USB peripheral discards it and responds with NAK.
  3. The USB peripheral triggers an interrupt thereby informing the CPU that a new packet has been received.
  4. The CPU processes the received data (or at least copies it from the USB memory to another memory area).
  5. If the CPU is able to handle more data, it sets the endpoint into the state ready so the next packet can be received (leading to step 2). Otherwise it leaves it in the state not ready until it has processed the data and sets it into the state ready later.
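
The five steps above can be modelled as a small state machine. This is a simplified sketch, not firmware: the class and function names are hypothetical, the "interrupt" is an ordinary function call, and the rule that the CPU has room for only two packets is an assumption made to trigger the NAK case.

```python
class BulkOutEndpoint:
    """Toy model of a bulk OUT endpoint inside the USB peripheral."""

    def __init__(self, on_interrupt):
        self.ready = True              # step 1: CPU sets the endpoint ready
        self.packet_memory = b""       # stands in for the USB memory area
        self.on_interrupt = on_interrupt

    def host_sends(self, packet):
        if not self.ready:
            return "NAK"               # packet discarded, host retries later
        self.packet_memory = packet
        self.ready = False             # step 2: endpoint becomes not ready
        self.on_interrupt(self)        # step 3: interrupt informs the CPU
        return "ACK"

    def set_ready(self):
        self.ready = True


received = []


def cpu_handler(ep):
    received.append(ep.packet_memory)  # step 4: copy data out of USB memory
    if len(received) < 2:              # step 5: re-arm only while there is room
        ep.set_ready()


ep = BulkOutEndpoint(cpu_handler)
responses = [ep.host_sends(bytes([n]) * 64) for n in (1, 2, 3)]
print(responses)                       # ['ACK', 'ACK', 'NAK']

# Later, once the CPU has caught up, it re-arms the endpoint
# and the host's retry of the third packet succeeds.
ep.set_ready()
retry = ep.host_sends(bytes([3]) * 64)
print(retry)                           # ACK
```

On real hardware the ACK goes out before the interrupt handler runs; the model calls the handler first only to keep the code short.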

Transmitting Data on a Bulk Endpoint

Transmission is slightly simpler as flow control mainly affects the host side and is taken care of there:

  1. The CPU copies data for a single packet into the USB memory area and sets the endpoint into the state ready.
  2. Once the host asks if the device has data to transmit on this particular endpoint, the USB peripheral will transmit the packet.
  3. If the transmission succeeds, the endpoint is set into the state not ready. If the host asks for data again, the USB peripheral responds with NAK, indicating that there is currently no data.
  4. The USB peripheral triggers an interrupt to inform the CPU that the packet has been successfully transmitted.
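
The transmit direction can be modelled in the same toy fashion. Again, this is only a sketch with hypothetical names; the interrupt is represented by a simple counter.

```python
class BulkInEndpoint:
    """Toy model of a bulk IN endpoint inside the USB peripheral."""

    def __init__(self):
        self.ready = False
        self.packet_memory = b""
        self.interrupts = 0

    def cpu_write(self, packet):
        # step 1: CPU copies one packet into USB memory and arms the endpoint
        self.packet_memory = packet
        self.ready = True

    def host_polls(self):
        if not self.ready:
            return ("NAK", b"")        # nothing to send right now
        self.ready = False             # step 3: not ready after transmission
        self.interrupts += 1           # step 4: interrupt informs the CPU
        return ("DATA", self.packet_memory)   # step 2: packet goes to the host


ep_in = BulkInEndpoint()
first = ep_in.host_polls()
ep_in.cpu_write(b"hello")
second = ep_in.host_polls()
third = ep_in.host_polls()
print(first, second, third)
# ('NAK', b'') ('DATA', b'hello') ('NAK', b'')
```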

Display Setup

The project uses a color TFT display with a resolution of 128 by 160 pixels. It includes an ST7735 controller and is operated via an SPI interface with 16-bit colors. It is connected as follows:

Display Connections

Firmware

The main idea of the firmware is to handle received data (pixels of the image) in the USB interrupt (resulting in the call of a registered callback function) and to add the data to a circular buffer. The main loop checks the circular buffer and whenever a full row of pixels is ready, it is sent to the display. The communication with the display can be found in display.h and display.cpp.
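
To illustrate the idea, here is a minimal circular buffer in Python rather than the firmware's C++. The method names `add_data` and `avail_size` mirror the ones used by the firmware callback; everything else (the class layout, `take`, the sizes) is an assumption for this sketch and not the author's implementation.

```python
class CircularBuffer:
    """Fixed-size byte FIFO that wraps around at the end of its storage."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.head = 0    # next write position
        self.tail = 0    # next read position
        self.size = 0    # number of bytes currently stored

    def avail_size(self):
        """Free space in bytes."""
        return len(self.buf) - self.size

    def add_data(self, data):
        if len(data) > self.avail_size():
            raise OverflowError("circular buffer is full")
        for b in data:
            self.buf[self.head] = b
            self.head = (self.head + 1) % len(self.buf)
        self.size += len(data)

    def take(self, n):
        """Remove and return n bytes, or None if fewer than n are stored."""
        if self.size < n:
            return None
        out = bytes(self.buf[(self.tail + i) % len(self.buf)] for i in range(n))
        self.tail = (self.tail + n) % len(self.buf)
        self.size -= n
        return out


ROW_BYTES = 128 * 2                      # one pixel row with 16-bit colors
buffer = CircularBuffer(4 * ROW_BYTES)   # room for 4 rows

# "Interrupt side": three 64-byte packets arrive, which is not yet a full row
for _ in range(3):
    buffer.add_data(bytes(64))
incomplete = buffer.take(ROW_BYTES)      # None, only 192 bytes so far

# The fourth packet completes the row; the "main loop" can now forward it
buffer.add_data(bytes(64))
row = buffer.take(ROW_BYTES)
print(incomplete, len(row), buffer.avail_size())   # None 256 1024
```

In firmware, the interrupt and the main loop access the buffer concurrently, so the real implementation also has to keep the indices consistent between the two contexts; the sketch ignores that.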

The USB descriptor now additionally contains an endpoint declaration (see usb_descriptor.cpp). The endpoint address is 1, the maximum packet size is 64 bytes — the maximum length for USB full speed bulk endpoints.

static const struct usb_endpoint_descriptor comm_endpoint_descs[] = { {
    .bLength = USB_DT_ENDPOINT_SIZE,
    .bDescriptorType = USB_DT_ENDPOINT,
    .bEndpointAddress = 1, // EP_DATA_OUT
    .bmAttributes = USB_ENDPOINT_ATTR_BULK,
    .wMaxPacketSize = 64,
    .bInterval = 0
} };

In usb_set_config (called when the device configuration is selected), the endpoint is set up (instead of registering a callback for vendor calls):

void usb_set_config(usbd_device *usbd_dev, uint16_t wValue)
{
    register_wcid_desc(usbd_dev);
    usbd_ep_setup(usbd_dev, EP_DATA_OUT, USB_ENDPOINT_ATTR_BULK,
                  BULK_MAX_PACKET_SIZE, usb_data_received);
}

As part of the setup, a callback is registered. The callback function is called when a packet has been received. Somewhat simplified, the callback looks like this (see usb_data_received):

void usb_data_received(usbd_device *usbd_dev, uint8_t ep)
{
    // retrieve USB data (this also sets the endpoint to ready again)
    uint8_t packet[BULK_MAX_PACKET_SIZE];
    int len = usbd_ep_read_packet(usbd_dev, EP_DATA_OUT,
                                  packet, sizeof(packet));

    // copy data into circular pixel buffer
    buffer.add_data(packet, len);

    // check buffer: if less than 2 packets fit -> NAK
    if (buffer.avail_size() < MIN_FREE_SPACE)
    {
        usbd_ep_nak_set(usbd_dev, ep, 1);
    }
}

This is the main part of flow control. usbd_ep_read_packet copies the data from USB memory to the specified buffer. It also sets the endpoint into the state ready. After copying the data to the circular buffer for pixel data, the code checks if there is space for another two USB packets. If not, the endpoint is set into the state not ready. (In STM32 MCUs, ready is called VALID and not ready is called NAK.)

The reason the code checks for 2 packets and not 1 is the unfortunate fact that LibOpenCM3 sets the endpoint to VALID in usbd_ep_read_packet. So there is a small window between setting it to VALID and setting it to NAK in which the next packet could arrive while the previous one is still being processed. To cater for this, the endpoint needs to be set to NAK one packet earlier.
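
The effect of the threshold can be checked with a few lines of arithmetic. The sketch below assumes a 1024-byte buffer (4 rows), full 64-byte packets, no draining by the main loop, and that exactly one packet slips through the window after the firmware decides to set NAK. The function name and these worst-case assumptions are mine, not part of the firmware.

```python
PACKET = 64


def free_space_after_race(capacity, min_free_space):
    """Fill an undrained buffer until the firmware sets NAK, then let one
    more packet slip through. Returns the remaining free space in bytes;
    a negative value means the buffer would have overflowed."""
    free = capacity
    while True:
        free -= PACKET                 # packet copied into the buffer
        if free < min_free_space:      # firmware sets the endpoint to NAK
            break
    free -= PACKET                     # packet that arrived in the window
    return free


two_packets = free_space_after_race(1024, 2 * PACKET)
one_packet = free_space_after_race(1024, 1 * PACKET)
print(two_packets, one_packet)         # 0 -64
```

With a threshold of two packets the late packet still fits exactly; with a threshold of one packet it would overflow the buffer by 64 bytes.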

Host software

For the host software, Python is used again. An additional module for handling images is required and needs to be installed:

pip install pillow

The Python code first reads the image from disk and converts it into a byte array of 16-bit pixels in RGB565 format. Locating the device and selecting the default configuration is the same as in the first example. Sending the data is even simpler: a single call to write with the entire byte array is sufficient (2000 is the timeout in ms):

import usb.core
from PIL import Image

DATA_EP = 1

# load image (convert_rgb565 is a helper not shown here)
im = Image.open("parrot.png")
pixels = convert_rgb565(im)

# find device
dev = usb.core.find(idVendor=0xcafe, idProduct=0xceaf)
if dev is None:
    raise ValueError('Device not found')

# set configuration
dev.set_configuration()

# send pixel data
dev.write(DATA_EP, pixels, 2000)
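
The helper convert_rgb565 is not shown in this article. As an illustration of what such a conversion involves, here is a sketch that packs 8-bit RGB values into 16-bit RGB565. The function names are hypothetical, the input is assumed to be a sequence of (r, g, b) tuples, and sending the high byte first is an assumption about what the display expects.

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit red, green and blue into one 16-bit RGB565 value:
    5 bits red, 6 bits green, 5 bits blue (low-order bits are dropped)."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


def pack_rgb565(pixels):
    """Convert an iterable of (r, g, b) tuples into a byte string with
    two bytes per pixel, high byte first (assumed byte order)."""
    out = bytearray()
    for r, g, b in pixels:
        out += rgb888_to_rgb565(r, g, b).to_bytes(2, "big")
    return bytes(out)


red = rgb888_to_rgb565(255, 0, 0)
green = rgb888_to_rgb565(0, 255, 0)
blue = rgb888_to_rgb565(0, 0, 255)
print(hex(red), hex(green), hex(blue))          # 0xf800 0x7e0 0x1f

packed = pack_rgb565([(255, 0, 0), (0, 0, 255)])
print(packed.hex())                             # f800001f
```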

The array is 128 × 160 × 2 bytes = 40,960 bytes long. The single call to write exemplifies the strength of bulk endpoints and the underlying implementation in today’s operating systems. They take care of splitting the data into packets (64 bytes in our case) and sending them to the USB device at the speed the device is capable of handling.

It also shows that a bulk endpoint implements a data stream and is not message oriented like a control endpoint. Even if the Python code called write for every single byte, it would still look more or less the same on the device side: it would mostly receive 64-byte packets. This is because the computer can and will merge data in its buffers.

I hope the tutorial was helpful and will help you to implement a great device with USB communication.
