KatWalk C2: p.3: cutting the wire

Anton Fedorov
32 min read · Mar 23, 2024


Now that we’ve met the treadmill as users, learned how games connect to it, and seen the communication between the Gateway and the Treadmill, it’s time to see how the treadmill itself communicates with its sensors: do we really need the connection between the platform and the PC?

Apparently, no!

How sensors send the data

What do we know about the sensors? We know they are wireless. We know they have a 6-byte-long address. We know the seat comes with a pre-paired USB Bluetooth dongle. We know that the box under the treadmill is called a “receiver”. Which means: if something quacks like a duck, it’s a duck. Maybe a rubber one, but a duck.

How do we look for ducks swimming around? My computer’s motherboard has integrated WiFi and Bluetooth, but the system’s Bluetooth device list doesn’t show anything of interest. There are apps like “Bluetooth LE Explorer”, “Bluetooth LE Lab” and so on, but they don’t show much of interest either. While searching for a desktop app, I saw many apps for the phone, and nRF Connect somehow caught my interest, so I installed it.

When you run the app, it shows tabs for devices that the phone is already connected to, for example:

Some connection view

That’s promising! Maybe with the right app it’ll be trivial and easy?

Switch to the “scanner” tab, then press the “scan” button — indeed, we see multiple “KATVR” devices:

nRF devices scan

Okay, so yes, the sensors are Bluetooth LE devices! Let’s tap on one of them. The foot sensor’s lamp switched from slow blinking to fast blinking and then to a steady light. The app shows the connection tab — but it is completely empty:

KATVR device connected

Strange. Disconnect, connect again, look into the logs:

KATVR device connection log

Nothing. What’s going on?

Okay. Go into the phone settings and enable “Snoop logs”. Connect to the phone with ADB, run Wireshark again, and attach to the phone:

Wireshark android connectors

Connect to the VR sensor from the phone and… yes, we do have the data! There it is:

Wireshark BLE sniffing/snooping

But why were there errors when retrieving the parameters? Still, the data updates are there, yet the app shows nothing, neither in the logs nor in the UI. What’s going on?

Anyway. What we learned so far:

  • Sensors are Bluetooth LE devices.
  • Sensors send a Notification for attribute 0x002E; the contents of the notification strongly resemble the packets we saw on the USB.
  • App on the phone can’t see these packets.
  • Listing the parameters from the sensor returns errors.

How Bluetooth LE works

A somewhat amateurish review of Bluetooth LE. Bluetooth LE has “roles” and a state machine for each member of the wireless dance party. All communication happens by sending and receiving packets on constantly changing frequencies (“frequency hopping”), where each participant knows when and where to listen, and when and to whom to send.

How does one know whom to send to? That part is governed by GAP — the “Generic Access Profile”. For direct communication there are two roles: Peripheral and Central. Peripheral devices periodically send Advertising Data (up to 31 bytes), hopping between three dedicated advertising channels, signaling that they exist. Devices in the Central role listen on these channels (hopping between them) during their scanning periods. If it wants, the central device can, after receiving an advertising packet, request extra data from the peripheral by sending a “Scan Request” to get a second information packet (the “Scan Response”).

Now that we know with whom, we can connect. There are two ways to do it: respond to Advertising Data, or send a connection request directly. Either way, the established connection defines the timing of the communication windows and the channel-hopping algorithm.

The Central role takes the lead. The device in the Central role must track the communication windows and initiate the exchanges. In every agreed communication window, the Central sends a packet on the agreed channel and waits for the response. Next window — next request on another frequency. How often the Central sends requests, and how often the Peripheral must respond, is established when the connection starts and can be adjusted during it. If the peripheral doesn’t respond for longer than agreed, the connection is considered dead. This way devices can find an equilibrium between update latency and battery consumption.

Peripheral devices are the driven ones. They can sleep for as long as the connection parameters allow; they only need to wake up to send an empty packet before the allowed interval runs out. At the same time, they can stay awake and have data ready every time the Central contacts them. They can also ask the Central to switch to a faster or slower connection interval as needed, moving between passive waiting and active data transmission.

Regardless of who leads and who is driven, within a communication window both sides get the chance to send packets in both directions. The next layer in the Bluetooth LE sandwich is the data-exchange structure: GATT (working over ATT) or L2CAP. GATT is a description of the supported “parameters”, where each parameter is identified by a UUID; operations on the parameters’ values (read, write, notification, indication) are done using short 16-bit handles pointing at the exact parameter. A typical session over a connection: connect, fetch the attribute table, then read/write/subscribe to the required parameters.

A GATT Server typically runs on the Peripheral device, but that is just the typical use case. Both sides can run GATT Servers to expose some of their parameters, and both can act as GATT Clients to read/write/listen. While originally built around request-response, there is also the ability to subscribe to parameter changes via Notifications and Indications: a GATT Client subscribes, and whenever the value changes, the GATT Server sends a packet on its own. On top of that mechanism, protocols have been built to transfer large data streams, e.g. to send firmware over the air.

L2CAP is the more general transport underneath. In simple words, it is a way to create a channel where each side can send packets whenever it wants. Under the hood, the ATT protocol runs over the L2CAP channel with the fixed number 4.

How to work with Bluetooth LE

Once again, an amateurish view of how to apply the knowledge above.

So, let’s imagine we need to receive updates from the sensors. A typical application-facing BLE stack (available on Windows, Linux, Android and so on) focuses on supporting the Central role and both the GATT Server and GATT Client, although most of the time you only need the GATT Client API.

The communication then looks like this:

  • Scan for devices. The application either enables scanning (listening for advertising packets) or already knows the MAC address.
  • Connect to the device. Now that we know whom to speak to, we send a connection-establishment packet, after which both parties take their communication addresses and agree on the channel sequence, frequency, packet length and so on.
  • Set up encryption. When/if necessary, the devices may switch to encrypted communication using one of the available methods (a static key, a PIN to confirm and/or derive the session key, or others). An increase of the encryption level may also be requested later.
  • Read the device’s GATT profile. Whichever side is the GATT Client asks for it. Or both.
  • Subscribe to the required parameters (packets setting the Notification/Indication bit are sent).
  • When necessary, parameters are read/written.

The OS Bluetooth stack does a lot of the work: validating packets, putting them in order, redelivering when necessary and so on.

That works as long as everyone involved is protocol-compliant… As with my earlier exploration of the USB stack, I took the first few projects from GitHub — Android-BLE-Connection, BluetoothLeGatt, Android-BLE-Connect-Example.

Of course, all of them were outdated, so I used my own instructions from the previous article to bring them to a compilable state (with some variation in how the BLE stack calls had to be updated). But the effort was useless: while I successfully listed parameters and subscribed to updates from other BLE sensors, connecting to the KATVR sensors didn’t work at all.

I knew the Handle of the parameter I needed — that didn’t help. So I enabled Logcat capture in Wireshark and looked for it:

Wireshark ADB errors log

I read through the Android sources, both the Java part and the Native part.

I tried a few techniques: forcing the stack to inject a fake GATT table using reflection to access the internals; talking to the Native part directly, using hidden IPC interfaces from Java to set up a direct connection. Nada.

I tried the same with Windows’ BLE stack — with the same result. Since the GATT Server doesn’t return the parameter list, there is no way to receive any packets for GATT parameters. Despite the packets being transmitted and received, client applications will never see them. No way.

So, what to do? Yes, you are right.

We need to go deeper

Make our own Bluetooth Central device

Hardware — nRF52840

Since we can’t get the data through multiple layers of far-too-smart OSes that protect us from anything unexpected, we should work with something that doesn’t pay that much attention to what we do.

Choose one…

Since I already had an nRF52840 Dongle (bought to sniff the communication between the original receiver and the sensors), I took a look at its capabilities. Amazingly, it’s quite tasty: support for multiple wireless protocols, convenient to plug directly into a USB port, a large roomy megabyte of flash for code, a quarter of a meg of RAM, all sprinkled with a 32-bit ARM core running at 64 MHz with an FPU. This chip is way more powerful than my first personal computer.

As a cherry on top, the nRF Connect software integrates with VSCode and works right away, giving easy access to many samples that just work — sold! I don’t need to look any further; this one is more than enough. Yes, sure, they recommend working with an nRF DevKit, but I have a dongle already! :) However, a single dongle wasn’t enough: at some point I needed something to prove where I was wrong, so I bought a second dongle, this time a Seeed Studio XIAO nRF52840. It is even more compact and comes with the UF2 firmware loader pre-installed (so you don’t need extra software to update the firmware, just copy a file to a drive), so in the end the project was tuned to run on it, while the nRF Dongle served as a BT sniffer.

Software — Zephyr based

After playing a bit with the provided examples, I got the impression that Zephyr OS does its RTOS job very well: thin abstractions over the hardware and access to anything one may need. With a deep and well-made integration, the nRF SDK makes things work right away, so one can focus on the logic of the task. Yes, perhaps something simpler or more low-level could bring some benefits, but the best is the enemy of the good, and I just want to make it work :)

Zephyr-based projects are laid out to make it easy to enable/disable parts of the code, so, unlike simple IDE projects, you list the source files you want in `CMakeLists.txt`. The project configuration (set in CMakeLists, in the IDE settings, and in a few other places) defines which board to build for, and there are ways to provide additional per-board settings if you need to cross-compile for several of them.

To start, it is enough to know that `prj.conf` sets the RTOS configuration parameters, enabling and disabling the necessary subsystems and features, assuming the base board configuration is already done. If we need additional parameters of our own, we can create a `Kconfig` file. The rest is not important for now.

Step 1: USB Console

Since we’re not working with a DevKit (which means there is no embedded JTAG/console — and I don’t have a JTAG probe nearby), the first task is to get a debug console for good ol’ ~~printf~~ printk debugging. Spoiler: the XIAO nRF52840 has it enabled by default, and not entirely correctly (not via a change in the default Kconfig extension but via a `prj.conf` overlay), so while on the nRF Dongle one needs to make an effort to enable it, on the XIAO dongle one needs to make an effort to disable it :)

To enable the USB Console on nRF Dongle, the first step is to bind the console to the USB-UART device in `app.overlay` file:

/*
 * For now, enable a single USB-UART that acts as a debug console.
 */

/ {
    chosen {
        zephyr,console = &cdc_acm_uart0;
    };
};

&zephyr_udc0 {
    cdc_acm_uart0: cdc_acm_uart0 {
        compatible = "zephyr,cdc-acm-uart";
    };
};

And the second step is to enable devices and subsystems for usb, serial, console and logging in `prj.conf`:

CONFIG_USB_DEVICE_STACK=y
CONFIG_USB_DEVICE_PRODUCT="nRF KAT-VR Receiver"
CONFIG_USB_DEVICE_PID=0x0004
CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n
CONFIG_SERIAL=y
CONFIG_CONSOLE=y
CONFIG_UART_CONSOLE=y
CONFIG_UART_LINE_CTRL=y

CONFIG_LOG=y
CONFIG_LOG_PRINTK=y
CONFIG_LOG_MODE_IMMEDIATE=y

The last step is to initialize and enable the subsystem in `src/main.c`. It is also convenient to add a `BUILD_ASSERT` so we don’t forget the other settings, plus it’s good to add code to `main()` that waits for an observer to connect to the serial console. That wait makes printk debugging more convenient: keep “nRF Terminal” on auto-reconnect, and once new firmware is flashed, you see the logs right from the start.

#include <zephyr/kernel.h>
#include <zephyr/drivers/uart.h>
#include <zephyr/usb/usb_device.h>
#include <zephyr/usb/usbd.h>
#include <zephyr/sys/printk.h>

// Ensure the console is the USB-UART
BUILD_ASSERT(DT_NODE_HAS_COMPAT(DT_CHOSEN(zephyr_console), zephyr_cdc_acm_uart),
             "Console device is not ACM CDC UART device");

int main(void)
{
    int err;

    if (IS_ENABLED(CONFIG_USB_DEVICE_STACK))
    {
        const struct device *const dev = DEVICE_DT_GET(DT_CHOSEN(zephyr_console));

        err = usb_enable(NULL);
        if (err && (err != -EALREADY))
        {
            printk("Failed to enable USB");
            return err;
        }

        /* Poll until the DTR flag is set */
        uint32_t dtr = 0;
        while (!dtr)
        {
            uart_line_ctrl_get(dev, UART_LINE_CTRL_DTR, &dtr);
            /* Give CPU resources to low priority threads. */
            k_sleep(K_MSEC(100));
        }
    }

    printk("*** nRF KAT Receiver ***\n");

    printk("Initialized.\n");
    return 0;
}

Great, now, as we have a base frame, we can build our app around it. Let’s go!

Step 2: Bluetooth Central

The Zephyr BLE stack can do everything I want. My train of thought: we need to find the device and connect to it. Generally speaking, scanning is not required, since all the devices are already known (they are written down during the pairing process), a GATT Server is not needed, and even a GATT Client is not needed, since the sensors send updates right away. Ah, and we need to support 4 connections (left, right, back, seat).

So, add to `prj.conf`:

CONFIG_BT=y
CONFIG_BT_CENTRAL=y
CONFIG_BT_GATT_CLIENT=n
CONFIG_BT_GATT_DM=n
CONFIG_BT_SCAN=n
CONFIG_BT_MAX_CONN=4

Add to `main.c`:

#include <zephyr/bluetooth/bluetooth.h>

To start, let’s just grab the first sensor lying around nearby and hardcode its address into the sources as-is. MAC address bytes are stored least-significant byte first (the order they go over the air), so swap them manually (or use the string parsing function.. for a constant? really?):

const bt_addr_le_t katDevices[] = {
    // KAT_DIR
    {.type = BT_ADDR_LE_PUBLIC, .a = {.val = {0x01, 0x74, 0xEB, 0x16, 0x4D, 0xAC}}}, // AC:4D:16:EB:74:01
};

Next step: add two callbacks, one for connection and another for disconnection. A connection is only possible when the device is nearby, and it can disappear at any time if you walk too far away, the device goes to sleep, or anything else happens. So we need both, to reconnect if the connection dies.

static void device_connected(struct bt_conn *conn, uint8_t conn_err)
{
    if (!conn_err) {
        int conn_num = bt_conn_index(conn);
        printk("Device connected (%p/%d)\n", conn, conn_num);
    } else {
        printk("Connect error (%p/%x)\n", conn, conn_err);
        bt_conn_unref(conn);
    }
}

static void device_disconnected(struct bt_conn *conn, uint8_t reason)
{
    printk("Device disconnected (%p/0x%02x)\n", conn, reason);
}

BT_CONN_CB_DEFINE(conn_callbacks) = {
    .connected = device_connected,
    .disconnected = device_disconnected,
};

Regarding `BT_CONN_CB_DEFINE`: for almost every subsystem, Zephyr OS supports two ways of initialization, static and dynamic. Dynamic initialization requires allocating memory somehow, filling a data structure, and then calling a function. Static initialization is a fully static declaration of the structure using special helper macros. The macro defines a variable with a special name; the linker collects all such variables in one place, and the OS runs the required initializations during startup without any extra code. That makes it possible to write very short, simple programs and yet keep full freedom to go beyond the pre-defined OS capabilities. For Bluetooth connections, one can use `BT_CONN_CB_DEFINE` to define a fixed structure with fixed pointers to fixed callback handlers, or one can fill a `bt_conn_cb` structure and pass it to `bt_conn_cb_register` to enable (and `bt_conn_cb_unregister` to disable) the callbacks later. We don’t need to disable them later, so we use the static way.

The next steps are to initialize the BLE stack and to start connecting to the sensor.

static struct bt_conn *default_conn;

int main(void)
{
    /// ...
    printk("*** nRF KAT Receiver ***\n");
    err = bt_enable(NULL);
    if (err)
    {
        printk("Bluetooth init failed (err %d)\n", err);
        return 0;
    }

    printk("Bluetooth initialized\n");
    err = bt_conn_le_create(&katDevices[0], BT_CONN_LE_CREATE_CONN,
                            BT_LE_CONN_PARAM_DEFAULT, &default_conn);
    if (err) {
        printk("Create conn failed (%d)\n", err);
    }
}

Done. If we flash the firmware now, we can see the application connect to the sensor and lose the connection when the sensor restarts.

Good, but we need to connect to multiple devices, and it would be convenient not to write code connecting to them one by one. Sure, we know that all the devices are nearby and eager to connect, but still, it is better to let the lower level take care of that.

There are two ways to do so. The first is to run a scan and initiate a connection whenever a device is found; the second is to use the passive-scan ability that automatically connects to a discovered device if it passes a filter check. Worth mentioning: scanning (whether active or passive) is time when the radio switches to the advertising channels and listens for packets for a while.

So, automatic wait and connect works this way:

  1. Enable `CONFIG_BT_FILTER_ACCEPT_LIST=y` in `prj.conf`.
  2. Set MAC addresses of interest via `bt_le_filter_accept_list_add`.
  3. Start automatic connection via `bt_conn_le_create_auto`.

Done: once the first device gets into connectivity radius, it gets connected. Once the connection is established, we just need to run `bt_conn_le_create_auto` again to get the next device connected.

In contrast to a direct connection with `bt_conn_le_create`, we don’t get the connection structure back; we see it for the first time only in the `connected()` callback — but we don’t need the structure anyway. The auto-connection parameters define how frequently, and for how long, the device switches to passive scanning mode. If you need to both scan for new connections and keep existing connections active, tweak the parameters accordingly: for example, set up a newly established connection and fetch all required parameters before scanning for the next one. And don’t forget to account for the time slots required (more on that in the steps below).

Step 3: Get ATT packets without the GATT client

And now we’ve reached the point of interest. We have a connection, but where are the packets?! I can’t see anything, even after enabling logging for everything related in `prj.conf`:

CONFIG_BT_A2DP_LOG_LEVEL_DBG=y
CONFIG_BT_BAS_LOG_LEVEL_DBG=y
CONFIG_BT_DF_LOG_LEVEL_DBG=y
CONFIG_BT_HCI_DRIVER_LOG_LEVEL_DBG=y
CONFIG_BT_LOG_LEVEL_DBG=y
CONFIG_BT_RFCOMM_LOG_LEVEL_DBG=y
CONFIG_BT_LOG_SNIFFER_INFO=y
CONFIG_BT_ATT_LOG_LEVEL_DBG=y
CONFIG_BT_GATT_LOG_LEVEL_DBG=y
CONFIG_BT_CONN_LOG_LEVEL_DBG=y
CONFIG_BT_HCI_CORE_LOG_LEVEL_DBG=y
CONFIG_BT_L2CAP_LOG_LEVEL_DBG=y

No packets. That was the moment I finally gave up, bought the second dongle (the XIAO one) and spent time comparing the differences: between the original receiver and mine, between the phone connection and mine. Eventually, by excluding everything else, I got down to the reason: the stream starts only after the central (my receiver) sends a packet changing the connection parameters. And indeed, once I added to the `connected()` callback a call to `bt_conn_le_param_update` with a trivial change (I just altered the connection timeout), the sensor started streaming updates.

To receive that stream, though, I had to dig a little into the OS code. The reason: the GATT Client is a built-in feature that works over the fixed L2CAP channel number and, of course, does much more than I need. All I need is the raw packets, as-is, without any parsing or preprocessing. At the same time, L2CAP itself assumes a channel is not something fixed but something negotiated. So I went into the OS code and cloned the code/macro for making a static channel:

#ifndef BT_L2CAP_CHANNEL_DEFINE
// Include l2cap_internal.h if you build in-tree; otherwise, let's just steal definitions from a compatible version (v2.5.2)
struct bt_l2cap_fixed_chan {
    uint16_t cid;
    int (*accept)(struct bt_conn *conn, struct bt_l2cap_chan **chan);
    bt_l2cap_chan_destroy_t destroy;
};
#define BT_L2CAP_CHANNEL_DEFINE(_name, _cid, _accept, _destroy) \
    const STRUCT_SECTION_ITERABLE(bt_l2cap_fixed_chan, _name) = { \
        .cid = _cid, \
        .accept = _accept, \
        .destroy = _destroy, \
    }
#define BT_L2CAP_CID_ATT 0x0004
#endif

BT_L2CAP_CHANNEL_DEFINE(a_att_fixed_chan, BT_L2CAP_CID_ATT, my_att_accept, NULL);

The idea is simple: we define a fixed channel and name it so that it definitely gets processed before the GATT Client’s (in case we enable it for some reason). Conveniently, the GATT Client already tries to be last in the initialization order, which makes everyone happy. This way I keep the freedom to enable the GATT Client later if I need to support a standard connection as well.

For each established connection, the L2CAP `accept` callback needs to set up the message handlers for this channel on this particular connection. The callback should check whether the connection is one we want (for me, any connection is good), then provide a structure with the callbacks specific to that connection. Once again we benefit from working close to the hardware on an RTOS: no memory allocation is needed. There can be at most CONFIG_BT_MAX_CONN connections, and for any connection we can trivially and cheaply get its index with `bt_conn_index`. That makes memory management as trivial as one static array. Here is our L2CAP accept handler:

static struct bt_l2cap_le_chan my_att_chan_pool[CONFIG_BT_MAX_CONN];
static int my_att_accept(struct bt_conn *conn, struct bt_l2cap_chan **ch)
{
    static const struct bt_l2cap_chan_ops ops = {
        .recv = my_att_recv,
    };

    int id = bt_conn_index(conn);
    printk("Capturing L2CAP ATT channel on connection %p/%d\n", conn, id);

    struct bt_l2cap_le_chan *chan = &my_att_chan_pool[id];
    chan->chan.ops = &ops;
    *ch = &chan->chan;
    return 0;
}

The callback is invoked during connection establishment, every time a new connection is set up. From then on, any packet sent over this L2CAP channel lands in our receive handler:

static int my_att_recv(struct bt_l2cap_chan *chan, struct net_buf *req_buf)
{
    int id = bt_conn_index(chan->conn);

    printk("Received packet for conn %d [%d bytes]:", id, req_buf->len);
    for (int i = 0; i < req_buf->len; ++i) {
        printk(" %02x", req_buf->data[i]);
    }
    printk("\n");

    return 0;
}

Done: we’ve got our update stream, and we can do whatever we want with the data — set up arrays holding the current sensor state, write functions to parse the feet sensors’ data and the direction sensor’s data.

Step 4: USB HID composite device to mimic the original receiver

Since we’ve succeeded in obtaining the data, we should send it to the computer. The simplest way is to mimic the original receiver. This way we’ll also make something useful to others: users will be able to use their VR treadmill without a cable across the room. Or we could make “real native games” that need only a tiny dongle attached to the headset to play wirelessly. Basically, lots of benefits for minimal work.

Headset upgrade

To mimic the original dongle we should (a) clone its VID/PID (which is trivial), (b) clone HID descriptor (a little less trivial), (c) clone the protocol.

To clone the VID/PID we just set them in `prj.conf`. To make the gateway see the clone as the original device and not report that it found a second device, let’s clone the serial number as well.

CONFIG_USB_DEVICE_VID=0xC4F4
CONFIG_USB_DEVICE_PID=0x2F37
CONFIG_USB_DEVICE_SN="CRA21D60xxx"

A side note here. By default, Zephyr OS uses device-unique serial numbers based on chip-unique data. But when I tried the default serial number, KAT Gateway just silently crashed. By debugging in IDA (running KATDevideSDK via rundll) I found the reason: a too-long serial number. The library used by the gateway expects the serial number to be at most 12 characters and has no length checks. So I set my own serial number, and it worked, showing the same serial as the original receiver — but the funny thing is that this itself relies on a bug, which I didn’t know until the very end! Generally speaking, just setting the serial in `prj.conf` shouldn’t work :)

Cloning the HID descriptor is less trivial, but still simple enough. We enable an HID device and set the required HID communication settings in `prj.conf` (I spent a while trying to understand why it didn’t work at first — the default packet size is 16 bytes). We need one HID device with interrupt endpoints for outgoing and incoming packets. Since we still need the serial console for debugging, we also enable composite device support. A composite device is a technique for implementing several different USB functions in one device; the OS (Windows, Linux, Android, etc.) splits it into separate parts and uses standard drivers for each of them, so applications and drivers don’t need special handling for the composite case, and the device doesn’t need to mimic a USB hub to expose them.

CONFIG_USB_DEVICE_HID=y
CONFIG_USB_HID_DEVICE_COUNT=1
CONFIG_HID_INTERRUPT_EP_MPS=64
CONFIG_ENABLE_HID_INT_OUT_EP=y
CONFIG_USB_COMPOSITE_DEVICE=y
CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n

There are several ways, as usual, to get the original HID descriptor. One is win-hid-dump, which produces this:

C4F4:2F37: KATVR - walk c2 receiver
PATH:\\?\hid#vid_c4f4&pid_2f37#8&a136e90&0&0000#{4d1e55b2-f16f-11cf-88cb-001111000030}
DESCRIPTOR:
06 A0 FF 09 01 A1 01 09 01 15 00 25 FF 35 00 45
00 65 00 55 00 75 08 95 20 81 02 09 02 91 02 09
03 95 05 B1 02 C1 00
(39 bytes)

Or attach the USB device to the WSL2 Linux by running:

> winget install --interactive --exact dorssel.usbipd-win
> usbipd bind --hardware-id c4f4:2f37
> usbipd attach --wsl -i c4f4:2f37

And then from the WSL shell run usbhid-dump:

$ sudo usbhid-dump 
001:002:000:DESCRIPTOR 1710781357.084389
06 A0 FF 09 01 A1 01 09 01 15 00 26 FF 00 75 08
95 20 81 02 09 02 75 08 95 20 91 02 09 03 75 08
95 05 B1 02 C0

Both ways give similar descriptors, although win-hid-dump output is reconstructed from the Windows USB device cache, while usbhid-dump prints the exact descriptor as the device sent it over the wire.

Regardless of how the descriptor was obtained, the data is fed into the USB Descriptor and Request Parser and used to reconstruct the descriptor:

static const uint8_t hid_report_desc[] = {
    HID_ITEM(HID_ITEM_TAG_USAGE_PAGE, HID_ITEM_TYPE_GLOBAL, 2),
    0xA0, 0xFF,                                 // Usage Page (Vendor Defined 0xFFA0)
    HID_USAGE(0x01),                            // Usage (0x01)
    HID_COLLECTION(HID_COLLECTION_APPLICATION), // Collection (Application)
    HID_USAGE(0x01),                            // Usage (0x01)
    HID_LOGICAL_MIN8(0x00),                     // Logical Minimum (0)
    HID_LOGICAL_MAX16(0xFF, 0x00),              // Logical Maximum (0xFF)
    HID_REPORT_SIZE(8),                         // Report Size (8)
    HID_REPORT_COUNT(32),                       // Report Count (32)
    HID_INPUT(0x02),                            // Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)
    HID_USAGE(0x02),                            // Usage (0x02)
    HID_REPORT_SIZE(8),                         // Report Size (8)
    HID_REPORT_COUNT(32),                       // Report Count (32)
    HID_OUTPUT(0x02),                           // Output (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position,Non-volatile)
    HID_USAGE(0x03),                            // Usage (0x03)
    HID_REPORT_SIZE(8),                         // Report Size (8)
    HID_REPORT_COUNT(5),                        // Report Count (5) -- why?!
    HID_FEATURE(0x02),                          // Feature (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position,Non-volatile)
    HID_END_COLLECTION,
};

As far as I understood, Zephyr today has two USB device stacks: one deprecated (well, frozen feature-wise) and one under construction (and thus unstable). That means the old one has some rough edges (e.g. no static HID initialization) and the new one may break at any time, so I didn’t dig into it. The following sets up the HID device dynamically:

static const struct device *hiddev;
static const struct hid_ops usb_ops = {
    .int_in_ready = int_in_ready_cb,
    .int_out_ready = int_out_ready_cb,
};

int start_usb(void)
{
    hiddev = device_get_binding("HID_0");
    if (hiddev == NULL)
    {
        return -ENODEV;
    }

    usb_hid_register_device(hiddev, hid_report_desc, sizeof(hid_report_desc), &usb_ops);

    int err = usb_hid_init(hiddev);
    if (err)
    {
        printk("usb_hid_init failed: %d\n", err);
        return err;
    }

    return usb_enable(NULL /*status_cb*/);
}

Once again, you need the `CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n` setting to make sure the USB device isn’t initialized too early (for the USB console), which would prevent the HID setup from working properly.

The `usb_ops` struct sets the callbacks, of which we need two: one for “packet has been sent” and one for “packet received”. We won’t need anything else for our case. As we discovered in the previous part, the whole communication runs over URB_INTERRUPT packets, so there is nothing special to learn.

Let’s sketch the diagram of USB communication states. We don’t need any queues: we either received a packet (and may respond to it), or sent something (and may continue sending), or we are free and may send whatever we want. We only receive commands from the gateway; some require an answer, others don’t. On the sending side, we transmit either a command reply or fresh sensor data.

Thus, the whole logic is:

  • A packet arrived: process it.
    ◦ If the packet requires an answer: send it if the sender is free, buffer it otherwise.
  • A packet was sent: check whether an answer is pending.
    ◦ If an answer is pending: send it.
    ◦ If no answer is pending: check for fresh sensor data, and send it if there is any.
    ◦ If there is no fresh data: we are free.
  • When new sensor data arrives: check whether the sender is free, and send it if so.

So we need buffer space for just one command, and since we only answer incoming commands, we can simply reuse the input buffer.

So, the final USB infrastructure logic would be:

static void usb_write_and_forget(const struct device *dev, void *buf)
{
	int wrote = -1;
	hid_int_ep_write(dev, buf, KAT_USB_PACKET_LEN, &wrote); // feeling lucky
	if (wrote != KAT_USB_PACKET_LEN)
	{
		// This shouldn't happen. To avoid hanging USB, release it and try again next time.
		atomic_clear_bit(&usbbusy, cUsbOutBusy);
		printk("Send output bug: wrote only %d bytes instead of %d.\n", wrote, KAT_USB_PACKET_LEN);
	}
}

static void usb_send_or_queue(const struct device *dev, void *buf)
{
	// Check the old busy status; send right away if we were free.
	if (!atomic_test_and_set_bit(&usbbusy, cUsbOutBusy))
	{
		usb_write_and_forget(dev, buf);
	}
	else
	{
		// Queue the buffer till the next time.
		if (!atomic_ptr_cas(&usb_queue, NULL, buf))
		{
			printk("Output queue is busy. Packet is lost. [should not happen (tm)]");
		}
	}
}

static void int_in_ready_cb(const struct device *dev)
{
	// We enter assuming we own the busy lock.
	// If there is a queued packet to send -- send it now.
	atomic_ptr_val_t queue = atomic_ptr_clear(&usb_queue);
	if (queue)
	{
		usb_write_and_forget(dev, queue);
	}
	else
	{
		// If there are no queued packets -- try to send a fresh update.
		if (!send_update_packet(dev)) {
			// If there was nothing to send -- clear the busy flag.
			atomic_clear_bit(&usbbusy, cUsbOutBusy);
		}
	}
}

tKatUsbBuf usb_command_buf;
static void int_out_ready_cb(const struct device *dev)
{
	int read = -1;
	int err = hid_int_ep_read(dev, usb_command_buf, sizeof(usb_command_buf), &read);
	if (!err) {
		if (handle_kat_usb(usb_command_buf, read))
		{
			usb_send_or_queue(dev, usb_command_buf);
		}
	}
}

On top of this structure we build a trivial protocol handler and equally trivial sensor data encoders.

Step 5: tweak timings

Now we have everything up and running, but it doesn’t really work: the gateway complains that there is no connection, and the bits of data that do get through are slow and laggy.

First of all, the USB communication is slow. By default the USB polling interval is set to 9 ms, which gives us only 111 packets per second, since we don’t group or buffer data to send. That problem is fixed with a simple

CONFIG_USB_HID_POLL_INTERVAL_MS=1

which brings USB transfers up to 1000 pps.

The second problem is that the BLE communication is slow. There are multiple reasons for this.

BLE itself (at the 1 Mbit PHY) works with time slots rounded to a multiple of 1.25 ms. The “Connection Interval” sets how often each communication window starts: it is specified as a number of 1.25 ms units between the starts of communication windows, and the minimum setting is 6 (once every 7.5 ms).

The communication itself also takes time, at least one unit (in reality it’s a little more complicated). In the KATVR sensors’ case, despite them asking for a 250-byte MTU, the actual communication is one packet of ~0x20 bytes, so all we need during the window is to send a tiny packet towards the sensor and receive the answer packet; let’s say that’s 1 unit there and 1 unit back. So we have 1000/1.25 = 800 units per second, 2 units for communication with each sensor, 3 sensors, which gives us 800/3/2 = 133 Hz as the communication limit. To reach 133 Hz we should set the connection interval to just 6 units (2 units for the sensor, plus a 4-unit gap for the two other sensors) and then shorten the communication window from the default 7.5 ms. The minimum communication window size is 1 unit (1250 µs), but to get stable communication I had to set it higher; it became stable from around 1500–1600 µs. To give it more leeway I’ve set it to 2 ms:

#prj.conf
CONFIG_BT_CTLR_SDC_MAX_CONN_EVENT_LEN_DEFAULT=2000

So the connection parameters are:

static const struct bt_le_conn_param btConnParam = BT_LE_CONN_PARAM_INIT(6, 6, 0, 2000);

It is still worth keeping in mind that until we have established connections to all the required sensors, we also need time windows to scan for devices and to establish connections. That doesn’t matter during the main streaming phase, when everything is connected and sending data, but it does affect the situation while not all sensors are connected yet.

Unfortunately, at 133 Hz the sensors start to slip: they skip some time slots without sending data (that one was actually fixed by increasing the connection window length to 2000), and sometimes (more precisely, quite often) they send packets with zero deltas. That leads to an unusable situation: you move the sensor evenly, but it suddenly reports zero in one of the packets, breaking the computations.

As an interim solution… I’ve reduced the update frequency down to 100 Hz, at which the communication became stable. To get 100 Hz we should update each sensor every `1000/1.25/100 = 8` units.

static const struct bt_le_conn_param btConnParam = BT_LE_CONN_PARAM_INIT(8, 9, 0, 2000);

Step 6: application settings

KAT’s pairing process between the receiver and the sensors works by sending commands over USB that write the sensors’ MAC addresses, which means it’s time to drop the hardcoded list of sensors and replace it with settings sent over USB. Of course, it is inconvenient to re-pair every time (one has to connect the sensors one by one to the computer with a cable, which requires unscrewing the sensor’s backplate out of the console…), so we should support persistent storage for the settings.

Zephyr, as one can imagine, has support for storing settings with several backends, which can use flash memory as well.

So all we need is to enable the required dependencies: the settings subsystem itself, settings in non-volatile storage (NVS), a driver for NVS, support for writing into flash, and the chip-specific flash memory map that says where to write the settings.

CONFIG_NVS=y
CONFIG_FLASH=y
CONFIG_FLASH_MAP=y
CONFIG_SETTINGS=y
CONFIG_SETTINGS_NVS=y

Enabling these options brings in everything required, and as long as the board has a correct description, there isn’t much else to do to get the settings subsystem working.

In a way typical for Zephyr OS, settings handlers can be registered either dynamically or statically; again, the static way is fine for us:

SETTINGS_STATIC_HANDLER_DEFINE(
	katrc, "katrc",
	/*get=*/NULL,
	/*set=*/katreceiver_settings_set,
	/*commit=*/NULL,
	/*export=*/katreceiver_settings_export);

The settings API provides multiple usage patterns depending on one’s scenario. All settings at once can be saved by calling `settings_save()`, which runs the configured `export` callback to save all of the settings; or individual settings can be saved where needed by calling `settings_save_one()`. Loading happens by calling `settings_load()`, which scans the configured storage regions and calls the `set` callback with the very last copy of every setting found in storage; once the scan of the storage region is complete, `commit` is called to confirm that all settings are loaded. In our case we only need to load settings at startup, so defining the `set` and `export` callbacks is sufficient.

All settings are stored as a key=>id=>value mapping, where the key is the parameter name string and the id is a system-generated unique handle. The flash stores a table for the key=>id mapping, and separately stores the id=>value mapping. The settings driver takes care of returning only the latest value set for an id, of appending values only if they have changed since the last time, and, when the end of a flash sector is reached, of collecting and moving all the latest versions into the next sector before erasing the current one. Basically, everything is taken care of.

While it is reasonably well thought out, there are a few inconveniences caused by the asymmetry of write and read: the `set` function (which loads a value) is called with the name with the prefix already cut off (the “katrc” set up in the struct), while the `export` function should use the full path for each of the names.

So the `export` handler looks like this:

int katreceiver_settings_export(int (*export_func)(const char *name, const void *val, size_t val_len))
{
	int ret;

	ret = export_func("katrc/devCnt", &numKatDevices, sizeof(numKatDevices));
	if (ret < 0) return ret;

	char argstr[100] = "katrc/dev/";
	char *argsuffix = &argstr[strlen(argstr)]; // argsuffix now is the pointer beyond "/"
	for (int dev = 0; dev < numKatDevices; ++dev) {
		sprintf(argsuffix, "%d", dev); // argstr now katrc/dev/N
		ret = export_func(argstr, &katDevices[dev].a, sizeof(katDevices[dev].a));
		if (ret < 0) return ret;
	}

	return 0;
}

Here we just call `export_func` (provided by the currently active settings backend) for each of the settings. The reading callback looks like this:

int katreceiver_settings_set(const char *key, size_t len, settings_read_cb read_cb, void *cb_arg)
{
	const char *next;
	int ret;

	if (settings_name_steq(key, "devCnt", &next) && !next) {
		if (len != sizeof(numKatDevices)) {
			return -EINVAL;
		}

		ret = read_cb(cb_arg, &numKatDevices, sizeof(numKatDevices));
		if (ret < 0) {
			numKatDevices = 0;
			return ret;
		}
		return 0;
	}

	if (settings_name_steq(key, "dev", &next) && next) {
		int dev = atoi(next);

		if (dev < 0 || dev >= ARRAY_SIZE(katDevices)) {
			return -ENOENT;
		}

		if (len != sizeof(katDevices[dev].a)) {
			return -EINVAL;
		}

		katDevices[dev].type = BT_ADDR_LE_PUBLIC; // Well, it's just zero, so a no-op.
		ret = read_cb(cb_arg, &katDevices[dev].a, sizeof(katDevices[dev].a));
		if (ret < 0) {
			memset(&katDevices[dev].a, 0, sizeof(katDevices[dev].a));
			return ret;
		}
		return 0;
	}

	return -ENOENT;
}

The read callback is expected to handle every parameter found in the storage. It is called with the latest available value and the name of the setting, but the name already has the prefix cut off. I understand how it got that way: it’s easier to implement (the prefix is used to find the right callback structure), it saves a few CPU cycles by avoiding extra context passing and a lot of CPU cycles by avoiding string recombination. That’s why it was implemented that way, but the programmer should keep it in mind.

To work with parameter names there are helper functions somewhat similar to `strtok`: you can check a parameter name for prefix equivalence, and whether it is a leaf (in the code above, the first branch checks that the parameter name is exactly `katrc/devCnt`) or a transient node/directory (the second branch checks that the parameter is `katrc/dev/` with something after the slash).

As a good practice, it is worth checking that the parameter value size matches the expectation (in case the code changes over time but the data stored in the flash does not).

Since some Zephyr OS modules also store their parameters in settings (e.g. if you use Bluetooth pairing, the agreed encryption keys are saved there), `settings_subsys_init()` (which loads and sets up the driver that handles settings) and `settings_load()` (which loads the parameters themselves) should be called after all the other subsystems are initialized and ready to take the final tweaks.

The remaining minor pieces (like “disconnect from sensors”, “disable autoconnect”, “reset the connection filter”, “load filters”, etc.) the reader can find in the source code.

Step 7: make it user-friendly

While I am free to engrave the serials of my sensors in cuneiform right on the sensors’ boards, nobody else will do that, as it’s plain stupid and inconvenient. And nobody will compile their own firmware, for sure.

Let’s collect the UX requirements:

  • The device should work instantly once connected over USB (certainly no waiting for a serial console for the end user)
  • It would be great to avoid the re-pairing procedure:
  • \ C2/C2+ use the same PID, but C2Core has a different PID (with the same protocol and sensors)
  • \ The device should report the same serial as the user’s original receiver (so the gateway can’t distinguish them)
  • \ The device should be able to receive the current pairing data
  • The initial flashing and configuration should be trivial.

Removing the debugging settings was a little harder than I expected. The main reason: the Seeed board description enables them via an overlay on top of `prj.conf`. So my attempt to override the defaults via Kconfig failed: `SERIAL`, `CONSOLE`, and other settings all remained enabled.

So, as a solution, I’ve put the fixed release settings into `prj.conf`, and to support debug versions I’ve added a second `prj-debug.conf` which can be added as an extra overlay. A minor inconvenience, but it works.

I’ve spent some time looking for a way to support C2Core by updating the device’s USB PID dynamically. The old USB driver doesn’t support that, and I didn’t want to dig into the new driver unless I had to. Since I already had a debug overlay, I just made a separate C2Core overlay and wrote a tiny release build script for my convenience.

As I’ve mentioned earlier, the USB driver supports updating the serial number during load. But it only works once and is executed far too early, during driver initialization, even before the settings are loaded. Additionally, I would like to be able to change the serial number on the fly, preferably without resetting the USB connection, so as not to lose the debug console.

And I did this.

To make this happen, I had to copy some bits of logic out of the driver: the old driver relies on a fixed sequence of descriptors, so I traverse it the same way, find the required descriptor by its index, and copy the ASCII string into the UTF-16 string. Done. True, that doesn’t update the OS cache (so UsbTreeView doesn’t show the updated serial number), but the API functions for working with HID devices request the descriptors every time, so they return the actual information. With the structures updated dynamically this way, the gateway sees my receiver as the original.

Additionally, the dongle needs the ability to change its serial, and we need a way to actually send it the new one. If you remember, during the USB protocol discovery last time there was a “SetSN” function (command 7), which is the way to set it. Now we need something that sends this command to the receiver, since the gateway doesn’t expose this functionality. Luckily, the gateway is written in C#, and PowerShell is basically a REPL for C#, which made it possible to set both the serial and the pairing information via a script, reusing the Gateway’s HID communication functions:

Add-Type -Path "C:\Program Files (x86)\KAT Gateway\IBizLibrary.dll"
[IBizLibrary.KATSDKInterfaceHelper].GetMethod('GetDeviceConnectionStatus').invoke($null, $null)
$dev = New-Object IBizLibrary.KATSDKInterfaceHelper+KATModels
[IBizLibrary.KATSDKInterfaceHelper]::GetDevicesDesc([ref]$dev, 0)
[IBizLibrary.ComUtility]::KatDevice='walk_c2'
[IBizLibrary.Configs].GetMethod('C2ReceiverPairingInfoRead').invoke($null, $null)

$newSn = [IBizLibrary.KATSDKInterfaceHelper]::receiverPairingInfoSave.ReceiverSN
[byte[]]$newSnArr = $newSn.ToCharArray()
[byte[]]$ans = New-Object byte[] 32
[byte[]]$command = 0x20,0x1f,0x55,0xAA,0x00,0x00,0x07,0x00 + $newSnArr + $ans
[IBizLibrary.KATSDKInterfaceHelper]::SendHIDCommand($dev.serialNumber, $command, 32, $ans, 32)

[IBizLibrary.KATSDKInterfaceHelper]::GetDevicesDesc([ref]$dev, 0)

$pairing=[IBizLibrary.KATSDKInterfaceHelper]::receiverPairingInfoSave.ReceiverPairingByte
[byte[]]$ans = New-Object byte[] 32
[byte[]]$command = 0x20,0x1f,0x55,0xAA,0x00,0x00,0x20,0x00 + $pairing + $ans
[IBizLibrary.KATSDKInterfaceHelper]::SendHIDCommand($dev.serialNumber, $command, 32, $ans, 32)

With a few dozen checks and validations, plus messages for the user, I’ve made a simple way to clone the original receiver’s settings.

As the last step, that script was extended to detect the user’s treadmill (C2/C2Core) and to copy the correct firmware from the release onto the exposed UF2 flashing drive before calling the settings upload functions.

There I learned that PowerShell is too Power-ful, to the degree of being disabled by default in the system, so the user can’t just run a script with a double-click. So an extra proxy cmd script was made, which runs the PowerShell script. Yes, you can’t run a PowerShell script directly, but you can call a cmd script that calls the PowerShell script. I don’t see how that is secure, but it works for me.

As a result, the release package now contains two firmware binaries, a `README.txt`, an `install.cmd` to run, and the installation PowerShell script. A paranoid user can read all the installation scripts before running them.

Even more paranoid users are welcome to read the whole source code; there is not much of it :)

Step 8: make it developer-friendly

… At this point the source code has become spaghetti. Packet processing here, settings there, one part calling another through a mix of pointers and global structures. It’s a mess. But then, when was software ever made differently?

“What do software and sausages have in common? You’d better not know how they are made.”

So I’ve spent a couple more weekends on the second favorite software engineering activity (right after building something from scratch): refactoring.

I think the result is good enough, with a well-balanced division of responsibility and minimal erosion of the borders:

  • All `kat_ble*` files contain the BLE processing; `kat_ble.c` only depends on `kat_main.h` and `kat_ble_pack.c`, and the only external dependency on the USB part is the `kat_usb_try_send_update()` call, used to send an update when we are free. I could hide it behind a thin connector inside `main()`, but I don’t think it is worth it.
  • All `kat_usb*` files contain the USB processing; in a similar way, `kat_usb.c` only depends on `kat_main.h` and the other `kat_usb*.c` files, and the only external dependencies on the BLE part are `kat_ble_get_localaddr` and `kat_ble_update_devices`, which handle the configuration part of the protocol.
  • The USB serial number update hack moved into its own file, `kat_usb_serialno.c`, so when it gets broken by a Zephyr upgrade and the migration to the new USB driver, it will be easy to find what to replace to restore compatibility.
  • `main.c` contains the initialization of everything and the handling of the settings.

To make it easier to get help from the beta testers and see what exactly went wrong, I found some draft terminal code which, with a little polishing, turned into a convenient cmd script.

Now all I need is to send the debug firmware binary and ask the tester to double-click the reset button, copy the debug firmware over, run `serial-log.cmd`, and send me a screenshot or the text copied out of the window. Very convenient for finding out where I was wrong :)

Step 9: Profit!

Voilà, it’s alive! Not just alive: it also has lower input latency than the original receiver.

Check it yourself.

You should be able to see for yourself which one is mine and which one is original :)

Utopia Machine made a new video for the topic.

BTW: many thanks to him for the testing!

What’s next

It’s a shame that while I did get 133 Hz updates from the sensors, they were quite unstable: skipped packets, empty packets. So I had to release a 100 Hz version only, as 133 Hz is impossible to use. Although… Is it impossible? They work for a while!

We need to go EVEN deeper

So, yes, next week we’ll see what’s inside the sensors, how to read their firmware, and how to fix the issue that causes the packet loss and zero packets. We’ll also learn how to make useless binary patches and how to implement diff/patch without any tools beyond the ones already in the system.

Links

