Random Access Authenticated Messaging — an alternative messaging protocol for IOTA

Robin Lamberti
10 min readOct 31, 2018

--

Introduction

IOTA allows feeless and secure messaging by attaching zero-value transactions to the tangle. While data on the tangle is immutable and therefore tamper-proof, additional techniques are needed to fulfil other security requirements such as confidentiality and authentication.

Another problem is the linking of multiple messages carrying related data. A convenient way to point to multiple messages is needed instead of referencing each zero-value transaction explicitly.

Random Access Authenticated Messaging is an attempt to solve these issues by organizing messages in so called channels and using well-known IOTA-based cryptography.

Why another messaging protocol?

This is a legit question because there already is a capable messaging protocol for IOTA. Masked Authenticated Messaging also provides secure messaging and linked data channels and enables many use cases.

However, it has its limitations. For example, this can be demonstrated with the use of MAM for an audit trail use case. A simple solution would be to attach a QR code to a product at the start of production containing the seed of a MAM channel. At every step of production, the manufacturer adds a message to the channel pointing to a message in its own MAM channel. This way the authorship of the message is secured because only the manufacturer can add messages to its channel.

Everybody who scans the QR code of the finished product can audit whether it has passed all the expected steps of production with the correct manufactures.

But there is a problem. To verify that two MAM messages are belonging to the same author, a reader needs to fetch all messages from the start, because one message points to the next one. And given that a manufacturer produces for example 10000 or more products a day, all these messages need to be fetched when verifying product number 10000. Fetching so many messages takes incredibly long, since a customer normally only wants to audit one specific product and therefore doesn’t have the previous messages cached already.

Thus, a messaging protocol with indexed messages is needed. These must be accessible directly and out of order in O(1), while it’s possible to verify that they belong to the wanted channel, only by fetching this specific message.

How RAAM works

The following section gives a detailed explanation of the building blocks and cryptography RAAM uses to realize the described requirement of accessing indexed messages fast and in arbitrary order.

Message Integrity and Authentication

RAAM uses the same quantum resistant cryptography the IOTA protocol is using. Messages are signed using the hash based Winternitz signing scheme. As a hash function Kerl (Keccac) is used, which can be exchanged easily if the IOTA Curl function has been vetted. Signing ensures the integrity of a message. Only the owner of the private signing key can create a signature for the message payload that will be verified by the public verifying key. For signing different security levels are provided which use longer signatures, addresses etc.

Since the Winternitz signing scheme is a One-Time-Signature scheme, each key pair can only be used once. Using it multiple times would reveal the private key and therefore allow attackers to forge correctly signed messages with the same key pair. Therefore, a different key pair for each message in a RAAM channel is needed. While creating multiple key pairs isn’t a problem, without further methods there is no way to determine whether multiple messages in the same channel have the same author.

As a mechanism to cryptographically proof that someone who holds the private signing key for a message also holds all other keys used in the channel, RAAM uses Merkle trees.

Merkle tree as they are used in RAAM.

All key pairs for the messages of a channel are created upfront. Then the verifying keys are used as the leaves of the Merkle tree. After that the tree is built by putting two leaves together as the input for the hash function. This generated hash value is a new node in the tree which is the parent of the two leaves. Now n keys are represented by an amount of n/2 hash values. The goal is to represent all of them by a single value. This is achieved by repeating this process with the previously created hash values creating a new layer of parents, which is then used as input for the next layer and so on. This is repeated until there are only two values left forming a single value acting as the root of the Merkle tree. The value of the root is the ID of the RAAM channel. Only someone who owns all public keys can create this Merkle root. That’s because the nature of the hash function. It is impossible to recreate the input of a hash function call if only the output (e.g. the Merkle root) is known.

If a message is signed with a key that was used to build the Merkle tree with the given root, this root is the only proof needed to verify that the message author is the owner of the RAAM channel. The verifying key is part of the message to verify its integrity. To check if this key is part of the Merkle tree with the given root, all that is needed is to recreate the Merkle tree and check if the recreated and the given root are the same. For that purpose, the missing parts (hash values) of the tree that are needed to recreate it together with the verifying key are also included in the message. Again, it is impossible for an attacker to forge these missing parts in a way that they can create the same root, because of the nature of hash functions.

Message Address Generation

This means that the only information to read a channel and verify its authenticity is the Merkle root which is used as the channel ID because of that. The next challenge is to make it easy to find a certain message with a given index on the tangle. A RAAM channel can either have no message or exactly one message for a given index. For that, each message is put on a different address. This address can be computed deterministically by using only the channel ID and the index of the message. The address is computed by adding the index to the channel ID, both formatted as trytes, and then hashing this value once. This way the only information needed to read a certain message is the channel ID and the index of the message. If the whole channel should be read only the channel ID is needed and the search for messages simply starts at index 0. The channel ID is like a radio frequency with that you can receive messages, when tuned in.

The computation of a message’s address.

While the hashing is not needed it has several advantages. The first is that this way the address space can be used uniformly at random. If unhashed addresses where used all addresses used for a channel would concentrate around a single value, because the channel ID would be used as the address for the first message and adding 1 to that address would be used for the second and so on.

Another advantage is that it’s impossible to spam an address belonging to a RAAM channel if the channel ID isn’t known. Someone analyzing the tangle and finding similar messages at adjacent addresses could spam the next unused address in that “neighborhood” and therefore slowing down the process of reading these messages.

Because of the deterministic computation of an address for a message, a RAAM channel can be read and written in an arbitrary order. This allows for great flexibility and fast retrieving of messages. If only the 400th message in a channel is needed only one address needs to be queried on the tangle. That means that an arbitrary message can be retrieved in O(1).

Encryption

To protect the information in the message from unwanted readers, it is encrypted with a one-time-pad encryption mechanism. This means that a password can only be used for one time. Otherwise the encrypted messages would become insecure. For making this process convenient in usage, RAAM ensures that the password for a message depends on the index of the message. The default password is the index of the message added to the channel ID. This means that everybody who knows the channel ID can read the message. This way, the channel is public.

But it is possible to restrict the access to messages. A channel password can be set that is applied to all messages this way, that for each message with a given index the index is added to the password. Then each message has a different password it’s encrypted with, but the user only must remember one password. It’s also possible to use different passwords for each message. This way for each message a different password must be remembered. This makes fine granular access restriction possible.

Branching

RAAM allows flexible and fast access of arbitrary messages. However, that comes with a tradeoff. Because the channel ID (Merkle root) must be created before a message can be published, all key pairs must be created as well. That means that a RAAM channel can only hold a finite number of messages, depending on the height of the Merkle tree. While the key pairs can always be recreated by saving the seed of the channel, creating for instance one million key pairs takes time, because it involves a huge number of hashes. Hashing performance can be significantly improved by specialized hardware (e.g. with great projects like PiDiver) and the generated key pairs can be stored securely on disk, but the problem of limited messages remains.

The RAAM protocol aims to tackle this problem with so called branching. That means that a new channel ID can be included in a message, e.g. in the last message available in the channel. With that it is possible to create new key pairs for further messages even if the channel is already used. That makes it possible to point to a new channel from an existing channel either for creating a branch in the data stream or to continue an “exhausted” channel. The authorship of the new channel is authenticated because it was referenced by its ID in the old channel. In the first version of the protocol you need to fetch the message in the old channel containing the next channel ID to verify that and access messages with indexes beyond the initial number of messages. However, approaches to improve this process are currently under development.

Outlook

Currently, the first version of the protocol is published. It is fully usable and there is a reference implementation available on GitHub for JavaScript. Other languages such as C, Java or Python will follow. However, development of the protocol is not finished.

As mentioned previously, I’m trying out different ideas for enhancing the branching feature. One idea is to link the old Merkle tree with the new one by creating one bigger tree out of the both. The root of this tree would function as the new channel ID. Another idea is to create a normal channel and link to it by including the new channel ID in a message but publish the messages of the new channel in the address space of the old one with an offset. All these ideas aim at making the extension of the channel’s capacity easier as the number of messages grows.

If a channel was expanded by linking it to many new channels very often, you’d need to fetch all the messages pointing to a new channel to proof that these have the same author as the original channel. This would be slow. It could be made faster by using so called layers. Currently, a channel consists of exactly one layer, the message layer. You could add another layer that only contains the references to the branching channels. This is comparable to a skip list. Then you need to fetch only one message to find the channel ID of the newest linked channel, which means that you only need to fetch two messages to find your message and authenticate it.

Another idea is to create the possibility to run channels that are controlled by multiple parties. IOTA provides a very useful mechanism for that called multi-signature addresses, which is already used in flash channels today. This would enable several new use cases for example publishing documents that need to be signed by multiple business partners or product manufacturing data of audit trails.

Conclusion

RAAM is a flexible, easy to use messaging protocol using the same quantum resistant cryptography as IOTA and allows for fetching a message in O(1). Therefore, it enables messaging for a variety of use cases which need privacy and integrity for data communication. This includes M2M communication for the IoT in consumer electronics as well as in machines in industrial contexts, such as autonomous data marketplaces, supply chains, mobility and smart cities.

I would like to thank Microhash for his amazing work he does for the IOTA ecosystem and in particular Qubic Lite for which he invented IAM streams, which was a source of inspiration in developing RAAM.

Random Access Authenticated Messaging is a work in progress. Therefore, I like to invite everybody to play around with it and provide feedback to make it even better. A good start is to clone the JavaScript implementation from GitHub. If you have any questions, I’m happy to answer them on discord (crossing#3428). All feedback is much appreciated.

📝 Read this story later in Journal.

🗞 Wake up every Sunday morning to the week’s most noteworthy Tech stories, opinions, and news waiting in your inbox: Get the noteworthy newsletter >

--

--