I was approached by the IOTA Foundation (IF) to develop the IOTA Ecosystem Grant/Donation Tracker [https://transparency.iota.org/]. Developing a greenfield project for a new technology is one of the best parts of being a developer, so needless to say I jumped at the chance. That’s not to say it doesn’t have its challenges but, having already developed the IOTA Pico libraries [https://github.com/iota-pico], I had a pretty good understanding of the technology.
The essence of the tracker was to provide a completely transparent and auditable view of the greenlit grants and donations received by the IOTA Ecosystem Development Fund. This data could have been stored in a regular database, but given that the tracker is for the IF it made sense to utilise their own technology, the Tangle.
Developing on the Tangle presents its own challenges, one of which is that any data you add cannot be changed or deleted, and with the existence of perma-nodes it is there ad infinitum.
Storage on the Tangle — Single Address
The first approach was to use a single IOTA address and just attach transactions with a JSON payload to that address. If we want to edit the data we could have a unique id within the JSON payload and supersede it with an UPDATE operation. The same is true for a delete, given a unique id the JSON payload could contain a DELETE operation.
With this approach we could see the complete audit trail for the data by walking through all the transactions chronologically.
One drawback with this approach is that to get a quick view of the current state of the tracker we must always walk all the data, which over time could become very slow.
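To make the cost of this approach concrete, here is a minimal sketch of replaying the operations to rebuild the current state. The payload field names (`op`, `id`, `data`) and the sample grants are assumptions made up for illustration, not the tracker's real format, and the transactions are a plain in-memory list rather than data read from a node.

```python
import json

# Hypothetical payloads, listed in the order they were attached to the address.
# The field names ("op", "id", "data") are assumptions for illustration only.
transactions = [
    json.dumps({"op": "CREATE", "id": "grant-1", "data": {"amount": 100}}),
    json.dumps({"op": "CREATE", "id": "grant-2", "data": {"amount": 250}}),
    json.dumps({"op": "UPDATE", "id": "grant-1", "data": {"amount": 150}}),
    json.dumps({"op": "DELETE", "id": "grant-2"}),
]


def replay(payloads):
    """Rebuild the current state by applying every operation in order."""
    state = {}
    for raw in payloads:
        entry = json.loads(raw)
        if entry["op"] in ("CREATE", "UPDATE"):
            # A later entry with the same id supersedes the earlier one.
            state[entry["id"]] = entry["data"]
        elif entry["op"] == "DELETE":
            state.pop(entry["id"], None)
    return state


current_state = replay(transactions)
print(current_state)
```

Every read has to process the full list, so the work grows with the total number of operations ever attached rather than with the number of items currently in the tracker.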
The second and larger problem with this approach is that, because we are using zero-value transactions to attach the data, it would be very easy for a nefarious individual to spam the address with their own transactions. We could add a signature to the data to verify which transactions were ours, but because we need to read all the transactions on the address we would still have to parse the malicious data. This would essentially be a denial-of-service (DoS) attack on the address.
Needless to say, this method was not the answer.
Storage on the Tangle — MAM (Masked Authentication Messaging)
The next idea was to use IOTA’s own MAM (Masked Authentication Messaging). I will not go into a detailed explanation of the technology here, but you can read [https://medium.com/@abmushi/iota-mam-eloquently-explained-d7505863b413] for a better understanding.
Using a MAM channel with public access would appear to be the solution to our problems. The data is stored as a message chain, each message on a different address, with data in each message leading to the next. The data is also guaranteed to be from IF because of the way the data is signed and encrypted.
This approach looks like a good solution as it solves the problem of the spam attacks and verifying the data.
All our data is maintained in the channel for audit purposes, but we would still need to walk the whole chain from the start to get the current state of the tracker. Also, we need to decode each message to get the address of the next one, which would lead to numerous requests to a node to get all the transactions from the tangle.
In addition, because the message payloads are encrypted, viewing them directly on the tangle is very opaque.
While addressing some of our issues, MAM introduces some other hurdles.
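The chain walk described above can be pictured with a small simulation. This is not the MAM API: the channel is a hypothetical in-memory dictionary, the addresses and payloads are invented, and signing and encryption are left out entirely. It only shows why reaching the latest message needs one lookup per message.

```python
# Simulated channel: address -> (payload, address of the next message).
# Real MAM derives the next address from the decoded message; here it is
# stored directly, and no signing or encryption is modelled.
channel = {
    "ADDR_A": ("create grant-1", "ADDR_B"),
    "ADDR_B": ("update grant-1", "ADDR_C"),
    "ADDR_C": ("delete grant-1", None),
}


def walk_channel(messages, first_address):
    """Follow the chain from the first message, one lookup per message."""
    payloads = []
    lookups = 0
    address = first_address
    while address is not None:
        # The next address is only known once this message has been read.
        payload, address = messages[address]
        lookups += 1
        payloads.append(payload)
    return payloads, lookups


payloads, lookups = walk_channel(channel, "ADDR_A")
print(payloads, lookups)
```

Because each address is only discovered after reading the previous message, the lookups cannot be issued in parallel, which is what makes the walk slow against a real node.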
Storage in the Tangle — Database
The two previous approaches highlighted the features that we needed to address if we were going to find a storage solution.
- Fast method for reaching current tracker state
- Complete audit history of data
- Guaranteed provenance of data
- Transparency of data on the tangle
Going back to the original concept, which in essence is a database but stored on the Tangle, the following solution based on a database model was devised.
A database has an index which points to the records within a table. We can use this model and create an index which points at the transaction hashes for our tracker data. We store the index on the tangle, and remember elsewhere in our system the transaction hash of the current index. If we read the transaction at that hash we can retrieve the current index data and, in turn, the transactions which represent the current state of the tracker.
If we perform any operation on the data (create, update or delete) we store a new entry on the tangle and update the index as follows:
- Create will add the new transaction hash into the index
- Update removes the hash of the old item from the index and adds the replacement transaction hash
- Delete will remove the transaction hash from the index
After any of these operations is performed, the new index is saved to the tangle and its transaction hash is stored as our current index pointer.
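The three operations can be sketched against an in-memory stand-in for the tangle. Everything here is an assumption for illustration: `FakeTangle`, `Tracker`, the `items` field and the SHA-256 based hashes are hypothetical, and a real implementation would attach transactions through a node and receive network-generated hashes.

```python
import hashlib
import json


class FakeTangle:
    """In-memory stand-in for a node: stores payloads under a derived hash."""

    def __init__(self):
        self.transactions = {}

    def attach(self, payload):
        raw = json.dumps(payload, sort_keys=True)
        # A real transaction hash comes from the network; SHA-256 over a
        # counter plus the payload is used here only to get unique keys.
        seed = f"{len(self.transactions)}:{raw}"
        tx_hash = hashlib.sha256(seed.encode()).hexdigest()
        self.transactions[tx_hash] = raw
        return tx_hash

    def fetch(self, tx_hash):
        return json.loads(self.transactions[tx_hash])


class Tracker:
    def __init__(self, tangle):
        self.tangle = tangle
        # The "current index pointer" that is remembered outside the tangle.
        self.current_index_hash = tangle.attach({"items": []})

    def _items(self):
        return self.tangle.fetch(self.current_index_hash)["items"]

    def _save_index(self, items):
        # Data is never edited in place: a whole new index is attached.
        self.current_index_hash = self.tangle.attach({"items": items})

    def create(self, data):
        tx_hash = self.tangle.attach(data)
        self._save_index(self._items() + [tx_hash])
        return tx_hash

    def update(self, old_hash, data):
        new_hash = self.tangle.attach(data)
        items = [new_hash if h == old_hash else h for h in self._items()]
        self._save_index(items)
        return new_hash

    def delete(self, old_hash):
        self._save_index([h for h in self._items() if h != old_hash])

    def current_state(self):
        # Only the transactions listed in the current index are fetched.
        return [self.tangle.fetch(h) for h in self._items()]


tracker = Tracker(FakeTangle())
first = tracker.create({"grant": "A", "amount": 100})
second = tracker.create({"grant": "B", "amount": 250})
first = tracker.update(first, {"grant": "A", "amount": 150})
tracker.delete(second)
print(tracker.current_state())
```

Note that the superseded and deleted entries are still present in the simulated tangle; they are simply no longer referenced by the current index.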
This process facilitates the current tracker state, but to provide the audit history we add some additional information to the index. When we store a new index, we maintain a last index reference to the previous index. With this last index we can walk back through the index changes and see the whole history of the changes.
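As a rough sketch of that back-reference, the example below stores a `last` field in each index and follows it to recover every earlier version. The dictionary standing in for the tangle, the `TX0000`-style hashes and the field names `items` and `last` are all hypothetical.

```python
import json

tangle = {}  # simulated tangle: transaction hash -> JSON payload


def attach(payload):
    """Store a payload and return a made-up, unique transaction hash."""
    tx_hash = f"TX{len(tangle):04d}"
    tangle[tx_hash] = json.dumps(payload)
    return tx_hash


def save_index(items, last_index_hash):
    """Attach a new index that points back at the index it replaces."""
    return attach({"items": items, "last": last_index_hash})


def history(current_index_hash):
    """Walk the 'last' references from the newest index to the oldest."""
    versions = []
    index_hash = current_index_hash
    while index_hash is not None:
        index = json.loads(tangle[index_hash])
        versions.append(index["items"])
        index_hash = index["last"]
    return versions


grant_v1 = attach({"grant": "A", "amount": 100})
index_1 = save_index([grant_v1], None)      # create
grant_v2 = attach({"grant": "A", "amount": 150})
index_2 = save_index([grant_v2], index_1)   # update
index_3 = save_index([], index_2)           # delete

print(history(index_3))
```

Reading the current state needs only the newest index, while the full audit trail is available on demand by following the chain backwards.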
With the index chain in place we can get the current state and audit data very easily. Since we are always referencing the transaction hashes we created, there is no chance of anyone spamming the data. We also retrieve only the transactions we need, not all those stored on an address.
This should be sufficient to verify the authenticity of the audit data as well, but to make the concept extra secure and provide a guarantee of the data’s veracity we add an RSA-SHA256 signature to the index and transaction data. Because we sign the data with our private key and publish the public key, anyone can verify the data.
Also, since we are not encrypting the transaction payloads, exploring them on the tangle gives a completely transparent view of the data inside.
Our final view of the structure stored on the tangle is as follows:
All that we need to start retrieving data is the Current Index hash.
There are many varied ways of storing data on the Tangle; choosing the appropriate method depends on your goals.
Other features of the grant tracker exist which we have not described, but hopefully we have shown how the fundamental underlying storage mechanism is utilised to store data on the Tangle in a fully transparent, auditable and secure way.
A simplified version of the code which demonstrates the above technique can be found at [https://github.com/iotaeco/iota-database-tutorial].