Our hackathon success story
Disclaimer: This is a redacted and slightly dramatized version of our Devpost submission that you can find here.
A few weeks ago we published our vision of a DLT data notary service for an IoT data stream. At more or less the same time, we were invited to participate in the Diffusion 2019 Hackathon in Berlin. It was time to put our money where our mouth is and show that our data notary concept was more than just a catchy idea. We decided to build it over the weekend during the hackathon organized by Outlier Ventures at the amazing Factory Berlin Görlitzer Park venue.
We defined our product narrative around the issue of data trust: information is power and data is the new oil. Whoever controls data, or has the power and resources to manipulate it, can influence entire countries. Similarly, content can be maliciously doctored or manufactured altogether (surely deepfake rings a bell, right?).
Detailed information about the origin of data is becoming vitally important to our societies. Our submission was all about bringing back trust in the data and giving users the power to validate the source of the content they consume. We also wanted to bridge the gap between a fully decentralized web and the current centralized infrastructure of Web 2.0.
So what does our product do?
You've all heard the story: once on the distributed ledger, information is considered secure, immutable, tamper-resistant and permanent. Our ambition was to seamlessly extend this trust to include sensors, gateways and other devices provisioning data, and to do that with little to no change to existing data pipelines and infrastructure. At the same time, we wanted to abstract the DLT layer away from the end user, since the majority of the population needs a transition period to get acquainted with the decentralized web.
In our long-term vision, Reksio creates a trust layer on top of your existing data layer. It's a data notarization service that is both DLT- and storage-agnostic, specifically designed for streaming data: video and IoT sensor measurement streams. The trust layer is used to verify that the data we own has not been modified, serving as proof to ourselves and to third-party consumers. Example scenarios include a video surveillance maintainer providing an extra means of security, a court looking to authenticate evidence, or a media agency looking to tamper-proof video assets.
How did we build it?
Our goal was to create an intuitive, low-cost, non-intrusive MVP capable of efficiently handling high-velocity, high-volume video and data streams while working around the storage inefficiency of the Ethereum network.
For Diffusion 2019, our team decided to build a version that handles the notarization of a live video feed only. We started with a very simple landscape simulating the status quo, using the following components:
- RTSP server (based on GStreamer) producing a video stream. It was deployed on a Raspberry Pi Zero with a 1080p camera attached.
- RTSP client (based on openRTSP) receiving the data, generating 30-second long
- S3-compatible storage (we used Minio) to host the client output objects.
Pluggable data notary
Reksio is an agent that can be seamlessly plugged into an existing architecture like the one above. It is positioned between the RTSP client and storage, calculates hashes of the data on-the-fly and commits the hashes to the blockchain to effectively create a permanent data seal of authenticity.
The following core services are part of the implementation:
- Secretary is a proxy agent that (a) intercepts the RTSP client output files, (b) calculates hashes for each file, (c) sends the files to the S3 storage and (d) sends the calculated hashes, including file name, to the Notary service.
- Notary is an agent communicating with the DLT for (a) storing the filenames and their corresponding hashes and (b) retrieving hashes from the DLT.
- Frontend to facilitate configuration and provide UI with dashboards.
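The Secretary's hashing step and the record it hands to the Notary can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the choice of SHA-256, the function names (`sha256_stream`, `build_notary_record`, `verify`) and the record layout are all assumptions made for the example.

```python
import hashlib
import io


def sha256_stream(stream, chunk_size=64 * 1024):
    """Hash a binary stream chunk by chunk, without loading it fully into memory."""
    digest = hashlib.sha256()  # assumption: the post does not name the hash function
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def build_notary_record(file_name, stream):
    """Build the (file name, hash) pair a Secretary-like proxy would pass on."""
    return {"file_name": file_name, "sha256": sha256_stream(stream)}


def verify(record, stream):
    """Re-hash stored data and compare it with the notarized hash."""
    return sha256_stream(stream) == record["sha256"]


if __name__ == "__main__":
    segment = b"fake video segment bytes"  # stand-in for an RTSP client output file
    record = build_notary_record("segment-0001.mp4", io.BytesIO(segment))
    print(verify(record, io.BytesIO(segment)))          # True: data unchanged
    print(verify(record, io.BytesIO(segment + b"x")))   # False: data was modified
```

Reading in chunks means the proxy can hash a file while it is being forwarded, instead of buffering the whole object first.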
We wanted the user experience to be as simple as possible. Plugging Reksio into an existing landscape boils down to (a) changing the S3 endpoint of the RTSP client to the one exposed by Reksio, (b) configuring the original target S3 device in Reksio, and (c) choosing the desired DLT for notarization purposes. Simple and effective.
Hardware-based security layer
Besides UX, our solution had a very strong focus on security. We decided to use Infineon Blockchain Security 2 Go smart card hardware wallets for transaction signing.
In the PoC, the card reader was connected to the single-board computer (SBC), which in turn was running the Notary service. With the card on top of the reader, Reksio was a full-blown data notary. With the card removed, the video stream was still pushed to the target S3 storage, but no signed hashes were stored on the chain.
From the get-go, our goal was to create a solution that is not only secure but also visually appealing. We had only two days, but we nevertheless managed to create a simple, clean and intuitive application.
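The card-present versus card-removed behaviour boils down to "always store, notarize only when signing is possible". Below is a rough sketch of that control flow. The `CardNotPresent` exception and the `store` and `sign_and_commit` callbacks are hypothetical placeholders, not the Infineon or Reksio API.

```python
class CardNotPresent(Exception):
    """Raised by a (hypothetical) signer when no smart card is on the reader."""


def handle_segment(file_name, file_hash, store, sign_and_commit):
    """Always store the segment; notarize it only if the card can sign."""
    store(file_name)  # storage must not depend on the card
    try:
        sign_and_commit(file_name, file_hash)
        return "stored+notarized"
    except CardNotPresent:
        return "stored-only"


if __name__ == "__main__":
    stored = []

    def signer_without_card(name, digest):
        raise CardNotPresent()

    print(handle_segment("seg-1.mp4", "abc123", stored.append, lambda n, d: None))
    print(handle_segment("seg-2.mp4", "def456", stored.append, signer_without_card))
    print(stored)  # both segments reach storage either way
```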
During an initial install, the user is guided through a configuration process. The configuration interface is non-intimidating and focused on one simple task at a time.
Once the application is up and running, the list of files and their most relevant metadata are displayed in a well-designed, clean interface.
Challenges we ran into
S3 interface. We underestimated the complexity of the AWS S3 interface. Initially, we wanted to use the Secretary agent to intercept AWS S3 requests in order to (a) forward them 1:1 to AWS S3 for storage and (b) process the file being sent for notarization purposes. This turned out to be impossible due to the authorization mechanism in AWS S3. Due to time limitations, this remains partially unresolved. As of now, we read a client request and proxy it to S3, but we do not proxy the S3 response back to the caller.
Additional network traffic. The incoming data stream is now directed to the Secretary, which in turn sends it to the storage, effectively increasing the number of transmitted packets. However, depending on the backend technologies, this can be optimized in the future, e.g., by using storage hooks and hashes provided by the data persistence solution.
What’s next for Reksio
Solve technical challenges
Right now Reksio is just code and off-the-shelf devices hacked together. On the hardware side, we would like to get from this:
Software-wise there’s also quite a lot of work ahead of us:
Optimize the notarization architecture. Bandwidth, ledger storage cost, and computational complexity could be reduced by improving the system architecture and component integration.
Full chain of trust and hardware security. Digitally signing data at the source using secure elements would provide an end-to-end, unbroken chain of trust between the data source and its persistence. This would make it possible to prove not only that the data in storage has not been altered, but also that no modification was possible during transmission.
Decreased hashing latency. By decreasing the buffering time, we limit the window in which an attacker could modify the data before the hash is calculated.
Lower notarization cost. Lower latency means more hashes to notarize per hour and therefore a higher cost, which can be reduced by using hash trees. The notarization cost for our MVP is calculated to be 0.0103 ETH/h (1.58 €/h) per video stream. This can also be reduced by trading off temporal granularity.
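To illustrate how a hash tree could cut the cost, here is a minimal Merkle root sketch: instead of committing every file hash, a batch of hashes is folded into a single root and only the root goes on chain. The helper names, the use of SHA-256, and the "duplicate the last node" rule for odd levels are illustrative choices, not part of our MVP. The cost helper assumes that cost scales linearly with the number of on-chain commits, which is a simplification.

```python
import hashlib


def merkle_root(leaf_hashes):
    """Fold a list of leaf hashes (bytes) into a single root hash."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf hash")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simple convention: duplicate the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def hourly_cost_eth(base_cost_eth_per_hour, batch_size):
    """Estimated cost when one root is committed per `batch_size` file hashes.

    Assumption: cost is proportional to the number of on-chain commits.
    """
    return base_cost_eth_per_hour / batch_size


if __name__ == "__main__":
    leaves = [hashlib.sha256(f"segment-{i}".encode()).digest() for i in range(10)]
    print(merkle_root(leaves).hex())
    print(hourly_cost_eth(0.0103, batch_size=10))  # MVP figure spread over 10 hashes
```

The trade-off is the temporal granularity mentioned above: a batch of ten 30-second files is only sealed once all ten hashes are available.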
Cross-stream hashing. In order to scale to the exploding number of IoT data sources, lower-volume and lower-velocity data streams will be aggregated and jointly signed.
Data consumption component. Our MVP does not provide an interface to consume the notarized data. Based on future customer experience, we would gradually extend our product to cover the entire life cycle of the collected data, extending the chain of trust to consumption by the end user. The authentication and authorization mechanism could be based on a decentralized identity (DID) solution such as Sovrin.
Support for additional DLTs. By design, our product is DLT agnostic, but support for additional chains is still to be implemented.
Looking back at the event, we tried to pinpoint the key success factors that helped us win all those awards.
Daily rituals. We have an established workflow and a framework that we use for both the development process and team management. We work independently but at the same time as part of a team. We exchange opinions and are able to adapt quickly to unexpected situations.
Practiced routine. At the time of the hackathon, only two coders in our team knew the inner workings of the DLT ecosystem, but we knew exactly how to divide and conquer. Within minutes we delegated the responsibilities and everyone knew what part they had to play. What might have looked like chaos was, in fact, a beautifully executed turquoise concerto.
Mentoring. We love sharing our knowledge and helping junior colleagues mature with us. If you're talented and a quick learner, and you bring some foundations we can build on together, we'll help you become great.
No-diss culture. Don't know how to approach a problem or solve a task? Just ask. Our team embraces asking whenever there's something we don't know. We follow the "there are no stupid questions" rule. There are only things you know and things you don't. Everyone is unique, but together we work as one organism.
A mix of a mature stack and modern tech. We build using battle-tested frameworks, but we like to mix in tools we're only just getting to know. Reksio was built with Java, Python and Solidity on a dockerized landscape. We had to learn a lot about hardware security and image processing, though.
The dust is slowly settling, but the momentum is strong. We're already looking into potential partnerships that can help turn a good MVP into a great product. If you feel like talking about it, drop us a line in the comments.