Moving ahead
When we decided to move into Blockchain with Hubii Network, it was clear to us that we wanted to help grow the Blockchain ecosystem by working with other companies offering Blockchain-based solutions. We see working with partners on different parts of the value chain as a great way for us to contribute to innovation there.
We are just two days away from our ICO, but we are already working full steam on this approach. First up: data storage.
Our data storage environment is heterogeneous: we use MongoDB, Solr, Redis and Hadoop, and for the CDN (Content Delivery Network) we use Rackspace. Testing a Blockchain-based data storage solution seemed like the right first step, so that is what we did.
Because our existing technology stack includes many micro services and years of code running in a production environment, we analysed which components were good candidates for the benchmarking test.
On a daily basis we process approximately 30Gb of raw data through a series of micro services that we call the article pipeline. A key component is our image crawler, which, among other things, processes images and stores them in the CDN.
A CDN solution is paramount to any company that wishes to serve large quantities of static data. Our experience over the years with our CDN provider led us to consider testing Storj’s solution, as we aim to migrate our static data storage immediately post-ICO. We currently serve 50m users through our distribution partnerships and aim to reach 800m daily users within the next 12 months.
Thus, over the last few days we have been running two instances of our image crawler in a production environment to benchmark both solutions: one using Rackspace’s API and one using Storj’s API. We have also run isolated tests in two hosting environments: Hetzner and AWS.
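A minimal sketch of how an isolated timing run like this might look in Node.js. It is not our crawler code: `timeRuns` and `fakeUpload` are hypothetical names, and `fakeUpload` only simulates a transfer with a timer. In a real test it would be replaced by a function that wraps the Rackspace or Storj client call and returns a promise.

```javascript
// Times an async transfer function `runs` times and returns the durations in ms.
async function timeRuns(transfer, runs = 3) {
  const durations = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint(); // monotonic clock, nanoseconds
    await transfer();
    const elapsedNs = process.hrtime.bigint() - start;
    durations.push(Number(elapsedNs / 1000000n)); // whole milliseconds
  }
  return durations;
}

// Hypothetical stand-in for a real upload: resolves after `delayMs`.
function fakeUpload(delayMs) {
  return () => new Promise((resolve) => setTimeout(resolve, delayMs));
}

// Example: three simulated uploads of roughly 50 ms each.
timeRuns(fakeUpload(50)).then((durations) => {
  durations.forEach((ms) => console.log(`uploaded in ${ms}ms`));
});
```

Running the transfers one after another, rather than in parallel, keeps the runs from competing for bandwidth and skewing each other's timings.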
We see strong data points that encourage us to explore this approach further, and we are considering migrating to Storj post-ICO. We feel the solution is more mature than originally expected. We did not identify any major performance degradation over time or with changes in concurrency.
In short, a very promising exercise. We look forward to exploring more blockchain solutions as we roll out our Hubii Network.
The results are as follows (file sizes are in bytes, times in milliseconds):
Rackspace
upload
Hetzner
32892720 uploaded in 7570ms
32892720 uploaded in 6885ms
32892720 uploaded in 16892ms
AWS
32892720 uploaded in 6347ms
32892720 uploaded in 5770ms
32892720 uploaded in 6341ms
download
Hetzner
32892720 downloaded in 3766ms
32892720 downloaded in 4937ms
32892720 downloaded in 4587ms
AWS
32892720 downloaded in 6436ms
32892720 downloaded in 6082ms
32892720 downloaded in 11333ms
Storj
upload
Hetzner
32892720 uploaded in 8399ms
32892720 uploaded in 33399ms
32892720 uploaded in 7853ms
AWS
32892720 uploaded in 8377ms
32892720 uploaded in 8089ms
32892720 uploaded in 9914ms
download
Hetzner
32892720 downloaded in 25331ms
32892720 downloaded in 5405ms
32892720 downloaded in 22785ms
AWS
32892720 downloaded in 21427ms
32892720 downloaded in 18671ms
32892720 downloaded in 20807ms
Varying the Storj upload concurrency:
upload
AWS
32892720 uploaded in 7463ms (concurrency = 1)
32892720 uploaded in 7773ms (concurrency = 6)
32892720 uploaded in 7688ms (concurrency = 12)
We ran uploads and downloads from both Hetzner and AWS, three times each, to get an estimate of the fluctuations. The file size is in bytes.
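To make the fluctuations easier to compare, the figures above can be reduced to a median, a min/max range and a throughput per series. The sketch below does that with the numbers copied from the results; the helper names (`median`, `throughputMBps`, `summarise`) are ours for illustration, and throughput is taken as megabytes (10^6 bytes) per second at the median time.

```javascript
const FILE_BYTES = 32892720;

// Timings in ms, copied from the results listed above.
const runsMs = {
  "Rackspace upload, Hetzner": [7570, 6885, 16892],
  "Rackspace upload, AWS": [6347, 5770, 6341],
  "Rackspace download, Hetzner": [3766, 4937, 4587],
  "Rackspace download, AWS": [6436, 6082, 11333],
  "Storj upload, Hetzner": [8399, 33399, 7853],
  "Storj upload, AWS": [8377, 8089, 9914],
  "Storj download, Hetzner": [25331, 5405, 22785],
  "Storj download, AWS": [21427, 18671, 20807],
};

// Median of a list of numbers (mean of the two middle values if even length).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Throughput in MB/s, with 1 MB = 10^6 bytes.
function throughputMBps(bytes, ms) {
  return bytes / 1e6 / (ms / 1000);
}

function summarise(label, samples) {
  const med = median(samples);
  return {
    label,
    medianMs: med,
    minMs: Math.min(...samples),
    maxMs: Math.max(...samples),
    medianMBps: Number(throughputMBps(FILE_BYTES, med).toFixed(2)),
  };
}

const summary = Object.entries(runsMs).map(([label, samples]) =>
  summarise(label, samples)
);
console.table(summary);
```

With only three samples per series, the median is less sensitive than the mean to a single outlier such as the 33399ms Storj upload from Hetzner.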
Also note the last block of results, where we looked at the impact of Storj upload concurrency. On the fairly small data files that we used, it had no noticeable effect. For the Storj uploads where no concurrency is noted, we used a value of 6.
Furthermore:
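For readers unfamiliar with the setting: we assume here that upload concurrency caps how many pieces of a file are in flight at the same time. The sketch below shows that general pattern in Node.js. It is a generic illustration, not Storj's implementation; `runWithConcurrency` and `fakePiece` are hypothetical names, and the pieces are simulated with timers.

```javascript
// Runs async task functions with at most `limit` of them in flight at once.
// Results are returned in the same order as the tasks.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical stand-in for uploading one piece of a file.
function fakePiece(id, delayMs) {
  return () => new Promise((resolve) => setTimeout(() => resolve(id), delayMs));
}

// Example: 12 simulated pieces, at most 6 uploading at the same time.
const pieces = Array.from({ length: 12 }, (_, i) => fakePiece(i, 10));
runWithConcurrency(pieces, 6).then((ids) => {
  console.log(`finished ${ids.length} pieces`);
});
```

When a file splits into only a few pieces, raising the limit beyond the number of pieces cannot add parallelism, which is one possible reason a small test file shows little difference between settings.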
Storj
- The API documentation at https://storj.readme.io/docs requires more work, specifically the page at https://storj.readme.io/docs/working-with-the-storj-api-bridge.
- Vibrant community at https://storj.io/community.html using a Rocket Chat (Slack-like) interface. Rapid answers, which is paramount for any developer.
- For our purpose we use http://storj.github.io/core/, which we believe was recently renamed storj-lib (at least for the npm packages). Great interface that we found more intuitive than pkgcloud, which we use for Rackspace.
Rackspace
- Well-documented API at https://developer.rackspace.com/ and, more specifically for our use of file storage, at https://developer.rackspace.com/docs/cloud-files/v1/
- Community forum at https://community.rackspace.com/developers/default. It seems fairly good. That said, a significant number of posts unfortunately remain unanswered.
- For our purpose we use https://github.com/pkgcloud/pkgcloud with Node.js. Decent interface that has the benefit of standardising (more or less) across multiple cloud storage providers (Amazon, Azure, Google, HP, OpenStack, Rackspace).