Let’s start by looking at how traditional web and mobile applications interact with storage. Whenever a user logs in to an application, the application fetches the user’s data from a remote storage provider and displays it to the user. All complex computation occurs on dedicated servers maintained in the cloud rather than on the client side; the client machine acts as a dumb terminal.
Here is an example with two fictional characters, Alice and Bob, interacting with a traditional web or mobile application.
Both have a client app, e.g. WhatsApp, Facebook, or Snapchat, and they interact with the provider of that application. These applications are basically running SQL or some other database in the cloud to provide their services to users.
Whenever Alice wants to interact with Bob through a messaging app, Alice sends a message to the service provider, and then the provider delivers it to Bob.
For example, Alice first sends the message “Hi” to the storage provider. Then the storage provider sends it to Bob. There is a path Alice → storage provider → Bob, but no direct path Alice ←→ Bob. This results in centralization: the provider writes data on behalf of Alice and Bob, and it governs how that data is shared. Both Alice and Bob find each other’s messages by querying the centralized server. The provider is always the single source of truth.
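To make the flow concrete, here is a minimal sketch of the centralized pattern. The `CentralProvider` class and its method names are made up for illustration and do not come from any real messaging service. Notice that nothing ties a stored message to Alice's identity: Bob simply has to trust whatever the provider returns.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class CentralProvider:
    """Hypothetical stand-in for a centralized messaging backend."""

    def __init__(self) -> None:
        # The provider holds the only copy of every message.
        self._inboxes: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def send(self, sender: str, recipient: str, text: str) -> None:
        # The provider writes on the sender's behalf; nothing is signed.
        self._inboxes[recipient].append((sender, text))

    def fetch(self, recipient: str) -> List[Tuple[str, str]]:
        # The recipient only learns about messages by querying the provider.
        return list(self._inboxes[recipient])


provider = CentralProvider()
provider.send("alice", "bob", "Hi")
messages = provider.fetch("bob")
```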
Problems with centralized storage
- Reads and writes are not strictly associated with user identity. Bob has no guarantee that the message he received is actually from Alice, or that it has not been tampered with.
- These massive companies are not giving away their services for free. They make money off user data by selling it to advertisers, who can then target potential customers better. In some cases, they are not even doing it legally: a German court ruled just this month that Facebook has been illegally collecting data in breach of consumer law.
- Users cannot choose a different storage provider (only the application provider chooses the storage provider and decides where user data goes).
- Users cannot control who sees their data (storage providers can always look into it).
How Blockstack solves centralized storage issues
To give users control over their data and strictly associate that data with user identity, Blockstack provides a decentralized storage system (Gaia) and the blockchain naming system (BNS). A user logs in to a Blockstack app using the digital identity provided by BNS. User data is strongly coupled to the user’s public key. Applications read and write data to the Gaia hub on behalf of a user, if and only if the user allows it. All user data goes to the user’s Gaia hub. Users can run their own Gaia hub, or they can use the default storage space provided by Blockstack. The default hub stores user data encrypted with the user’s public key, so storage providers only see data blobs.
Introduction to Gaia
Gaia is user-owned storage: the user decides who can read from and write to their storage, and they can change the storage provider any time they want. It is a decentralized, high-performance storage system built on a driver model so that it can support many storage services. With a little work, developers can implement a Gaia storage provider for Dropbox, Azure, or an S3 bucket.
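The driver model is easier to picture with a small sketch. The interface below is an assumption made for illustration, not Gaia's actual driver API: `StorageDriver`, `InMemoryDriver`, `LocalDiskDriver`, and `Hub` are all hypothetical names. The point is that the hub code stays the same while the backend behind it is swapped.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class StorageDriver(ABC):
    """Hypothetical driver interface: one subclass per storage service."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        """Store data under the given path."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the data stored under the given path."""


class InMemoryDriver(StorageDriver):
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self._blobs[path] = data

    def read(self, path: str) -> bytes:
        return self._blobs[path]


class LocalDiskDriver(StorageDriver):
    def __init__(self, root: Path) -> None:
        self._root = root

    def write(self, path: str, data: bytes) -> None:
        target = self._root / path
        # Create the per-user directory on first write.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def read(self, path: str) -> bytes:
        return (self._root / path).read_bytes()


class Hub:
    """The hub only talks to the driver interface, never to a concrete backend."""

    def __init__(self, driver: StorageDriver) -> None:
        self._driver = driver

    def store(self, owner: str, file_name: str, data: bytes) -> None:
        self._driver.write(f"{owner}/{file_name}", data)

    def load(self, owner: str, file_name: str) -> bytes:
        return self._driver.read(f"{owner}/{file_name}")
```

A driver for Dropbox, Azure, or S3 would be one more subclass with the same two methods; `Hub(InMemoryDriver())` and `Hub(LocalDiskDriver(Path(tempfile.mkdtemp())))` behave identically from the application's point of view.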
Gaia vs IPFS: The main difference is that in Gaia users own and control their data, whereas IPFS is an open network where your data is placed on other people’s devices.
How Blockstack applications store data and how users own their data
Let’s say Alice is now using a Blockstack messaging app. She interacts with her own Gaia service, which is coupled with her own public key. Bob has the same application. For Alice and Bob to communicate with each other, there has to be a read/write path between Alice’s Gaia service and Bob’s Gaia service.
The question arises: how do Blockstack applications interact with Gaia storage, and how does Gaia give the user total control?
Both the user and the storage backend have defined URLs, and a Blockstack application has defined URL paths to user storage. It reads from and writes to different storage providers depending on the user. This lookup path is what ultimately allows users to control where their data is stored.
How do applications in Blockstack perform a lookup into Gaia storage?
It is a four-step process:
- Look up the name in the virtual chain to get a (name, hash) pair.
- Resolve the username to data (controlled via BNS and the Atlas network) to get the respective zone file.
- Discover the storage backend URI from the zone file and look up the URI to connect to the storage backend.
- Fetch the data as defined by the Gaia service specification.
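The steps above can be sketched with plain dictionaries standing in for the virtual chain, the Atlas network, and the storage backend. Everything here is an assumption made for illustration: real zone files are not JSON, the hub URL and `abc123` are invented, and SHA-256 is used only because it ships with Python.

```python
import hashlib
import json
from typing import Dict

# Hypothetical zone file for sidra.id (JSON keeps the sketch short).
ZONE_FILE = json.dumps(
    {"name": "sidra.id", "storage_url": "https://hub.example.com/store/abc123/"}
)
ZONE_FILE_HASH = hashlib.sha256(ZONE_FILE.encode()).hexdigest()

# Stand-in for the virtual chain: name -> zone file hash.
NAME_INDEX: Dict[str, str] = {"sidra.id": ZONE_FILE_HASH}
# Stand-in for the Atlas network: zone file hash -> zone file.
ZONE_FILES: Dict[str, str] = {ZONE_FILE_HASH: ZONE_FILE}
# Stand-in for the storage backend: URL -> stored bytes.
STORAGE: Dict[str, bytes] = {
    "https://hub.example.com/store/abc123/foo.json": b'{"greeting": "Hi"}'
}


def lookup(name: str, file_name: str) -> bytes:
    # Step 1: look up the name to get its zone file hash.
    zone_file_hash = NAME_INDEX[name]
    # Step 2: resolve the hash to the zone file itself.
    zone_file = json.loads(ZONE_FILES[zone_file_hash])
    # Step 3: discover the storage backend URI from the zone file.
    file_url = zone_file["storage_url"] + file_name
    # Step 4: fetch the data from the storage backend.
    return STORAGE[file_url]
```

Calling `lookup("sidra.id", "foo.json")` walks all four steps and returns the stored bytes. In a real application the last step would be an ordinary HTTP fetch rather than a dictionary access.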
The application resolves a given username to some data. Let’s say we have a user sidra.id. The application uses BNS, through the blockstack.js library, and the Atlas network to get the root file (zone file), which defines a lot of information about the name. The zone file also provides a URL where the application’s data is stored.
Once the application has looked up the application root file, it can get more specific data. Let’s say I want to look up the file foo.json. Then the only requirement is to do a normal URL fetch. This final step is what is defined in the Gaia specification.
How a user can change their Gaia provider
In this system, users who want to change the Gaia provider they are using can do so. Since users own their usernames, they can easily associate different data with their username, which allows them to pick a different application route. This ultimately changes how applications perform these lookups. Lookups define control of the data: as long as users control the lookups, they control their data. By requiring an application to perform this multi-step lookup, we put control in the user’s hands, because the lookup starts with a user-owned data source.
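Here is a minimal sketch of why owning the lookup means owning the data. The `registry` dictionary stands in for the user-owned name record, and the hub URLs are invented for illustration. The application code never changes; only the record the user controls does.

```python
from typing import Dict

# Hypothetical user-owned record: username -> base URL of the current storage.
registry: Dict[str, str] = {
    "sidra.id": "https://hub-a.example.com/store/abc123/"
}


def resolve_file_url(username: str, file_name: str) -> str:
    # Applications always start from the user-owned record.
    return registry[username] + file_name


def change_provider(username: str, new_base_url: str) -> None:
    # Only the owner of the name is supposed to be able to do this.
    registry[username] = new_base_url


before = resolve_file_url("sidra.id", "foo.json")
change_provider("sidra.id", "https://hub-b.example.com/store/abc123/")
after = resolve_file_url("sidra.id", "foo.json")
```

After the change, every application that performs the lookup lands on the new provider without being modified.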
Since Gaia is a storage backend, it provides a simple interface. An application can read from it and write to it just as it would with ordinary POST, GET, and PUT requests. The paths are defined as follows:
- PUT /store/<public-key-hash>/<file-name>: used by applications to write data to the user’s Gaia service on the user’s behalf.
- GET /store/<public-key-hash>/<file-name>: reads a file from the user identified by the public key hash.
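As a rough sketch of what these requests look like from the application side, the helpers below build (but do not send) the two requests using Python's standard library. The hub address and `abc123` are placeholders, and the function names are made up for this example.

```python
from urllib.parse import quote
from urllib.request import Request


def store_url(hub_base: str, public_key_hash: str, file_name: str) -> str:
    # Follows the /store/<public-key-hash>/<file-name> layout described above.
    return (
        f"{hub_base.rstrip('/')}/store/"
        f"{quote(public_key_hash, safe='')}/{quote(file_name)}"
    )


def build_write_request(hub_base: str, public_key_hash: str,
                        file_name: str, data: bytes) -> Request:
    # A PUT request carrying the file contents as the body.
    return Request(store_url(hub_base, public_key_hash, file_name),
                   data=data, method="PUT")


def build_read_request(hub_base: str, public_key_hash: str,
                       file_name: str) -> Request:
    # A plain GET request for the same path.
    return Request(store_url(hub_base, public_key_hash, file_name),
                   method="GET")
```

Passing either request to `urllib.request.urlopen` would actually send it; that step is left out because the hub here is only a placeholder.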
Writing to a Gaia hub
Blockstack applications write on behalf of users. For example, an app calls PUT with some data on the Gaia service and provides an authentication header, which is essentially a public-key signature over a challenge text. The Gaia service then verifies that this application is actually authorized to write data to the user’s Gaia hub.
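The challenge-and-verify flow can be sketched as follows. One big caveat: a real hub checks a public-key signature, but Python's standard library has no public-key signing, so this sketch swaps in an HMAC with a shared secret purely to show the shape of the exchange. `HubSketch` and `sign_challenge` are hypothetical names.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def sign_challenge(key: bytes, challenge: str) -> str:
    # Stand-in for a public-key signature over the challenge text.
    return hmac.new(key, challenge.encode(), hashlib.sha256).hexdigest()


class HubSketch:
    def __init__(self, user_keys: Dict[str, bytes]) -> None:
        # Assumption: the hub knows one secret per public-key hash.
        self._keys = user_keys
        self.challenge = secrets.token_hex(16)
        self._files: Dict[Tuple[str, str], bytes] = {}

    def put(self, key_hash: str, file_name: str,
            data: bytes, auth_header: str) -> bool:
        key = self._keys.get(key_hash)
        if key is None:
            return False  # unknown user
        expected = sign_challenge(key, self.challenge)
        if not hmac.compare_digest(expected, auth_header):
            return False  # the caller could not answer the challenge
        self._files[(key_hash, file_name)] = data
        return True
```

An app holding the right key computes `sign_challenge(key, hub.challenge)` and passes it as the authentication header; any other value makes `put` return `False` and nothing is stored.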
Reading from a Gaia hub
- Fetch the zone file and the data.
- Verify that the zone file hash matches the user’s public key.
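The verification step might look something like this. The sketch reads the check as "the fetched zone file must hash to the value already recorded for the user", which is an interpretation rather than something spelled out above, and it uses SHA-256 only because it is available in Python's standard library.

```python
import hashlib


def zone_file_hash(zone_file: bytes) -> str:
    return hashlib.sha256(zone_file).hexdigest()


def verify_zone_file(zone_file: bytes, expected_hash: str) -> bool:
    # Reject the zone file if it does not hash to the recorded value.
    return zone_file_hash(zone_file) == expected_hash


def read_verified(zone_file: bytes, expected_hash: str, data: bytes) -> bytes:
    # Only hand the data to the application once the zone file checks out.
    if not verify_zone_file(zone_file, expected_hash):
        raise ValueError("zone file does not match the recorded hash")
    return data
```

With a check like this, a storage provider that serves a modified zone file gets caught before the application trusts the storage URL inside it.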
Gaia is a storage backend for Blockstack applications. It gives users the ability to own their data; however, the stored data still depends on traditional DNS services and sophisticated cloud storage services. It also puts the computational load on the user, which most user devices can’t handle. Despite the catchy slogans about a decentralized internet, user privacy has a long way to go. Most users are not ready for this kind of change. To succeed, decentralized applications need to be user-friendly and require less work from their users.