Discovering GeoDB 6. Interconnection
“It isn’t that they can’t see the solution. It is that they can’t see the problem”
G.K. Chesterton, 1935 
We’re back for another week with a new Discovering GeoDB blog post, the sixth one already. We’re reaching the final stage of this trip, and we’ve left the best for last. Before we start talking about our interconnection model, let’s recall what we’ve covered in our previous posts:
- In , The power of place, we’ve analyzed the great value of private location data.
- In , Game theory, we’ve reviewed how in a free and competitive market, the price will be dictated by both sellers and buyers.
- In , Blockchain 101, we’ve summarized the pillars of blockchain technology and we’ve indicated how we aspire to use it to work with private location data.
- In , Modular Blockchain Architectures, we’ve explained our concept of a modular blockchain architecture and why we believe that the interconnection between blockchains and therefore, this type of architecture, will be the predominant in the coming years.
- In , Measuring size and cost, we’ve shown the results of some studies that we’ve carried out about the estimated size of our big data and about the cost of storing this information using blockchain technology.
In our last post we talked about a hybrid approach in which we propose to store certain information in our own blockchain infrastructure and other information in a public blockchain, in order to minimize costs and maximize security. In our opinion, this scheme offers an optimal solution for several key aspects of our domain, such as storage cost, scalability, security and immutability, among many others.
We cannot ignore that there are currently other proposals which, using blockchain technology, aim to reward users for providing their private information, each of them following a different architecture. Therefore, to clearly understand the motivation behind our architectural design, it’s worth reflecting on the domain in which GeoDB is defined.
GeoDB is a proposal conceived for the commercialization of private locations under a big data paradigm. The big data market has an economic value of billions of dollars ($125 billion in 2015), but we must understand that the big data paradigm is not related to the individualized sale of user data. Our architecture has been conceived to:
- Make available the private location information of millions of users. A big data query is not about the behavior of an individual, but of a large number of them.
- Guarantee the integrity and immutability of terabytes of information. Do you know the size of a public blockchain? In the case of Bitcoin, its current size after eight years is 169.12 GB. The amount of location information generated by only 10,000,000 users on a daily basis is almost the same.
- Resolve complex queries in this volume of information in order to obtain relevant information.
Due to the above, we believe that the only way to build GeoDB today is under a hybrid architecture. We must be clear that a hybrid solution is not a magic solution, since it’s well known that interconnecting two isolated components can be even more complex than creating the components themselves. In addition, our scenario has a further problem: we need to write additional information in a public blockchain, and this is very expensive.
The question at this point is: is it possible to do this at a reasonable cost? Following a traditional approach, it’s not, but we should emphasize the word traditional. To paraphrase G.K. Chesterton, maybe it isn’t that we can’t see the solution, it’s that we can’t see the real problem.
Our technological stack
Broadly speaking, we propose a hybrid architecture based on blockchain technologies in which we will use:
- An ERC-20 token to manage the economic value of the locations.
- Open source blockchain technology for our infrastructure.
Why an ERC-20 token?
ERC-20 is the de facto standard for the definition of tokens. It’s a type of token defined on the Ethereum blockchain, and there is nothing special behind its name: ERC stands for Ethereum Request for Comments, and 20 is the number that was assigned to this request.
Defining a token using this standard is a guarantee for us and for the users of GeoDB, due to the wide use of Ethereum and the existence of multiple popular services adapted to use it.
Nobody should be surprised by the fact that today, more than 100,000 tokens are defined as ERC-20 tokens.
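To make the idea of a token standard more concrete, here is a minimal sketch of the bookkeeping that an ERC-20 token performs. It is a toy in-memory model written in Python, not a smart contract and not GeoDB code: the method names are Python-style adaptations of the standard's `balanceOf`, `transfer`, `approve`, `allowance` and `transferFrom` functions, and the class name `Erc20Ledger` is an assumption made only for illustration.

```python
class Erc20Ledger:
    """Toy in-memory model of ERC-20 balance and allowance bookkeeping."""

    def __init__(self, initial_holder: str, total_supply: int) -> None:
        self.total_supply = total_supply
        self.balances = {initial_holder: total_supply}
        self.allowances = {}  # (owner, spender) -> remaining allowance

    def balance_of(self, owner: str) -> int:
        return self.balances.get(owner, 0)

    def transfer(self, sender: str, to: str, value: int) -> None:
        # A transfer only succeeds if the sender holds enough tokens.
        if value < 0 or self.balance_of(sender) < value:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balance_of(sender) - value
        self.balances[to] = self.balance_of(to) + value

    def approve(self, owner: str, spender: str, value: int) -> None:
        # The owner authorizes the spender to move up to `value` tokens.
        self.allowances[(owner, spender)] = value

    def allowance(self, owner: str, spender: str) -> int:
        return self.allowances.get((owner, spender), 0)

    def transfer_from(self, spender: str, owner: str, to: str, value: int) -> None:
        # The spender moves the owner's tokens, limited by the allowance.
        if self.allowance(owner, spender) < value:
            raise ValueError("allowance exceeded")
        self.transfer(owner, to, value)
        self.allowances[(owner, spender)] -= value
```

Because every ERC-20 token exposes this same small set of operations, wallets and exchanges can support a new token without custom integration work, which is the guarantee mentioned above.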
Why open source blockchain technology?
We believe that it is unnecessary to reinvent the wheel. There are proven technological solutions with which we can provide many of the elements that are necessary for the infrastructure that we want to build.
Currently, CoinMarketCap lists 838 coins associated with full blockchain implementations and, you know what? Almost all of them are open source. But open source blockchain solutions do not end there. More and more companies are promoting open source projects for the development of blockchain frameworks with which it is possible to deploy adapted blockchain solutions. Have you heard about Corda, HyperLedger, BigChainDB, OpenChain or MultiChain? If you don’t know any of them, be prepared to experience the Baader-Meinhof phenomenon.
Deploying a smart contract to define an ERC-20 token is extremely simple. Deploying a smart contract without security problems is a bit more complicated. Before using smart contracts in Ethereum, it’s necessary to consider two critical points:
- Writing in Ethereum is, and always will be, expensive. The storage space in a blockchain like this should be considered a precious resource, so high prices are necessary to make sure it isn’t wasted.
- The code is law. Once a smart contract has been deployed, it cannot be modified. You can only change the behavior of a deployed smart contract if you developed it to allow the desired change. Obviously, any poorly designed mechanism opens a security gap that third parties can exploit, and no one can stop them.
One of the main characteristics of Ethereum smart contracts is that they do not allow the execution of non-deterministic code. Among many other things, this implies that a contract can only use information that already exists in Ethereum when it is executed.
Taking the above into consideration and focusing on our proposal, the scenario is as follows:
- We use an ERC-20 token that makes it possible to fairly reward the participants.
- The rewards are assigned based on the participation in the GeoDB big data ledger.
- To assign the rewards in Ethereum, the necessary information must be transferred from the GeoDB ledger to the Ethereum blockchain. In doing so, we must consider that:
  - Ethereum storage is expensive, so the amount of data to write must be minimal.
  - The transaction that triggers the reward assignment must be secure. Otherwise, anyone (users, nodes or any others) could request a reward without deserving it.
How can we combine all this? Our approach is to use a paradigm that we call request-approval-justice.
First, a smart contract will be deployed in Ethereum to define an ERC-20 token. This contract will be designed to allow:
- Setting the rules that regulate the supply.
- Assigning tokens to specific addresses.
- Reclaiming tokens for a given address.
Another smart contract will be deployed in the GeoDB big data ledger to keep account of the tokens that each participant can claim based on their participation. Under this paradigm, each address will have two token balances: i) tokens in Ethereum, which can be transferred, and ii) claimable tokens in the GeoDB big data ledger, which cannot be moved but can be claimed so that they are assigned to the address in Ethereum.
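The two balances per address can be pictured with a small sketch. This is only an illustrative Python model under simplifying assumptions: the names `ParticipantBalances`, `TwoLedgerModel`, `reward_participation`, `claim` and `transfer` are hypothetical, both ledgers are collapsed into one object, and a claim is applied instantly, without the verification and waiting steps that the real process requires.

```python
from dataclasses import dataclass


@dataclass
class ParticipantBalances:
    transferable: int = 0  # tokens held in Ethereum, free to move
    claimable: int = 0     # tokens earned in the GeoDB ledger, not movable


class TwoLedgerModel:
    """Hypothetical model of the two balances kept for each address."""

    def __init__(self) -> None:
        self.accounts = {}

    def _account(self, address: str) -> ParticipantBalances:
        return self.accounts.setdefault(address, ParticipantBalances())

    def reward_participation(self, address: str, amount: int) -> None:
        # Participation in the big data ledger only increases the claimable balance.
        self._account(address).claimable += amount

    def claim(self, address: str, amount: int) -> None:
        # Claiming converts claimable tokens into transferable ones.
        account = self._account(address)
        if amount <= 0 or amount > account.claimable:
            raise ValueError("nothing to claim for that amount")
        account.claimable -= amount
        account.transferable += amount

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        # Only the transferable (Ethereum) balance can be moved.
        source = self._account(sender)
        if amount <= 0 or amount > source.transferable:
            raise ValueError("insufficient transferable balance")
        source.transferable -= amount
        self._account(recipient).transferable += amount
```

The point of the separation is that frequent, small rewards are recorded where writing is cheap, while the expensive ledger is only touched when a participant decides to claim.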
Additionally, there will be a set of nodes that, using a PoS (proof-of-stake) mechanism to guarantee their correct behavior, will write the resumes (that is, short summaries) of the blocks into the Ethereum blockchain once the blocks have reached a given depth. These resumes will be used as references for the execution of Ethereum smart contracts, as we will explain later.
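As an illustration of what such a block resume could look like, the sketch below treats it as a fixed-size cryptographic digest of the block contents. This is an assumption: the post does not specify the resume format, and the use of SHA-256 over canonical JSON, the block fields and the function names `block_resume` and `is_deep_enough` are all hypothetical.

```python
import hashlib
import json


def block_resume(block: dict) -> str:
    """Return a 32-byte digest (hex encoded) that summarizes a block.

    Assumption: the block is serialized as canonical JSON so that the same
    content always yields the same digest.
    """
    encoded = json.dumps(block, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def is_deep_enough(block_height: int, chain_height: int, required_depth: int) -> bool:
    """A block is only summarized once enough later blocks have been built on it."""
    return chain_height - block_height >= required_depth


# Hypothetical block with made-up fields.
block = {"height": 10, "transactions": ["tx-a", "tx-b"], "previous": "00ab"}
if is_deep_enough(block["height"], chain_height=16, required_depth=6):
    print(block_resume(block))
```

Whatever the final format, the useful property is that the amount of data written to Ethereum stays constant regardless of how much location data a block contains.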
Storing a resume in Ethereum currently costs slightly less than half a dollar. If a block is created every five minutes, 288 blocks are created daily, so storing the resumes of all the blocks would cost about $144 per day, or $52,560 per year. However, storing every resume is neither strictly necessary nor any safer. For example, storing one resume every hour would cost $12 per day, or $4,380 per year.
As we’ll see, to apply ‘justice’ the tokens are not rewarded immediately, so it would be enough to save only one resume every several days without any problem.
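The cost estimate is simple arithmetic: 288 resumes × $0.50 = $144 per day, and $144 × 365 = $52,560 per year. The short Python sketch below reproduces it for any storage interval. The $0.50 figure is used as an upper bound for "slightly less than half a dollar", and the function names are assumptions made for illustration.

```python
COST_PER_RESUME_USD = 0.50  # upper bound taken from the estimate above
MINUTES_PER_DAY = 24 * 60


def resumes_per_day(interval_minutes: float) -> float:
    """Number of resumes written per day when one is stored every `interval_minutes`."""
    return MINUTES_PER_DAY / interval_minutes


def daily_cost_usd(interval_minutes: float, cost_per_resume: float = COST_PER_RESUME_USD) -> float:
    return resumes_per_day(interval_minutes) * cost_per_resume


def yearly_cost_usd(interval_minutes: float, cost_per_resume: float = COST_PER_RESUME_USD) -> float:
    return daily_cost_usd(interval_minutes, cost_per_resume) * 365


scenarios = [
    ("every block (5 minutes)", 5),
    ("every hour", 60),
    ("every 3 days", 3 * MINUTES_PER_DAY),
]
for label, interval in scenarios:
    print(f"{label}: ${daily_cost_usd(interval):,.2f}/day, ${yearly_cost_usd(interval):,.2f}/year")
```

The last scenario shows why the delay introduced by the justice phase matters: with one resume every few days, the yearly cost drops to a few tens of dollars.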
So far we have:
- A smart contract in Ethereum to assign and move tokens.
- A smart contract in the GeoDB big data ledger to assign rewards and reclaim them.
- A set of nodes using a PoS mechanism to transfer the resumes of the GeoDB big data ledger blocks to Ethereum.
With this infrastructure in place, the first phase, Request, can be executed. In a request, a user claims the corresponding tokens in Ethereum through the GeoDB ledger. A node acting as a notary verifies that everything is correct and, if so, generates a signed transaction in the GeoDB ledger that proves it. When the next resume of the GeoDB ledger is transferred to Ethereum, the node generates a new transaction in Ethereum specifying:
- The amount of tokens to be transferred to a given address.
- The hash of the transaction in the GeoDB ledger that proves that those tokens must be transferred to that address.
- The first GeoDB block resume stored in the Ethereum network after that transaction was generated.
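The three fields of this Ethereum transaction can be sketched as a small record. The Python below is a hypothetical illustration, not the actual transaction format: `RewardOrder`, `first_resume_after` and `build_reward_order` are invented names, and identifying a resume by the height of the GeoDB block it summarizes is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RewardOrder:
    amount: int            # tokens to transfer
    recipient: str         # Ethereum address that receives them
    proof_tx_hash: str     # hash of the notary's verification transaction in GeoDB
    resume_height: int     # GeoDB block height of the referenced resume


def first_resume_after(proof_block_height: int, anchored_heights: List[int]) -> Optional[int]:
    """Return the first anchored resume at or after the block holding the proof."""
    candidates = [h for h in anchored_heights if h >= proof_block_height]
    return min(candidates) if candidates else None


def build_reward_order(amount: int, recipient: str, proof_tx_hash: str,
                       proof_block_height: int, anchored_heights: List[int]) -> RewardOrder:
    resume_height = first_resume_after(proof_block_height, anchored_heights)
    if resume_height is None:
        # The order has to wait until a later resume reaches Ethereum.
        raise ValueError("no resume covering the proof has been stored yet")
    return RewardOrder(amount, recipient, proof_tx_hash, resume_height)


# Hypothetical values: resumes were stored for GeoDB blocks 100, 112 and 124.
order = build_reward_order(50, "0xRecipient", "proof-hash", 105, [100, 112, 124])
print(order.resume_height)
```

Note that the order carries only an amount, an address and two references, which keeps the data written to Ethereum minimal while still pointing to everything needed to audit it later.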
Approval is not a phase but a state in the process. While a request involves the creation of several transactions (two in the GeoDB ledger, the claim and the verification, and two in Ethereum, the block’s resume and the reward order), an approved transaction is simply a reward order made from an account with sufficient balance for the justice phase.
To apply justice we follow the presumption of innocence, i.e., everyone is innocent until proven guilty. Under this approach, every approved order transaction is automatically considered valid once the period in which a possible crime can be prosecuted has expired. The appropriate length of this period will be analyzed and fixed before the launch of our mainnet.
During the justice phase, anyone can check whether an approved transaction is legitimate. To do so, it’s enough to check whether the verification transaction, which is signed by the node that acted as notary, is correct. Proof of an illegitimate order transfers the notary’s deposit to the accuser and blocks the assignment of the tokens.
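The justice phase behaves like a challenge window, which the following Python sketch tries to capture. Everything here is a simplified assumption: the window length is a placeholder (the real value is still to be fixed), the names `ApprovedOrder`, `JusticeCourt`, `challenge` and `settle` are hypothetical, the deposit is assumed to be held in the same token, and checking the notary's signature is reduced to a boolean supplied by the accuser.

```python
from dataclasses import dataclass

CHALLENGE_WINDOW = 7 * 24 * 3600  # placeholder length in seconds, not a real parameter


@dataclass
class ApprovedOrder:
    recipient: str
    amount: int
    notary: str
    approved_at: int
    status: str = "approved"  # approved -> settled | blocked


class JusticeCourt:
    """Hypothetical model of the justice phase for approved reward orders."""

    def __init__(self, deposits: dict) -> None:
        self.deposits = dict(deposits)  # notary -> staked deposit
        self.token_balances = {}

    def _credit(self, address: str, amount: int) -> None:
        self.token_balances[address] = self.token_balances.get(address, 0) + amount

    def challenge(self, order: ApprovedOrder, accuser: str,
                  verification_is_valid: bool, now: int) -> bool:
        # Challenges are only accepted while the order is still pending.
        if order.status != "approved" or now >= order.approved_at + CHALLENGE_WINDOW:
            return False
        if verification_is_valid:
            return False  # the notary acted correctly, nothing happens
        # Proven guilty: the deposit goes to the accuser and the order is blocked.
        self._credit(accuser, self.deposits.get(order.notary, 0))
        self.deposits[order.notary] = 0
        order.status = "blocked"
        return True

    def settle(self, order: ApprovedOrder, now: int) -> bool:
        # Presumption of innocence: an unchallenged order becomes valid with time.
        if order.status != "approved" or now < order.approved_at + CHALLENGE_WINDOW:
            return False
        self._credit(order.recipient, order.amount)
        order.status = "settled"
        return True
```

In this model nothing has to be verified on Ethereum in the common, honest case. The deposit gives notaries a reason to behave, and it gives everyone else a reason to audit them.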
We still have to carry out several experiments to refine and optimize our paradigm, but in general terms we can see how it allows us to minimize the interconnection costs. As we said at the beginning, sometimes finding the solution only requires rethinking the problem. In our case, the problem is not how to communicate, but what is communicated and who does the communicating.
In our next post we’ll present our ERC-20 token, the GEO.