Is RChain Entering the Database Track?

Darryl Neudorf
Published in RChain-China
Nov 13, 2020

August 11, 2020 — Originally published in Chinese.

Author: 愁虫 (Dimworm)

https://zhuanlan.zhihu.com/p/157127721

In 1970, the mathematician Edgar Frank Codd published his paper “A Relational Model of Data for Large Shared Data Banks,” which introduced the relational model. The relational model decoupled queries from data storage, so programmers could focus on query statements without caring how the data was stored, which greatly boosted their productivity. Over the following decades, relational databases and the corresponding SQL language ruled the era of stand-alone databases.

The Internet boom around 2000 was a transformative force for database technology. To cope with exponential growth in the number of users, the Internet giants had to “shatter” their databases and store data across multiple hosts. Sharding is a workable answer to the scalability problem. Explained in terms of the well-known impossible triangle, the CAP principle (Consistency, Availability, Partition Tolerance), it means giving up consistency in order to solve the scalability problem. Against this background NoSQL became a trend: it gave up the relational model, transactions, indexes, joins, and data consistency, all to solve the scalability problem.

NoSQL databases fall into four categories: key-value databases, column-family databases, document databases, and graph databases. Take the key-value database as an example: it can be viewed as a table of keys and values, and it suits write-heavy workloads. Key-value databases are inherently scalable and can, in theory, be scaled out horizontally without limit.
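To make the idea of a horizontally scalable key-value table concrete, here is a minimal sketch of hash-based sharding. It is an illustration only: the `ShardedKV` class and its methods are hypothetical names, and each shard is a plain dictionary standing in for a separate host.

```python
import hashlib


class ShardedKV:
    """Toy key-value store that spreads keys across several shards."""

    def __init__(self, num_shards):
        # Each shard is a plain dict standing in for a separate host.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash of the key picks the shard, so no lookup table is needed.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self._shard_for(key)].get(key, default)


store = ShardedKV(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))
print([len(shard) for shard in store.shards])  # keys per shard
```

Because every key is routed independently, there is no cross-shard coordination, which is also why joins and multi-key transactions are hard to offer in this model. Plain modulo routing reshuffles most keys when the shard count changes, so real systems tend to use schemes such as consistent hashing instead.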

Since then, database technology has offered two options: choose a SQL database and give up scalability, or choose a NoSQL database and give up transactions and strong consistency. Around 2010, people grew tired of the SQL-or-NoSQL dilemma and began to ask: can't we have both the scalability of NoSQL and the transactions and strong consistency of SQL? We can! A new trend, NewSQL, emerged. NewSQL refers to SQL databases that can run in a distributed environment, so it can also be called distributed SQL.

NewSQL wants to pick up everything that NoSQL gave up. SQL is a good thing, I want it! Sharding for horizontal scalability, I want it! Data synchronization based on the Raft or Paxos consensus protocols, I want it! Transaction atomicity based on read-write locks or MVCC, I want it!

Now that NewSQL has all these weapons at its disposal, have you noticed that it looks more and more like a blockchain? Blockchains, which focus on security, and databases, which focus on high performance, are becoming more and more alike, and the two tracks are drawing closer together. Seen from the other side, many blockchain projects are pushing in the same direction. BigchainDB, for example, aims to be a decentralized database; it uses the Tendermint consensus protocol and integrates a set of MongoDB instances for storage. Catena embeds a SQL database into a PoW blockchain so that you can run transactions on the blockchain with SQL statements. But these attempts are rudimentary, and mechanically bolting the pieces together cannot deliver a substantial performance breakthrough.

Now it is time for RChain to change the stereotype of blockchain. People will find that a blockchain can be a better database. What sets RChain apart from other blockchains is that, instead of following the conventional blockchain architecture, it starts from a completely new data storage architecture and redesigns the blockchain around it. RChain can be understood as an RSpace storage layer + a Rholang contract layer + a Comms network layer + a CBC Casper consensus layer, which together form a distributed, fault-tolerant data storage scheme.

  1. RSpace Storage Layer

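RSpace comes from a distinguished lineage: it is a Scala implementation based on Tuplespace. Tuplespace was proposed by Professor Gelernter of Yale University and belongs to the shared-memory paradigm of distributed computing. A Tuplespace can be pictured as a table that stores key-value data. When multiple processes communicate asynchronously through a Tuplespace, they do not need to pass messages to one another, or even know of one another's existence.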
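The shared-space idea can be sketched in a few lines. This is a toy, single-process illustration of a Linda-style tuple space and not the RSpace API: the class and method names are hypothetical, `None` in a pattern is assumed to act as a wildcard, and `take` returns `None` where a real tuple space would block until a match arrives.

```python
class TupleSpace:
    """Toy Linda-style tuple space; None in a pattern is a wildcard."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        # Producer: deposit a tuple without naming any recipient.
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(pattern, candidate):
        return len(pattern) == len(candidate) and all(
            p is None or p == c for p, c in zip(pattern, candidate)
        )

    def rd(self, *pattern):
        # Read the first matching tuple and leave it in the space.
        for candidate in self._tuples:
            if self._matches(pattern, candidate):
                return candidate
        return None

    def take(self, *pattern):
        # Remove and return the first matching tuple (Linda calls this `in`).
        for index, candidate in enumerate(self._tuples):
            if self._matches(pattern, candidate):
                return self._tuples.pop(index)
        return None


space = TupleSpace()
space.out("temperature", "room-1", 21.5)  # written by one process
space.out("temperature", "room-2", 19.0)

# Another process asks by content, not by sender.
reading = space.take("temperature", "room-2", None)
print(reading)
print(space.rd("temperature", None, None))
```

The writer and the reader never refer to each other; they only agree on the shape of the data, which is the decoupling the paragraph above describes.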

  2. Rholang Contract Layer

Rholang is the smart contract language on RChain, and it too has a distinguished pedigree. The story starts with the pi calculus proposed by Turing Award winner Robin Milner. The pi calculus holds the same position in distributed computing that the lambda calculus holds in sequential computing. The lambda calculus treats everything as a function, whereas the pi calculus treats everything as a process. In the lambda calculus, computation is functions calling one another; in the pi calculus, computation is processes communicating with one another over channels.
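The contrast between the two views can be sketched with ordinary Python. This is only an analogy, not Rholang: a thread stands in for a process and `queue.Queue` stands in for a channel, and the names `square`, `squarer`, `requests`, and `replies` are made up for the example.

```python
import queue
import threading


def square(x):
    # Function view: the caller invokes the function and gets a value back.
    return x * x


def squarer(inbox, outbox):
    # Process view: wait for a message on one channel, reply on another.
    value = inbox.get()
    outbox.put(value * value)


# Function call: computation is application.
direct = square(7)

# Channel communication: computation is message exchange between processes.
requests = queue.Queue()
replies = queue.Queue()
worker = threading.Thread(target=squarer, args=(requests, replies))
worker.start()
requests.put(7)
via_channel = replies.get()
worker.join()

print(direct, via_channel)
```

Both routes compute the same value, but in the second the two parties share nothing except the channels they communicate over.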

Greg, the founder of RChain, added reflection to the pi calculus and proposed the rho calculus, which is the cornerstone of the Rholang language. The power of the rho calculus is that it can turn code into data and data into code, so code can be passed back and forth between processes.

Rholang's built-in pattern matching and behavioral type system make it a very imaginative query language, and Rholang's semantics map directly onto the storage semantics of RSpace. This is the core meaning of CBC: correct by construction, which rules out the possibility of SQL injection attacks.

SQL also supports concurrency, but SQL concurrency is hard to control: apart from locking, there is little you can do. Rholang, by contrast, provides concurrent execution at the semantic level. Another similarity between Rholang and SQL is that Rholang's comm rule supports a transaction mechanism. A transaction treats multiple statements as a single unit of work: if the transaction fails, no changes are made to the data. Transactions have the ACID properties (Atomicity, Consistency, Isolation, Durability).
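The all-or-nothing behavior described here can be illustrated with a small sketch. It is not Rholang and makes no claim about how the comm rule is implemented; it assumes, purely for illustration, a receive that waits on several channels at once and only fires when every channel has a message. The `Channels` class and its methods are hypothetical, and only atomicity is shown, not isolation or durability.

```python
from collections import Counter, defaultdict, deque


class Channels:
    """Toy store of named channels, each holding a queue of messages."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, channel, message):
        self._queues[channel].append(message)

    def pending(self, channel):
        return len(self._queues[channel])

    def receive_all(self, *channels):
        # All-or-nothing: verify every channel first, consume only afterwards.
        needed = Counter(channels)
        if any(len(self._queues[name]) < count for name, count in needed.items()):
            return None  # nothing was consumed, the store is unchanged
        return tuple(self._queues[name].popleft() for name in channels)


bank = Channels()
bank.send("debit", ("alice", 10))

# The credit side is missing, so the combined receive does not fire.
first_try = bank.receive_all("debit", "credit")
print(first_try, bank.pending("debit"))

bank.send("credit", ("bob", 10))
second_try = bank.receive_all("debit", "credit")
print(second_try, bank.pending("debit"))
```

The failed attempt leaves the `debit` message in place, which is the property the paragraph calls atomicity: either every part of the operation takes effect or none does.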

  3. Comms Network Layer + CBC Casper Consensus Layer

RChain's Comms network layer uses gRPC as its transport and the Kademlia protocol for node discovery. RChain's consensus layer is the well-known CBC Casper, a Byzantine fault-tolerant (BFT) protocol that can cope with malicious nodes. Raft and the other protocols used by NewSQL are crash fault-tolerant (CFT) protocols: they can only handle errors caused by equipment or network failures and are powerless against malicious nodes.
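The cost of tolerating malicious nodes rather than mere crashes shows up in the classical textbook bounds, sketched below. These figures are the standard results for majority-quorum protocols such as Raft (n ≥ 2f + 1) and for classical BFT protocols (n ≥ 3f + 1); they are not taken from the article and are not CBC Casper's own parameters. The function names are made up for the example.

```python
def max_crash_faults(n):
    # Majority-quorum CFT protocols such as Raft need n >= 2f + 1 nodes.
    return (n - 1) // 2


def max_byzantine_faults(n):
    # Classical BFT protocols need n >= 3f + 1 nodes.
    return (n - 1) // 3


for n in (3, 4, 7, 10):
    print(f"n={n}: crash faults tolerated={max_crash_faults(n)}, "
          f"Byzantine faults tolerated={max_byzantine_faults(n)}")
```

For the same cluster size, a BFT protocol tolerates fewer faulty nodes, because a faulty node may lie instead of simply going silent.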

In summary, RChain combines the respective strengths of NoSQL and NewSQL into a new generation of storage architecture. It has the key-value storage structure of NoSQL, which enables unlimited horizontal scaling, and it is also equipped with a structured query language comparable to SQL and with more robust transaction atomicity based on a consensus protocol.

Is RChain a database disguised as a blockchain? No, RChain is a much more forward-thinking blockchain. At some point in the future, the two tracks — blockchain and database — will converge, and it will be then that the elegance of RChain’s architecture will be deeply understood.

Talk is cheap, show me the code.

  1. RSpace, already packaged as a Scala library:

RSpace is part of the RChain codebase and can be used on its own. Instructions for use: rchain/rchain (github.com)

  2. Dan Connolly's rewrite of the Zulip backend demonstrates the feasibility of SQL-style operations on RChain. He changed the backend of the chat software to RChain, so chat messages go straight onto the chain as if they were being written to a SQL database: rchain-community/rv2020 (github.com)

You are welcome to follow the “Rholang Chinese Community” WeChat official account.

Developers are welcome to join the “RChain Developers” WeChat group. To join, add lowbird on WeChat and you will be invited. Non-developers, please do not apply: there is a screening test, and only those who pass are admitted.


Originally posted on Zhihu on 07-05.
