Intro to CKB Script Programming 2: Script Basics

Nervos Network

Published in

Nervos Network

15 min readAug 14, 2019

July 2019 · 19 minute read

by Xuejie Xiao, a blockchain engineer working on Nervos CKB and the main developer on CKB VM.

Our last post introduced CKB’s current validation model. In this post, we will show how to deploy script codes to CKB for real. After reading this post, you will be able to explore CKB and work on new script codes.

Note: CKB development is still happening, so there will be updates. I will try my best to make sure this post is updated, but for reference, this post is describing CKB as of this commit.

This will be a long post. I’ve tried to split it into individual sections to make it a bit easier read.

Wording

Before we continue, lets distinguish between 2 terms: script, and script code.

Script code actually refers to the program you write and compile to use on CKB. Script, on the other hand, actually refers to the script data structure used in CKB, which is a little more than just the script code:

pub struct Script {
    pub args: Vec<Bytes>,
    pub code_hash: H256,
    pub hash_type: ScriptHashType,
}

We can ignore hash_type for now, a future post will explain what it is and its usage. Later in this post, we will show that code_hash actually just identifies a script code, so for now we can just think of it as script code. What script also includes, is the args part, which distinguishes script from script code. args can be used here to provide additional arguments for a CKB script. For example, while people might all be using the same default lock script code, each of them might have their own pubkey hash, and args are the place to hold pubkey hash. This way each user of CKB can have a different lock script, while sharing the same lock script code.

Note that in most cases, script and script code can be used interchangeably, but it’s worthwhile to recognize of the difference between the two.

With CKB there’s endless possibilities

As you might have already heard, CKB is based on the open source RISC-V ISA. But what does that really mean? It means we are (sort of) embedding a real mini computer in CKB, instead of a virtual machine. The benefit of a real computer is that you can write any logic you want in any language you want. The first few examples we show here will be written in C for simplicity (meaning simplicity in the toolchain, not the language) but later we will switch to JavaScript based code, and hopefully show more languages in future posts.

As we mentioned, CKB VM is more like a mini computer. CKB script code also looks like a normal Unix style executable program run on a computer:

int main(int argc, char* argv[])
{
  return 0;
}

When compiled via a C compiler, this will become a script code that can run on CKB. In other words, CKB just takes Unix style executables —but in RISC-V architecture instead of the popular x86 architecture— and runs it in a virtual machine environment. If the program returns with 0 as the return code, we consider the script successful. All non-zero return codes will be considered script failures.

In the example above, we are showing a successful script code since the return code will always be zero. Please don’t use this as your lock script code, as your token can then be taken by anyone.

To make the example above a bit more interesting, let’s talk about carrots. I dislike carrot. I know they’re great from a nutritional point of view, but I still don’t like them. Now what if I want to set a rule that none of the cells on CKB has data that begin with carrot?

In order to ensure none of the cells can have carrot in their data, we need a way to first read cell data in the script. CKB provides syscalls to help with this.

To ensure the security of CKB script, each script has to run in an isolated environment that is totally separated from the main computer where you are running CKB. This way it won’t be able to access any data it doesn’t need, such as your private keys or passwords. However, for a script to be useful, there must be specific data it can access, such as the cell that a script guards, or a transaction that a script validates. CKB provides syscalls to ensure this. Syscalls are defined in RISC-V standard— they provide a way to access certain resources provided by the environment. In a normal situation, the environment here means the operating system, but in the case of CKB VM, the environment refers to the actual CKB process. With syscalls, a CKB script can access the whole transaction containing itself, including inputs, outputs, witnesses, and deps.

The good news, is that we have encapsulated syscalls in an easy to use header file. You are welcome to poke around this file to see how syscalls are implemented. The bottomline is you can just grab this header file and use the wrapped functions to make your own syscalls.

Now with the syscalls at hand, we can start with our carrot-forbidden script:

#include <memory.h>
#include "ckb_syscalls.h"int main(int argc, char* argv[]) {
  int ret;
  size_t index = 0;
  volatile uint64_t len = 0; /* (1) */
  unsigned char buffer[6];  while (1) {
    len = 6;
    memset(buffer, 0, 6);
    ret = ckb_load_cell_by_field(buffer, &len, 0, index, CKB_SOURCE_OUTPUT,
                                 CKB_CELL_FIELD_DATA); /* (2) */
    if (ret == CKB_INDEX_OUT_OF_BOUND) {               /* (3) */
      break;
    }    if (memcmp(buffer, "carrot", 6) == 0) {
      return -1;
    }    index++;
  }  return 0;
}

Several points worth explaining here:

Due to C quirks, the len field needs to be marked as volatile. We will use it both as an input and output parameter, and CKB VM only can set the output when it lives in memory. volatileensures a C compiler keeps it as a RISC-V memory based variable.
When making a syscall, we need to provide the following: a buffer to hold the data provided by the syscall; a len field denoting both the buffer length, and available data length returned by the syscall; an offset into the input data buffer, and several parameters denoting the exact field we are fetching in the transaction. For more details, please refer to our RFC.
For maximum flexibility, CKB uses the return value of the syscall to represent data fetching status: 0 (or CKB_SUCCESS) means success, 1(or CKB_INDEX_OUT_OF_BOUND) means you have finished fetching all indices in a kind, 2(or CKB_ITEM_MISSING) means an entity is not present, such as fetching a type script from a cell that doesn’t have one.

To recap, this script would loop through all output cells in the transaction, load the first 6 bytes of each cell data and test if those bytes match carrot. If we found a match, the script would return -1, denoting an error status, if no match is found, the script exits with 0, meaning execution success.

To perform the loop, the script would keep an index variable, in each loop iteration, it tries to make the syscall to fetch the cell denoted by current index value. If the syscall returns CKB_INDEX_OUT_OF_BOUND, it means the script has iterated through all the cells and just exits the loop, otherwise, the loop would continue, the cell data is tested, then index variable is incremented for the next iteration.

This concludes your first useful CKB script code! In the next section, we will see how we can deploy it to CKB and run it.

Deploying a script to CKB

First, we need to compile the carrot source code written above. Since GCC already has upstream RISC-V support, you can of course use the official GCC to build your script code. Or you can use the docker image we have prepared to save the trouble of compiling GCC:

$ ls
carrot.c  ckb_consts.h  ckb_syscalls.h
$ sudo docker run --rm -it -v `pwd`:/code nervos/ckb-riscv-gnu-toolchain:xenial bash
root@dc2c0c209dcd:/# cd /code
root@dc2c0c209dcd:/code# riscv64-unknown-elf-gcc -Os carrot.c -o carrot
root@dc2c0c209dcd:/code# exit
exit
$ ls
carrot*  carrot.c  ckb_consts.h  ckb_syscalls.h

And that’s it—CKB can use the compiled executable from GCC directly as scripts on chain. We can now deploy it. Note that I will use CKB’s Ruby SDK since I used to be a Ruby programmer and Ruby feels the most natural for me, but may not be your choice. Please refer to the official README for how to set it up.

To deploy the script to CKB, we can just create a new cell, with the script code as cell data part:

pry(main)> data = File.read("carrot")
pry(main)> data.bytesize
=> 6864
pry(main)> carrot_tx_hash = wallet.send_capacity(wallet.address, CKB::Utils.byte_to_shannon(8000), CKB::Utils.bin_to_hex(data))

Here I simply create a new cell with enough capacity by sending tokens to myself. Now we can create the type script containing the carrot script code:

pry(main)> carrot_data_hash = CKB::Blake2b.hexdigest(data)
pry(main)> carrot_type_script = CKB::Types::Script.new(code_hash: carrot_data_hash, args: [])

Recall the Script data structure:

pub struct Script {
    pub args: Vec<Bytes>,
    pub code_hash: H256,
    pub hash_type: ScriptHashType,
}

We can see that instead of embedding the script code directly in the script data structure, we are only including the code hash, which is a Blake2b hash of the actual script binary code. Since carrot script doesn’t use an argument, we can use empty array for args part.

Note I’m still ignoring hash_type here. In a future post, we’ll explore a different way of specifying code hash. For now, let’s keep it simple.

To run the carrot script, we need to create a new transaction, and set carrot type script as the type script of one of the output cells:

pry(main)> tx = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx.outputs[0].instance_variable_set(:@type, carrot_type_script.dup)

There’s one more step needed: in order for CKB to locate the carrot script, we need to reference the cell containing carrot script in one of transaction deps:

pry(main)> carrot_out_point = CKB::Types::OutPoint.new(cell: CKB::Types::CellOutPoint.new(tx_hash: carrot_tx_hash, index: 0))
pry(main)> tx.deps.push(carrot_out_point.dup)

Now we are ready to sign and send the transaction:

[44] pry(main)> tx.witnesses[0].data.clear
[46] pry(main)> tx = tx.sign(wallet.key, api.compute_transaction_hash(tx))
[19] pry(main)> api.send_transaction(tx)
=> "0xd7b0fea7c1527cde27cc4e7a2e055e494690a384db14cc35cd2e51ec6f078163"

Since this transaction does not have any cells containing carrot in the cell data, the type script validates successfully. Now let’s try a different transaction that does have a cell that begins with carrot:

pry(main)> tx2 = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx2.deps.push(carrot_out_point.dup)
pry(main)> tx2.outputs[0].instance_variable_set(:@type, carrot_type_script.dup)
pry(main)> tx2.outputs[0].instance_variable_set(:@data, CKB::Utils.bin_to_hex("carrot123"))
pry(main)> tx2.witnesses[0].data.clear
pry(main)> tx2 = tx2.sign(wallet.key, api.compute_transaction_hash(tx2))
pry(main)> api.send_transaction(tx2)
CKB::RPCError: jsonrpc error: {:code=>-3, :message=>"InvalidTx(ScriptFailure(ValidationFailure(-1)))"}
from /home/ubuntu/code/ckb-sdk-ruby/lib/ckb/rpc.rb:164:in `rpc_request'

We can see our carrot script rejects a transaction that generates a cell with carrot. Now I can use this script to make sure all my cells are free from carrots!

To recap, to deploy and run a script as type script, we need to:

Compile the script to RISC-V executable binary
Deploy the binary in a cell’s data part
Create a type script data structure with the blake2b hash of the binary as code hash, any required arguments of the script code in the args part
Create a new transaction with the type script set in one of the generated cells
Include the outpoint to the cell containing the script code as one of the transaction deps

That’s really all you need. If your script runs into problems, those are the points you need to check.

Although we only talk about type scripts here, lock scripts work exactly the same way. The only quirk you need to keep in mind, is that when you create a cell with a specified lock script, the lock script won’t run here. It only runs when you are consuming the cell. So while type script can be used to create the logic that runs when you create the cell, lock script is used to create the logic that runs when you consume the cell. Given this consideration, please make sure your lock script is correct, otherwise you might be losing the tokens in the following scenarios:

Your lock script has a bug that someone else can unlock your cell.
Your lock script has a bug that no one(including you) can unlock your cell.

One tip we can provide here is always test your script as a type script attached to an output cell in your transaction. This way when errors happen, you will know immediately, and your tokens can stay safe.

Dissecting the default lock script code

With the knowledge we have, let’s look at the default lock script code included in CKB. To avoid confusion, we are looking at the lock script code as of this commit.

The default lock script code would loop through all input cells that have the same lock script as itself, and perform the following steps:

It grabs the current transaction hash via a provided syscall.
It grabs the corresponding witness data for current input.
For the default lock script, it is assumed that the first argument in witness contains the recoverable signature signed by cell owner, and the rest of the arguments are optional user provided arguments.
Default lock script code runs a blake2b hash on the concatenated binary data of the transaction hash, and all the user provided arguments (if they . exist).
The blake2b hash result is then used as the message part for the secp256k1 signature verification. Note the actual signature is provided in the first argument in witness data structure.
If the signature verification fails, the script exits with a failure return code. Otherwise it continues with the next iteration.

Note we talked about the difference between script and script code earlier. Each different pubkey hash would result in different lock script, so if a transaction has input cells with the same default lock script code but different pubkey hash (hence different lock script), multiple instances of the default lock script code will be executed, each with its own set of cells sharing the same lock script.

Now we can walk through different segments of the default lock script code:

if (argc != 2) {
  return ERROR_WRONG_NUMBER_OF_ARGUMENTS;
}secp256k1_context context;
if (secp256k1_context_initialize(&context, SECP256K1_CONTEXT_VERIFY) == 0) {
  return ERROR_SECP_INITIALIZE;
}len = BLAKE2B_BLOCK_SIZE;
ret = ckb_load_tx_hash(tx_hash, &len, 0);
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}

When arguments are included in args part of Script data structure, they are presented to the actual running script program via the Unix conventional arc/argv way. To further preserve conventions, we insert a dummy argument at argv[0], so your first included argument starts at argv[1]. In the case of default lock script code, it accepts one argument, which is the pubkey hash generated from the owner’s private key.

ret = ckb_load_input_by_field(NULL, &len, 0, index, CKB_SOURCE_GROUP_INPUT,
                             CKB_INPUT_FIELD_SINCE);
if (ret == CKB_INDEX_OUT_OF_BOUND) {
  return 0;
}
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}

Using the same technique as shown in the carrot example, we check if there’s more input cells to test. There are two differences from the previous examples:

If we just want to know if a cell exists and don’t need any of the data, we can just pass in NULL as the data buffer, and a len variable with value 0. This way the syscall would skip data filling and just provided available data length and correct return code for processing.
In the carrot example, we are looping through all inputs in the transaction, but here we just care about input cells of the same lock script. CKB named cells with the same lock(or type) script as cells with the same group. And here, we can use CKB_SOURCE_GROUP_INPUT instead of CKB_SOURCE_INPUTdenoting the syscalls to only count cells in the same group, i.e., cells who have the same lock script as the current cell.

len = WITNESS_SIZE;
ret = ckb_load_witness(witness, &len, 0, index, CKB_SOURCE_GROUP_INPUT);
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}
if (len > WITNESS_SIZE) {
  return ERROR_WITNESS_TOO_LONG;
}if (!(witness_table = ns(Witness_as_root(witness)))) {
  return ERROR_ENCODING;
}
args = ns(Witness_data(witness_table));
if (ns(Bytes_vec_len(args)) < 1) {
  return ERROR_WRONG_NUMBER_OF_ARGUMENTS;
}

Continuing down the path, we are loading the witness for current input. Corresponding witnesses and inputs have the same index. Right now CKB uses flatbuffer as the serialization format in syscalls, so if this feels weird to you, flatcc’s documentation is your new best friend.

/* Load signature */
len = TEMP_SIZE;
ret = extract_bytes(ns(Bytes_vec_at(args, 0)), temp, &len);
if (ret != CKB_SUCCESS) {
  return ERROR_ENCODING;
}/* The 65th byte is recid according to contract spec.*/
recid = temp[RECID_INDEX];
/* Recover pubkey */
secp256k1_ecdsa_recoverable_signature signature;
if (secp256k1_ecdsa_recoverable_signature_parse_compact(&context, &signature, temp, recid) == 0) {
  return ERROR_SECP_PARSE_SIGNATURE;
}
blake2b_state blake2b_ctx;
blake2b_init(&blake2b_ctx, BLAKE2B_BLOCK_SIZE);
blake2b_update(&blake2b_ctx, tx_hash, BLAKE2B_BLOCK_SIZE);
for (size_t i = 1; i < ns(Bytes_vec_len(args)); i++) {
  len = TEMP_SIZE;
  ret = extract_bytes(ns(Bytes_vec_at(args, i)), temp, &len);
  if (ret != CKB_SUCCESS) {
    return ERROR_ENCODING;
  }
  blake2b_update(&blake2b_ctx, temp, len);
}
blake2b_final(&blake2b_ctx, temp, BLAKE2B_BLOCK_SIZE);

The first argument in witness is the signature to load, while the rest of the arguments, if presented, are appended to transaction hash for a blake2b operation.

secp256k1_pubkey pubkey;if (secp256k1_ecdsa_recover(&context, &pubkey, &signature, temp) != 1) {
  return ERROR_SECP_RECOVER_PUBKEY;
}

Using the hashed blake2b result as message, we then do secp256 signature verification.

size_t pubkey_size = PUBKEY_SIZE;
if (secp256k1_ec_pubkey_serialize(&context, temp, &pubkey_size, &pubkey, SECP256K1_EC_COMPRESSED) != 1 ) {
  return ERROR_SECP_SERIALIZE_PUBKEY;
}len = PUBKEY_SIZE;
blake2b_init(&blake2b_ctx, BLAKE2B_BLOCK_SIZE);
blake2b_update(&blake2b_ctx, temp, len);
blake2b_final(&blake2b_ctx, temp, BLAKE2B_BLOCK_SIZE);if (memcmp(argv[1], temp, BLAKE160_SIZE) != 0) {
  return ERROR_PUBKEY_BLAKE160_HASH;
}

Last but not least, we also need to check that the pubkey contained from the recoverable signature is indeed the pubkey used to generate the pubkey hash included in the lock script arguments. Otherwise, someone could use a signature generated from a different pubkey to steal your tokens.

In short, the scheme used in the default lock script is a lot like the solution used by bitcoin now.

Introducing Duktape

I’m sure you feel the same way as I do now: it’s good we can write contracts in C, but C always feels a bit tedious and, let’s face it, dangerous. Is there a better way?

Yes of course! We mentioned above CKB VM is essentially a mini computer, and there are tons of solutions we can explore. We can write CKB script codes in JavaScript. Yes, plain ES5 (I know, but this is just one example, and you can use a transpiler) JavaScript.

How this is possible? Since we have C compiler available, all we did is just take a JavaScript implementation for the embedded system, in our case, duktape, compiles it from C to RISC-V binary, puts it on chain, then boom, we can run JavaScript in CKB! Since we are working with a real mini computer here, there’s no stopping us from embeding another VM as CKB script to CKB VM, and exploring this VM-on-top-of VM path.

And we can actually expand from this path. We can have JavaScript on CKB via duktape. We can also have Ruby on CKB via mruby, we can even have Bitcoin Script or EVM on chain if we just compile their VM and put it on chain. This ensures CKB VM can both help us preserve legacy, and build a diversified ecosystem.

“All languages should be and are treated equally on CKB, and the freedom is, and should be in the hands of blockchain contract developers”

At this stage you might want to ask how is this possible, and won’t VM on top of VM be slow? I believe it really depends on your use case. I’m a firm believer that benchmarks make no sense unless we put it in a real use case with standard hardware requirements. So wait to see if this is really gonna be an issue. In my opinion, higher languages are more likely to be used in type scripts to guard cell transformation. In this case, I doubt it will be slow. Besides, we are also working on this field to optimize both CKB VM and the VMs on top of CKB VM to make it faster and faster :P

To use duktape on CKB, first you need to compile duktape itself into a RISC-V executable binary:

$ git clone https://github.com/nervosnetwork/ckb-duktape
$ cd ckb-duktape
$ sudo docker run --rm -it -v `pwd`:/code nervos/ckb-riscv-gnu-toolchain:xenial bash
root@0d31cad7a539:~# cd /code
root@0d31cad7a539:/code# make
riscv64-unknown-elf-gcc -Os -DCKB_NO_MMU -D__riscv_soft_float -D__riscv_float_abi_soft -Iduktape -Ic -Wall -Werror c/entry.c -c -o build/entry.o
riscv64-unknown-elf-gcc -Os -DCKB_NO_MMU -D__riscv_soft_float -D__riscv_float_abi_soft -Iduktape -Ic -Wall -Werror duktape/duktape.c -c -o build/duktape.o
riscv64-unknown-elf-gcc build/entry.o build/duktape.o -o build/duktape -lm -Wl,-static -fdata-sections -ffunction-sections -Wl,--gc-sections -Wl,-s
root@0d31cad7a539:/code# exit
exit
$ ls build/duktape
build/duktape*

Like the carrot example, the first step here is to deploy duktape script code in a CKB cell:

pry(main)> data = File.read("../ckb-duktape/build/duktape")
pry(main)> duktape_data.bytesize
=> 269064
pry(main)> duktape_tx_hash = wallet.send_capacity(wallet.address, CKB::Utils.byte_to_shannon(280000), CKB::Utils.bin_to_hex(duktape_data))
pry(main)> duktape_data_hash = CKB::Blake2b.hexdigest(duktape_data)
pry(main)> duktape_out_point = CKB::Types::OutPoint.new(cell: CKB::Types::CellOutPoint.new(tx_hash: duktape_tx_hash, index: 0))

Unlike the carrot example, duktape script code now requires one argument: the JavaScript source you want to execute:

pry(main)> duktape_hello_type_script = CKB::Types::Script.new(code_hash: duktape_data_hash, args: [CKB::Utils.bin_to_hex("CKB.debug(\"I'm running in JS!\")")])

Notice that with a different argument, you can create a different duktape powered type scripts for different use case:

pry(main)> duktape_hello_type_script = CKB::Types::Script.new(code_hash: duktape_data_hash, args: [CKB::Utils.bin_to_hex("var a = 1;\nvar b = a + 2;")])

This echos the differences mentioned above on script code vs script: here duktape serves as a script code providing a JavaScript engine, while different script leveraging duktape script code serves different functionalities on chain.

Now we can create a cell with the duktape type script attached:

pry(main)> tx = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx.deps.push(duktape_out_point.dup)
pry(main)> tx.outputs[0].instance_variable_set(:@type, duktape_hello_type_script.dup)
pry(main)> tx.witnesses[0].data.clear
pry(main)> tx = tx.sign(wallet.key, api.compute_transaction_hash(tx))
pry(main)> api.send_transaction(tx)
=> "0x2e4d3aab4284bc52fc6f07df66e7c8fc0e236916b8a8b8417abb2a2c60824028"

We can see that the script executes successfully, and if you have ckb-script module’s log level set to debug in your ckb.toml file, you will notice the following log:

2019-07-15 05:59:13.551 +00:00 http.worker8 DEBUG ckb-script  script group: c35b9fed5fc0dd6eaef5a918cd7a4e4b77ea93398bece4d4572b67a474874641 DEBUG OUTPUT: I'm running in JS!

Now you have successfully deploy a JavaScript engine on CKB, and run JavaScript based script on CKB! Feel free to try any JavaScript code you want here.

A thought exercise

Now that you are familiar with CKB script basics, here’s one thought exercise: in this post you’ve seen what an always-success script looks like, but what about an always-failure script? How small can an always-faliure script (and script code) be?

A hint: this is NOT a gcc flag-tweaking optimization contest, this is merely a thought exercise.

What’s next

I hope you have tried and successfully deployed a script to CKB. In the next post, we will introduce an important topic—how to issue your own user defined tokens (UDT) on CKB.

The best part of UDTs on CKB, is that each user can store their UDTs in their own cells, which is different from ERC20 tokens on Ethereum, where everyone’s token lives in the token issuer’s single address. All of this can be achieved by using type scripts alone. If you are interested please stay tuned :)

Did you miss the previous post? « Introduction to CKB Script Programming 1: Validation Model

Stay up to date on Nervos

Forum, Twitter, Telegram, Reddit, Youtube, Facebook