Understanding TON and TVM

Comprehensible guide


DISCLAIMER: The author has no affiliation with any organization managing The Open Network (TON) or investing in it. All the information below is not intended as and shall not be understood as financial advice.

Image source https://x.com/donovanchoy/status/1787933452160979033

Toncoin ($TON), a currency introduced to the world on August 26, 2021, together with the blockchain of the same name, emerged as a standout performer in the cryptocurrency market at the beginning of 2024. The immense increase in price was accompanied by (or based on) enormous growth of the user base: the number of $TON holders has grown tenfold since last year, and TVL has increased by a staggering 1,372%.

TON DAU https://cryptoquant.com/community/dashboard/665ff8985d80604c5ccf9c65?e=6660184ad376670a9fdea304

Today I want to discuss the technical aspects of the platform that forms the backbone of $TON.

Author

Vladislav Lenskii, 12th generation of Decipher
Seoul Nat’l Univ. Blockchain Academy Decipher(@decipher-media)
Reviewed By Seongheon Shin

TON TVL https://defillama.com/chain/TON?currency=USD

Toncoin is the native cryptocurrency of The Open Network (TON), a layer-one blockchain. The TON blockchain is a decentralized platform that was designed to be a fast, secure, and scalable blockchain capable of supporting decentralized applications (dApps), decentralized finance (DeFi), and other blockchain-based solutions. It is claimed to be able to handle millions of transactions per second (discussed later).

TON Virtual Machine (TVM) is a key component that executes smart contracts on the TON blockchain. It is specifically optimized for fast and secure processing of complex transactions and enables developers to build scalable, secure, and efficient dApps on the TON network.

If you are considering developing on TON, understanding how the TVM operates and how to design and write smart contracts in FunC (my preference) or other supported languages is key to leveraging the full potential of the network.

In this article, I will go through the following topics and will try to explain each of them in detail.

Table of Contents

  1. Basics of TON
  2. Sharding, Sharding, Sharding
  3. Messages
  4. Transactions
  5. Masterchain and Basechain(s)
  6. TVM Data Overview
  7. Development Basics
  8. Resources

Basics of TON

  • Everything is a smart contract (your wallet is a smart contract too, not just a pair of keys). Account == Smart contract.
    So any interaction with the blockchain is governed by the logic encoded in smart contracts.
  • TON is not EVM/Ethereum compatible; it supports its own type of Turing-complete smart contracts. Contracts must be written in one of the available languages (FunC, Tact, or Fift), and their architecture differs significantly from that of EVM contracts.
  • TON uses a Proof-of-Stake consensus mechanism, where validators are selected based on the amount of TON tokens they stake. Validators are responsible for producing blocks and securing the network.
  • TON is a ledger of state transitions rather than of transactions: the focus is on the data of smart contracts on the network (account balances, smart contract code, and so on) and on how that data is updated.
  • TON has an exclusive partnership with Telegram Messenger as its blockchain infrastructure. The Web3 ecosystem in Telegram currently relies mostly on TON and the TON-integrated wallets.
    Furthermore, since March 2024 Telegram has used the TON blockchain exclusively for payments related to a new ad-revenue sharing program.
“The TON SuperApp Trifecta” https://blog.ton.org/building-a-web3-ecosystem-in-telegram-with-toncoin

Sharding, Sharding, Sharding…

What is sharding?

Sharding, in general, is a database architecture and scalability technique that involves dividing a large database into smaller, more manageable pieces called “shards”. Each shard is a separate database that holds a subset of the data, and these shards can be distributed across multiple servers.

The primary goal of sharding is to improve performance and horizontal scalability by allowing the workload to be spread across multiple machines. This way, instead of a single database handling all the queries and data, each shard handles only a portion of the data, reducing the load and potentially increasing the system’s overall efficiency.

Example of DB sharding (key sharding). https://www.digitalocean.com/community/tutorials/understanding-database-sharding#key-based-sharding

The concept of sharding in blockchain is inspired by the idea of treating the blockchain as a type of distributed database. In both blockchains and databases, sharding is used to improve scalability by dividing a large system into smaller, more manageable parts that can be processed in parallel.

TON Sharding

TON implements dynamic sharding, enabling the network to split into smaller shards to handle a higher volume of transactions.
Special blockchain nodes called validators process user transactions. Similar to other blockchains, validators in TON collect transactions that users submit, verify their validity, and bundle them into a new block. The validator then proposes this block to the network.

The efficiency of the TON Blockchain lies in the fact that as the number of users or the load grows, it can split into “sub-blockchains” called shardchains. If the load on one of the parts continues to grow, that shardchain is divided in half again, and this process continues as needed. Each shardchain is operated by its own group of validators, so the load is distributed among them. When the load decreases, the shardchains “collapse” (merge) back together. This adaptive model allows for the creation of as many shards as needed at a given time.

The image below shows that shardchains consist of account chains (chains of transactions on a single account). This means that each account belongs to one of the currently existing shardchains, and the specific shardchain to which an account belongs may change as shards split and merge.

TON Sharding illustration https://docs.ton.org/develop/blockchain/shards

The theoretical limit is 2⁶⁰ shards per Workchain, which makes the number of shards practically unlimited as long as there are enough validators (each shard requires at least one validator). At the time of writing, there are 398 active validators.
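To make the split-and-merge idea more tangible, here is a minimal Python sketch. It assumes that a shard can be modelled as a binary prefix of the account ID and that an account belongs to the shard whose prefix its ID starts with. The function names and the string representation of prefixes are my own illustration, not TON's actual data structures.

```python
ACCOUNT_BITS = 256       # assumed width of an account ID inside one workchain
MAX_PREFIX_BITS = 60     # deepest split, matching the 2**60 shard limit


def shard_of(account_id: int, shards: set) -> str:
    """Return the shard (a binary-prefix string) that owns account_id."""
    bits = format(account_id, f"0{ACCOUNT_BITS}b")
    for prefix in shards:
        if bits.startswith(prefix):
            return prefix
    raise ValueError("shard set does not cover this account")


def split(shards: set, prefix: str) -> set:
    """Split one overloaded shard into two halves by extending its prefix."""
    if prefix not in shards:
        raise ValueError("no such shard")
    if len(prefix) >= MAX_PREFIX_BITS:
        raise ValueError("shard cannot be split any further")
    return (shards - {prefix}) | {prefix + "0", prefix + "1"}


def merge(shards: set, prefix: str) -> set:
    """Merge the two sibling shards under `prefix` back into one."""
    children = {prefix + "0", prefix + "1"}
    if not children <= shards:
        raise ValueError("both halves must exist to merge")
    return (shards - children) | {prefix}


shards = {""}                              # a single shard covering every account
shards = split(shards, "")                 # load grows: shards "0" and "1"
shards = split(shards, "1")                # shard "1" is still overloaded
account = 0b11 << (ACCOUNT_BITS - 2)       # an account ID whose first bits are 11
home_before = shard_of(account, shards)    # the account lives in shard "11"
shards = merge(shards, "1")                # load drops: "10" and "11" collapse
home_after = shard_of(account, shards)     # the same account now lives in shard "1"
```

The account itself never changes; only the shard that is responsible for it does, which is the point of dynamic sharding.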

Messages

An event on a TON account causes a transaction (I will provide the definition later). The most common event is the “arrival of some message”, but generally speaking there could be tick-tock, merge, split, and other events. I will cover only the message and split events in this article as the most important ones.

What is a message?

“A message is a packet of data sent between actors (users, applications, smart contracts). It typically contains information instructing the receiver on what action to perform, such as updating storage or sending a new message.” That is how the TON docs put it, and it describes a message pretty well, IMHO.

https://docs.ton.org/develop/smart-contracts/guidelines/message-delivery-guarantees#what-is-a-message

Contracts can communicate with each other (case 1) only by sending messages. The outside world can communicate with contracts, and thus blockchain (case 2), only by sending messages. The two cases mentioned above should be handled by different types of messages. So, there are two types of messages: internal (the 1st case) and external (the 2nd case).

External and Internal Messages. Source: Decipher.

Internal Messages

Internal messages are sent from one blockchain entity to another. Such messages may carry some $TON and pay for their own execution. When an internal message reaches its intended destination, it is processed as specified by the code and the current data of that account (smart contract). A smart contract that receives such a message may reject (bounce) it. In this case, the gas fee is deducted from the message value and the remainder is sent back to the sender as another (bounced) message.

The processing of any message can create one or several outbound internal messages. This can be used to create simple applications in which a request is included in an internal message and sent to a smart contract, which processes the request and sends back a response as another internal message.
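The request, response, and bounce behaviour described above can be sketched as a toy model in Python. The `Message` class, the flat `GAS_FEE`, and the request names are hypothetical; real fees depend on the gas actually consumed.

```python
from dataclasses import dataclass

GAS_FEE = 5_000_000  # hypothetical flat processing fee, in nanotons


@dataclass
class Message:
    src: str
    dst: str
    value: int            # attached amount, in nanotons
    body: str
    bounced: bool = False


def process(msg: Message, known_requests: set) -> Message:
    """Process one inbound internal message and return the outbound one."""
    remainder = msg.value - GAS_FEE   # the message pays for its own execution
    if msg.body not in known_requests:
        # Reject: send the remainder back to the sender as a bounced message.
        return Message(msg.dst, msg.src, remainder, msg.body, bounced=True)
    # Accept: answer the request with a response message.
    return Message(msg.dst, msg.src, remainder, "response to " + msg.body)


request = Message("A", "B", 1_000_000_000, "get_price")
response = process(request, known_requests={"get_price"})
bounce = process(Message("A", "B", 1_000_000_000, "unknown"),
                 known_requests={"get_price"})
```

In both cases contract B only produces a new message; whatever A does with it happens later, when A processes that message.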

There are no restrictions on the message body (the data sent in the message). Nevertheless, there are recommendations on the message body structure, a kind of unofficial standard (you will rarely see messages sent for execution that do not follow it). The structure should be as follows:

  • A 32-bit unsigned integer `op` identifies the operation to be performed inside the smart contract. It specifies how to treat the rest of the message data.
  • A 64-bit unsigned integer `query_id` is used in all query-response internal messages to match responses to queries and to make messages easier to index (finding errors, connecting the source with the result, etc.). It can be omitted if the message does not trigger any further messages.
  • The remainder of the message body is specific to each supported value of `op`.

All integers above are big-endian. Simple $TON transfers are also messages, but they do not need to follow the above structure: they can have an empty body, or they can carry a comment by setting `op` to 0 (no `query_id` is needed after it) followed by the UTF-8 encoded comment text. There is also a way to encrypt the comment.
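The recommended layout can be illustrated with a short Python sketch. Real message bodies are serialized into cells rather than plain byte strings, so this only shows the field order and byte order. `OP_EXAMPLE` and the helper names are hypothetical.

```python
import struct

OP_TEXT_COMMENT = 0          # op == 0 marks a plain-text comment
OP_EXAMPLE = 0x0000_0001     # hypothetical op code, for illustration only


def build_body(op: int, query_id: int, payload: bytes = b"") -> bytes:
    # ">IQ": big-endian 32-bit unsigned op, then 64-bit unsigned query_id
    return struct.pack(">IQ", op, query_id) + payload


def build_comment(text: str) -> bytes:
    # Comment bodies are the zero op followed by UTF-8 text, with no query_id
    return struct.pack(">I", OP_TEXT_COMMENT) + text.encode("utf-8")


def parse_body(body: bytes) -> dict:
    if not body:
        return {"kind": "empty"}   # a plain transfer with no body
    (op,) = struct.unpack_from(">I", body, 0)
    if op == OP_TEXT_COMMENT:
        return {"kind": "comment", "text": body[4:].decode("utf-8")}
    (query_id,) = struct.unpack_from(">Q", body, 4)
    return {"kind": "op", "op": op, "query_id": query_id, "payload": body[12:]}


body = build_body(OP_EXAMPLE, query_id=42, payload=b"\x01")
parsed = parse_body(body)
comment = parse_body(build_comment("thanks!"))
```

The receiver reads `op` first and only then decides how to interpret everything that follows, which is why `op` always comes at the very start of the body.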

Internal messages are handled by the recv_internal() function in FunC smart contracts.

External Messages

External messages are sent from the outside world to smart contracts to trigger the execution of certain actions. You are sending such a message to your wallet contract every time you perform some operation in the wallet app (sending $TON, interacting with a dApp). The messages to your wallet contain orders to send internal messages from it. These messages are signed with your wallet’s private key using EdDSA over the Ed25519 curve (the nuances of cryptography are a topic for another article).

Signing an external message with the $TON transfer command using https://tonkeeper.com/

After the message signature is checked, the wallet contract compares its stored seqno (sequence number) with the one specified in the message. This is the most common protection against message replay attacks (another is to set a time limit for message execution).

If the checks pass, the orders specified in the message are executed.
The gas fee for the processing is paid by the contract receiving the message.
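The seqno check can be sketched as a toy wallet in Python. Signature verification is deliberately left out, and the class names, the `valid_until` field, and the order format are assumptions made for illustration, not the code of any real wallet contract.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalMessage:
    seqno: int          # sequence number the sender believes is current
    valid_until: int    # unix time after which the message must be rejected
    orders: list        # internal messages the wallet is asked to send


@dataclass
class ToyWallet:
    seqno: int = 0
    sent: list = field(default_factory=list)

    def recv_external(self, msg: ExternalMessage, now: int) -> bool:
        # A real wallet verifies the signature first; that step is omitted here.
        if msg.seqno != self.seqno:
            return False        # replayed or out-of-order message
        if now > msg.valid_until:
            return False        # message has expired
        self.seqno += 1         # this exact message can never pass again
        self.sent.extend(msg.orders)
        return True


wallet = ToyWallet()
msg = ExternalMessage(seqno=0, valid_until=1_000, orders=["send 1 TON to B"])
first = wallet.recv_external(msg, now=500)     # accepted
replay = wallet.recv_external(msg, now=501)    # rejected: seqno is already 1
```

Because the counter is incremented on every accepted message, an attacker who re-broadcasts a previously signed message gains nothing.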

The structure of external messages does not need to be standardized, because external messages are not used for interaction between different contracts, written by different developers.

Transactions

Definition

Now it is important to clarify what is called a “transaction”. The act of an account receiving a message from another account, processing it, updating its own state, and (optionally) sending outgoing messages is called a transaction. Sending the incoming message and processing the outgoing messages are parts of separate transactions (although they can be placed in the same block).

Noting this is important since some devs (including me) often tend to call a bunch of consecutive messages (from the first internal one initiated by an external message to the final ones, the execution of which doesn’t emit any outgoing messages) a transaction. While it might be comfortable to use that notation, it is conceptually wrong.

The concept is illustrated in the image below, where transaction `0` is over as soon as the out msg 1 and out msg 2 messages are sent out. The processing of each of these messages (out msg 1 and out msg 2) is part of a separate transaction.

https://docs.ton.org/develop/smart-contracts/guidelines/message-delivery-guarantees#what-is-a-transaction

Asynchronous processing

To achieve the infinite sharding paradigm, full parallelization is required, which means that the execution of each transaction must be independent of every other. Therefore, instead of transactions that affect and change the state of many contracts at once (as in Ethereum or Solana), each transaction in TON is executed on a single smart contract only. As a result, smart contracts can interact with each other only by calling each other's functions through special messages and getting a response via other messages later.

Contracts have no visibility into anything outside themselves. This isolation of contracts from each other is what makes TON, in theory, infinitely scalable.

It reminds me of the real-world example of computers on the Internet. Each of them knows only about its current state and sends messages to the outer world, not knowing how, when, and if it will be processed.

Thus, transactions are executed in an asynchronous system, where you cannot get a response from the destination smart contract in the same transaction. A contract call may take a few blocks to be processed, depending on the length of the route between the initial and final message.

That shows us an important feature of the TON blockchain:

Although the network guarantees that any internal message will be delivered, it doesn’t guarantee how long this will take.

It also doesn’t guarantee which message will be executed first in most cases.

Examples

For example, when we send messages from contracts B and C to contract A, we cannot be sure that the message from account B will be executed before the message from account C, even if it was sent earlier.

Diagram for the above example. https://docs.ton.org/develop/smart-contracts/guidelines/message-delivery-guarantees#delivery-order

Another situation is when we send messages 1 and 2 from contract A to contract B, and message 1 is sent first. Here we know that message 1 will be executed first.

Diagram for the example described above. https://docs.ton.org/develop/smart-contracts/guidelines/message-delivery-guarantees#delivery-order

The difference between the two situations is that in the first example, we do not know how much communication between shards will be necessary to process the transactions, so the order of message delivery may be arbitrary.
In the second case, messages come from one source, thus the order of their sending determines the order of execution.
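The two ordering cases can be simulated with a toy model in Python. It assumes that each sender's messages form a first-in-first-out queue while the choice of which sender is served next is random. This is only a stand-in for the unpredictable routing between shards, not how TON actually schedules delivery.

```python
import random
from collections import deque


def deliver(outboxes: dict, rng: random.Random) -> list:
    """Deliver messages from several senders to one receiver.

    Each sender's outbox is FIFO, but which sender goes next is random.
    """
    pending = {sender: deque(msgs) for sender, msgs in outboxes.items()}
    log = []
    while any(pending.values()):
        ready = sorted(s for s, q in pending.items() if q)
        sender = rng.choice(ready)
        log.append((sender, pending[sender].popleft()))
    return log


# Case 1: B and C each send one message to A, so the arrival order may differ.
orders_seen = {
    tuple(s for s, _ in deliver({"B": ["hello"], "C": ["hello"]},
                                random.Random(seed)))
    for seed in range(50)
}

# Case 2: A sends messages 1 and 2 to B (while C also sends one message).
# Whatever the interleaving, message 1 from A is delivered before message 2.
fifo_kept = all(
    [m for s, m in deliver({"A": [1, 2], "C": ["x"]}, random.Random(seed))
     if s == "A"] == [1, 2]
    for seed in range(50)
)
```

Across the different seeds, case 1 produces both possible orders, while case 2 never reorders the two messages that come from the same sender.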

Another example I want to show you is a real-world one: the contract deployment transactions I triggered not so long ago. Below you can see the diagram.

Transaction flow diagram https://tonviewer.com/transaction/361e406f5c949cde5b2f70a38caf439b0b5f63b5a1b8ce49aed07cee0093748e

Let me briefly explain it: Account A is my development wallet, account B is the contract I was deploying and account C is the Jetton Minter contract (explained later), from which account B requested some info right after it was deployed.

Excess branches are transactions with a standard op_code for returning the unused $TON. The 0x... symbols are the opcodes specified in the Jetton Minter contract.

What is interesting here is that my wallet account and deployed contract were located in different shardchains. You can see it by checking the Block Id field of the transaction info.

Info of transaction that executes external message to my wallet ( -> A). https://tonviewer.com/transaction/3719eab6922fe31aaedfc5ecaad82135d0fc2ce8fd533423863673ff4c9f0a63

The next noteworthy point is that when contract B sent messages to contracts A and C, they were executed simultaneously. You can check this by looking at the Lt (logical time) field of both transactions (navigate between them by clicking the account circles on the diagram): in both it equals 48366996000001. This happens because the accounts were located in the same shard and the transactions were executed in the same block of that shard.

I hope these examples helped you understand the concepts presented above. However, there are concepts I’ve missed or, rather, avoided so far.

Masterchain and Basechain

In the image illustrating sharding, you might have seen the terms Masterchain and Workchain. I guess now it is time to talk about them.

A blockchain is the aggregation of all shards, and all the accounts in them, that follow one set of rules. In TON there can be many sets of rules and therefore many blockchains that operate simultaneously and interact with each other by sending cross-chain messages, in the same way that accounts of one chain interact with each other.

Workchains

So if you want to create your own rules for a set of Shardchains, you need to create your own Workchain. The only problem is that this is an extremely challenging task both technically and financially. Even if you persevere and succeed, you still need the approval of two-thirds of the validators.

There can be up to 2^32 Workchains, each divided into a maximum of 2^60 shards, as mentioned before.

However, at the time of writing, there are only 2 live Workchains: the Basechain (everything I was talking about before happens on it) and the Masterchain. That was one of the reasons I avoided this concept earlier: there are simply more practically important ones.

The Basechain is used for everyday transactions (aka the execution chain), while the Masterchain has the crucial function of storing the network configuration and the final state of all other Workchains (aka the state chain). In other words, it is used to fix some ‘point’ in a multichain state and reach a consensus about that state. It enforces consensus by keeping a record of all the latest block hashes and a list of active validators with their stakes.

Illustration of Masterchain controlling the state of Basechain. https://docs.ton.org/develop/blockchain/sharding-lifecycle#sharding-example

Blocks of the Masterchain contain information about all other Workchains in the system, so the state of all Workchains is determined by a single Masterchain block. Ordinary users seldom interact with the Masterchain directly due to its high cost.

Since the Masterchain oversees multiple other blockchains (Workchains), TON is called the “blockchain of blockchains”.

So far we have learned that the TON blockchain (Basechain) can be split into parts (shards) that are processed by different sets of validators. Because of this, user operations, which consist of multiple transactions, are not executed atomically, and it is hard to predict the time and, sometimes, the order of execution.
Now I want to briefly talk about another important technical aspect: the core TVM principles for handling data.

TVM Data Overview

First of all, TVM is a last-in-first-out (LIFO) stack machine. At any given moment, the TVM state is fully determined by 6 properties:

  • Stack — data used for processing
  • Control registers — simply, up to 16 variables which may be directly set and read during execution
  • Current continuation — an object that describes a currently executed sequence of instructions
  • Current codepage — the version of TVM which is currently running
  • Gas limits — a set of 4 integer values to control the gas consumption
  • Library context — the hashmap of execution libraries which can be called by TVM
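To give a feel for what “stack machine” means, here is a toy interpreter in Python. It models only the stack (the first item in the list above) and ignores registers, continuations, codepages, gas, and libraries. The instruction names are illustrative and are not meant to match real TVM opcodes. Raising an error on overflow is this sketch's assumption, and the range check uses signed 257-bit integers, the integer type the TVM stack works with.

```python
INT_MIN, INT_MAX = -(2**256), 2**256 - 1   # range of a signed 257-bit integer


def run(program: list) -> list:
    """Execute a list of (instruction, *args) tuples and return the final stack."""
    stack = []
    for instr, *args in program:
        if instr == "PUSH":
            stack.append(args[0])
        elif instr == "ADD":
            b, a = stack.pop(), stack.pop()   # the last pushed value comes off first
            stack.append(a + b)
        elif instr == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {instr!r}")
        if not INT_MIN <= stack[-1] <= INT_MAX:
            raise OverflowError("result does not fit into 257 bits")
    return stack


# (2 + 3) * 4, written the way a stack machine sees it
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
```

Every instruction takes its operands from the top of the stack and puts its result back there, so no variable names are needed at the machine level.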

Cells and other data types

TVM operates based on a unique “bag of cells” model to represent data. Each cell can contain up to 1023 bits of data (just under 128 bytes) and can have up to 4 references to other cells. A cell and a 257-bit integer are the two fundamental data types in TVM. Both of them are supported by the stack (they can be pushed onto it and popped off it).

From TON Docs

In total, 7 types of variables may be stored in the stack:

  • Integer — signed 257-bit integers
  • Tuple—ordered collection of up to 255 elements having arbitrary value types, possibly distinct.
  • Null

are non-cell ones, and

  • Cell — basic (possibly nested) opaque structure used by TON Blockchain for storing all data
  • Slice — a transformed cell, prepared for reading
  • Builder — a special object to create new cells
  • Continuation — a special object that allows you to use a cell as a source of TVM instructions

are the different representations (“flavours”) of cells.

This structure allows the TVM to natively support arbitrary algebraic data types, structures, and more complex constructions such as trees or directed acyclic graphs (DAGs) directly within its storage model.

The important trait of storing data in cells is deduplication: if there are duplicate cells in storage, blocks, transactions, or messages, such cells are stored only once, which significantly decreases the size of serialized data. This is possible because cells have a deterministic hashing scheme.
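Deduplication by hash can be sketched in a few lines of Python. The sketch assumes the limits of 1023 data bits and 4 references per cell given in the TON documentation. The hash here is a simplified SHA-256 over the data and the hashes of the children; the real cell representation hash has a different, more detailed layout.

```python
import hashlib
from dataclasses import dataclass

MAX_BITS, MAX_REFS = 1023, 4


@dataclass(frozen=True)
class Cell:
    bits: str = ""     # data as a string of '0' and '1' characters
    refs: tuple = ()   # references to other cells

    def __post_init__(self):
        if len(self.bits) > MAX_BITS or set(self.bits) - {"0", "1"}:
            raise ValueError("a cell holds at most 1023 bits of data")
        if len(self.refs) > MAX_REFS:
            raise ValueError("a cell holds at most 4 references")

    def hash(self) -> bytes:
        # Simplified deterministic hash: own data plus the hashes of all children.
        h = hashlib.sha256()
        h.update(len(self.bits).to_bytes(2, "big"))
        h.update(self.bits.encode())
        h.update(bytes([len(self.refs)]))
        for ref in self.refs:
            h.update(ref.hash())
        return h.digest()


def store(root: Cell, storage: dict = None) -> dict:
    """Store a tree of cells keyed by hash, so identical cells are kept once."""
    storage = {} if storage is None else storage
    for ref in root.refs:
        store(ref, storage)
    storage.setdefault(root.hash(), root)
    return storage


leaf = Cell("1010")
same_leaf = Cell("1010")                   # built separately, identical content
root = Cell("1", (leaf, same_leaf, Cell("0")))
storage = store(root)                      # 4 cells in the tree, 3 unique ones
```

Since a cell's hash covers the hashes of its children, two identical subtrees of any size also collapse into a single stored copy.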

The last topic I want to cover is the basic knowledge required for development on TON.

Development Basics

Development tools

Developers who wish to develop on TVM will first have to learn one of the available smart contract languages: Fift, FunC, or Tact (listed in order of increasing abstraction and ease of learning).

While Tact is under active development and is expected to become the main TON smart contract language in the future, most seasoned developers (including me) currently prefer FunC. As a result, there are currently an order of magnitude fewer examples of contracts written in Tact than in FunC.

Regardless of which language you choose for development, you need a framework to compile, test, and deploy your contracts. Currently, the most used one is the Blueprint SDK. It comes with TypeScript, Tact, FunC, and Fift support and offers the best toolset for the development, testing, and deployment of contracts.
Alternatively, developers can go with a custom setup (a standalone Tact or FunC compiler plus libraries for blockchain integration), but I do not recommend adding more hardship to the already difficult TON development.

Both the syntax of the smart contract languages and the use of the dev frameworks can be learned from other resources, so they are out of the scope of this article.

The really important thing is the architecture of smart contracts, which is dictated by the sharding paradigm of TON.

Jetton Architecture

Let me explain the nuances using the Jetton (TON fungible token) standard (TEP 74) example.

The Jetton standard specifies the architecture, functions, and operation codes that should be handled by the Jetton contracts’ (!) code.

First of all, a Jetton should be managed by 2 kinds of smart contracts (!): the Jetton Wallet smart contract and the Jetton Minter (Master) smart contract.

The Jetton Minter smart contract stores general information about a Jetton: the total supply, a metadata link (off-chain storage) or the metadata itself (on-chain storage), the admin (owner) address, and whether the Jetton is mintable.

To mint jettons (if the Jetton is mintable), the admin usually sends an internal message to the Minter that specifies the amount and the receiver of the minted jettons.

The Jetton Wallet contracts are used to send, receive, and burn jettons. Each Jetton Wallet contract stores a Jetton balance for a specific user. Thus, there is one Jetton Wallet contract per one user per one Jetton.

https://docs.ton.org/develop/dapps/asset-processing/jettons#jetton-architecture

It is important not to confuse Jetton Wallets with user wallets used for blockchain interaction and storing only $TON. I will repeat: the Jetton Wallet contract is responsible for managing only a specific Jetton for a specific user.

When minting, the Jetton Wallet, specified by the admin, receives the message from the Minter contract. When transferring jettons, the Jetton Wallet receives a message from the owner and sends a message to another Jetton Wallet. When burning, the Jetton Wallet receives a message from the owner and sends a message to the Jetton Minter. Thus, users interact only with their Jetton Wallets for transactions.
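The mint, transfer, and burn flows described above can be modelled with a toy message queue in Python. This is a sketch under simplifying assumptions: addresses are plain strings, the op names are illustrative labels, fees are ignored, a Jetton Wallet is created the first time a message reaches it, and the Minter does not verify who sent a burn notification. Each call to `handle` stands for one transaction on one contract.

```python
from collections import deque
from dataclasses import dataclass

MINTER = "minter"


def wallet_address(owner: str) -> str:
    return "jw:" + owner   # toy address scheme, not the real derivation


@dataclass
class JettonMinter:
    admin: str
    total_supply: int = 0

    def handle(self, sender, op, amount, target):
        if op == "mint" and sender == self.admin:
            self.total_supply += amount
            return [(wallet_address(target), "internal_transfer", amount, None)]
        if op == "burn_notification":
            self.total_supply -= amount
        return []


@dataclass
class JettonWallet:
    owner: str
    balance: int = 0

    def handle(self, sender, op, amount, target):
        if op == "internal_transfer":   # from the Minter or another Jetton Wallet
            self.balance += amount
        elif op == "transfer" and sender == self.owner and amount <= self.balance:
            self.balance -= amount
            return [(wallet_address(target), "internal_transfer", amount, None)]
        elif op == "burn" and sender == self.owner and amount <= self.balance:
            self.balance -= amount
            return [(MINTER, "burn_notification", amount, None)]
        return []


def run(contracts: dict, first_message: tuple) -> int:
    """Process a chain of messages; return the number of transactions."""
    queue, transactions = deque([first_message]), 0
    while queue:
        sender, dst, op, amount, target = queue.popleft()
        if dst not in contracts:        # the first message deploys the wallet
            contracts[dst] = JettonWallet(owner=dst[len("jw:"):])
        for out in contracts[dst].handle(sender, op, amount, target):
            queue.append((dst, *out))   # outgoing message, processed later
        transactions += 1
    return transactions


contracts = {MINTER: JettonMinter(admin="admin")}
mint_txs = run(contracts, ("admin", MINTER, "mint", 100, "alice"))
transfer_txs = run(contracts, ("alice", "jw:alice", "transfer", 30, "bob"))
burn_txs = run(contracts, ("bob", "jw:bob", "burn", 10, None))
```

Each user-level operation here takes two transactions on two different contracts, and no single contract ever sees or modifies another contract's balance directly.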

But how does the user know the address of their Jetton Wallet?
The Jetton Wallet address is calculated from the Jetton Minter address and the owner's wallet address. Thus, the user’s Jetton Wallet address is predetermined (like the address of almost any TON contract): you can know it even before the contract is deployed (deployment is usually performed by the mint or the Jetton transfer transaction).
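The idea of a predetermined address can be illustrated with a short Python sketch. It assumes that a contract address is derived from a hash of the contract's initial code and data, and it replaces the real cell-based computation with a plain SHA-256 over byte strings. The function name and the placeholder code bytes are hypothetical.

```python
import hashlib

JETTON_WALLET_CODE = b"<compiled jetton wallet code>"   # placeholder bytes


def jetton_wallet_address(minter: str, owner: str,
                          code: bytes = JETTON_WALLET_CODE) -> str:
    """Toy stand-in for hashing the wallet's initial code and data."""
    h = hashlib.sha256()
    for part in (code, minter.encode(), owner.encode()):
        h.update(len(part).to_bytes(4, "big"))   # length prefix avoids ambiguity
        h.update(part)
    return h.hexdigest()


alice_wallet = jetton_wallet_address("minter-address", "alice-address")
alice_again = jetton_wallet_address("minter-address", "alice-address")
bob_wallet = jetton_wallet_address("minter-address", "bob-address")
other_jetton = jetton_wallet_address("another-minter", "alice-address")
```

Because the inputs are known in advance, anyone can compute the address without asking the blockchain, and the same owner gets a different Jetton Wallet for every Jetton.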

So far, I’ve covered the most important and basic (IMHO) concepts of TON and TVM and tried to explain them as comprehensively as I could.
I hope the article was helpful!
If you have any comments or requests for clarification, please leave them in the comments. Thank you!

Resources
