Understanding Ethereum From the Ground Up (Accounts)
Preface: The following article aims to consolidate knowledge, as I learn about Ethereum. If any information is incorrect, partially incorrect, or incomplete, I wholeheartedly welcome your input.
Article Structure: In this article, we’ll cover:
- The Ethereum network, from a high-level system perspective
- Ethereum Accounts, as fundamental entities
- Account State, as distributed data storage
Foundation: As a foundation for this article, it’s helpful to have a general, high-level understanding of blockchains. It’s also helpful to know that, when this article was written, Ethereum used a Proof-of-Work (PoW) blockchain to add new grouped data entries (blocks) to its immutable ledger of record; since the Merge in September 2022, it uses Proof-of-Stake (PoS) instead. If you’re a bit fuzzy on Ethereum or, more generally, blockchains, the following are great resources to build a foundation:
Foundational Questions
- What’s Ethereum? I define Ethereum as a network of computers that follow very specific rules for maintaining and adding to a distributed “ledger of record” (database) of interactions (transactions) between entities called “accounts”. How are the rules implemented? The rules are implemented by the “Ethereum clients” (software) installed on the computers in the network. The computers that run this client software are what we call “nodes”. Okay, so we know that the software implements the “data entry” rules, but where is the data (state) stored? A client installation also installs and uses a key-value storage database, such as LevelDB, to store the relevant state (more on data storage below). To be clear, the state data resides in the key-value databases of the clients running on full nodes within the network. There are different types of nodes (full nodes, light nodes, and archive nodes), but we’ll cover those another time. To better illustrate this idea, see Figure A below.
Figure A:
- Isn’t Ethereum a state machine? I’m sure you’ve come across the more abstract definition, “Ethereum is a transaction-based state machine”. The following aims to clarify this definition. What’s a state machine? A finite-state machine is a machine that can be in exactly one of a finite number of states at any given time. As a simplified example, at any given time, an oven can only be in one of the following states: off, heating, or idling. Looking at the diagram below (Figure B), a state machine diagram includes the states, the transitions between states (arrows), and the rules for triggering those transitions (the labels on the arrows).
Figure B:
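To make the oven example concrete, here is a minimal sketch of it as a transition table in Python. Only the three states (off, heating, idling) come from the text above; the event names such as turn_on and target_reached are my own assumptions, chosen purely for illustration.

```python
# Toy finite-state machine for the oven example.
# The three states come from the article; the event names are hypothetical.
TRANSITIONS = {
    ("off", "turn_on"): "heating",
    ("heating", "target_reached"): "idling",
    ("idling", "temperature_dropped"): "heating",
    ("heating", "turn_off"): "off",
    ("idling", "turn_off"): "off",
}


def next_state(state: str, event: str) -> str:
    """Return the state reached from `state` on `event`.

    Events with no matching rule leave the state unchanged.
    """
    return TRANSITIONS.get((state, event), state)


state = "off"
for event in ["turn_on", "target_reached", "turn_off"]:
    state = next_state(state, event)
    print(f"{event} -> {state}")
```

The machine is always in exactly one state, and only the listed (state, event) pairs can move it to another one.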
- In what sense is Ethereum a state machine? Transactions are verified and executed by each node in the network. Subsequently, each node needs to update its state in accordance with the new set of transactions. As an example, if Bob sends 10 Ether to Alice, then both Bob’s and Alice’s account balances (state) need to be updated to reflect the transaction. It is in this sense that transactions act as the “triggers” for updating machine state, hence the term “transaction-based state machine”. What’s the “machine” though? The machine, in this sense, is all of the nodes in the Ethereum network, each independently updating its state in accordance with new “blocks” of transactions. Unlike the oven above (a finite-state machine with three states), Ethereum could be classified as a “practically unbounded” state machine, as the number of potential states is practically unbounded.
Figure C:
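As a rough sketch of the “transactions trigger state updates” idea, the snippet below treats the state as a mapping from account names to balances and applies a block of transfers to it. The names Transfer, apply_transfer, and apply_block are hypothetical, and gas, nonces, and signature checks are deliberately left out, so this is a simplification rather than what a real client does.

```python
from dataclasses import dataclass

WEI_PER_ETHER = 10**18


@dataclass(frozen=True)
class Transfer:
    """A simplified transaction: move `value` Wei from sender to recipient."""
    sender: str
    recipient: str
    value: int


def apply_transfer(state: dict, tx: Transfer) -> dict:
    """Return the new balance mapping after `tx`; the old mapping is untouched."""
    if state.get(tx.sender, 0) < tx.value:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    return new_state


def apply_block(state: dict, block: list) -> dict:
    """Apply every transfer in the block, in order."""
    for tx in block:
        state = apply_transfer(state, tx)
    return state


# Bob sends 10 Ether to Alice, as in the example above.
# Bob's starting balance of 25 Ether is made up for illustration.
genesis = {"bob": 25 * WEI_PER_ETHER, "alice": 0}
block = [Transfer("bob", "alice", 10 * WEI_PER_ETHER)]
state_after = apply_block(genesis, block)
print(state_after)
```

Every node that starts from the same state and applies the same block ends up with the same new state, which is what lets the nodes update independently and still agree.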
Ethereum Accounts (Fundamental Entities)
In Ethereum, there are two distinct types of accounts: “externally-owned” accounts and “contract” accounts:
- Externally-owned accounts: accounts, normally controlled by humans, with private-public key pairs that enable the authorized (and verifiable) transfer of tokens.
- Contract accounts (smart contracts): autonomous accounts with code deployed to the Ethereum network, with functions that can be invoked. As an imperfect Web2 analogy, contract accounts can be thought of as possessing their own APIs, with endpoints (smart contract functions) to hit.
As the Ethereum Accounts documentation states, both account types can receive, store, and send Ether (the Ethereum network’s native token) and other tokens. Both can also interact with deployed smart contracts.
Externally-Owned Accounts
In some general sense, as a Web2 analogy, one can think of externally-owned Ethereum accounts as similar to social media accounts or bank accounts (entities that perform actions that get recorded somewhere). In essence, an externally-owned Ethereum account is a private-public key pair, from which an Ethereum public address can be derived. The derivation starts at the private key. Once you have a private key, you can derive the public key by elliptic-curve point multiplication on the secp256k1 curve (the curve Ethereum uses with the Elliptic Curve Digital Signature Algorithm). From there, a public address can be derived by taking the last 20 bytes of the Keccak-256 hash of the public key and adding 0x to the beginning. To be clear, the following comprise an externally-owned Ethereum account:
- A private key (64 hexadecimal characters)
- A public key, derived from the private key (128 hexadecimal characters)
- A public address, derived from the public key
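The derivation chain above (private key, then public key, then address) can be sketched with nothing but the Python standard library (3.8+). Two caveats: Python’s hashlib.sha3_256 is the NIST SHA-3 variant and uses different padding from the original Keccak-256 that Ethereum uses, so the sketch carries its own small Keccak implementation; and the elliptic-curve arithmetic is the textbook version, not constant-time, so please treat it as a learning aid and not as wallet code. All function names are my own.

```python
# Sketch only: private key -> public key -> address. Never use with real keys.
P = 2**256 - 2**32 - 977  # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points in affine coordinates (None = point at infinity)."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def scalar_mult(k, point=G):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def _rol64(value, n):
    n %= 64
    return ((value << n) | (value >> (64 - n))) & (2**64 - 1)


def keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to 5x5 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an 8-bit LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (domain byte 0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per permutation call
    padded = bytearray(data) + b"\x01" + b"\x00" * (-(len(data) + 1) % rate)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def address_from_private_key(private_key: int) -> str:
    """Derive the 0x-prefixed address for a private key given as an integer."""
    if not 0 < private_key < N:
        raise ValueError("private key out of range")
    x, y = scalar_mult(private_key)
    public_key = x.to_bytes(32, "big") + y.to_bytes(32, "big")  # 64 bytes
    return "0x" + keccak256(public_key)[-20:].hex()


print(address_from_private_key(1))
```

The 64-byte public key is the 128 hexadecimal characters mentioned in the list above. The printed address is all lowercase; the mixed-case form that wallets display is a checksum layer on top and is not covered here.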
Contract Accounts
Contract accounts, on the other hand, do not have private-public key pairs. They do, however, have public addresses, which are derived from the public address of the contract creator (the sender) and the sender’s nonce (the number of transactions the sender’s account has sent). More specifically, the sender and the nonce are RLP-encoded as a two-item list and then Keccak-256 hashed, and the last 20 bytes of that hash become the contract’s address.
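Here is a minimal sketch of that derivation. It includes a small RLP encoder that only handles the cases needed here (byte strings, non-negative integers, and lists of those), and it takes the Keccak-256 function as a parameter, because the Python standard library does not ship one (hashlib.sha3_256 uses different padding). The function names and the example sender address are assumptions for illustration.

```python
from typing import Callable


def _length_prefix(length: int, offset: int) -> bytes:
    """RLP length header: one byte for short payloads, length-of-length otherwise."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Minimal RLP encoder for bytes, non-negative ints, and lists of those."""
    if isinstance(item, int):
        # Integers become big-endian bytes without leading zeros (0 -> empty).
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, (bytes, bytearray)):
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)  # a single small byte encodes as itself
        return _length_prefix(len(item), 0x80) + bytes(item)
    payload = b"".join(rlp_encode(element) for element in item)
    return _length_prefix(len(payload), 0xC0) + payload


def contract_address(sender: bytes, nonce: int,
                     keccak256: Callable[[bytes], bytes]) -> bytes:
    """Last 20 bytes of keccak256(rlp([sender, nonce])).

    `keccak256` must be supplied by the caller, e.g. from a crypto library.
    """
    return keccak256(rlp_encode([sender, nonce]))[-20:]


# Hypothetical 20-byte sender address, used only to show the encoding.
sender = bytes.fromhex("11" * 20)
print(rlp_encode([sender, 0]).hex())
print(rlp_encode([sender, 1]).hex())
```

Because the nonce is part of the hashed data, the same sender produces a different contract address for each contract it deploys.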
Account State
Okay, fine, but I thought Ethereum accounts had state, such as an ether balance, associated with them? They do! The key words here are “associated with”. In Ethereum’s world state, which can be found in each node’s state database, there exists a key-value mapping between Keccak-256-hashed public addresses and “RLP encoded” account objects. Remember the difference between encryption and encoding! Encryption hides data, only to be made accessible with some sort of key. Encoding, on the other hand, simply converts data into a more utilizable format (for example, base64 encoding JSON to store as an environment variable string, or similar).
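To illustrate the encoding side of that distinction, here is a small sketch of the base64 example mentioned above. The config values are made up. The point is that decoding needs no key at all, which is exactly what separates encoding from encryption.

```python
import base64
import json

# Hypothetical settings, standing in for any JSON you want to store as a string.
config = {"network": "mainnet", "chain_id": 1}

# Encode: JSON text -> bytes -> base64 string (safe for an environment variable).
encoded = base64.b64encode(json.dumps(config).encode("utf-8")).decode("ascii")

# Decode: anyone can reverse this; no secret is involved.
decoded = json.loads(base64.b64decode(encoded))

print(encoded)
print(decoded == config)
```

RLP plays the same role for account objects: it turns structured data into bytes for storage, and it hides nothing.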
The Account State Object
Each Ethereum account has its own account state object. In the Yellow Paper’s notation, σ is the world state and σ[a] is the state object of the account at public address a. It has four fields:
- nonce: In the case of an external account, the nonce is a scalar value equal to the number of transactions sent from this public address. In the case of accounts associated with code (contract accounts), the nonce is a scalar value equal to the number of new contracts created by this account.
- balance: a scalar value equal to the number of Wei (1E-18 Ether) owned by this address.
- storageRoot: For contract accounts, a 256-bit hash of the root node of a Merkle Patricia tree that encodes the storage contents of the account. In other words, this value is the root hash of a whole, separate data structure (stored within nodes’ state databases) that stores any persistent data (arrays, integers, objects, etc.) that an account’s contract code can use. For external accounts, however, the storage tree is empty, so the storageRoot is simply the root hash of an empty tree; external accounts have no code associated with them, which obviates the need to store persistent, code-utilizable data.
- codeHash: For a contract account, this field stores the Keccak-256 hash of its contract code, the code that gets executed should this address receive a call or transaction. It is immutable and thus, unlike all other fields, cannot be changed after construction. All such code fragments are contained in the state database under their corresponding hashes for later retrieval. For external accounts, on the other hand, the codeHash is the Keccak-256 hash of an empty string, since external accounts have no code associated with them.
Figure D:
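The four fields can be sketched as a small Python data class. The class and field names are my own, and the world state is modelled as a plain dictionary keyed by address; as described earlier, real clients key it by the Keccak-256 hash of the address and store the RLP-encoded account as the value. The two constants are the well-known Keccak-256 hash of an empty byte string and the root hash of an empty Merkle Patricia tree.

```python
from dataclasses import dataclass

WEI_PER_ETHER = 10**18

# Keccak-256 of the empty byte string (codeHash of an account without code).
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")
# Root hash of an empty Merkle Patricia tree (storageRoot with no storage).
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421")


@dataclass
class AccountState:
    """Illustrative model of the account state object sigma[a]."""
    nonce: int = 0
    balance: int = 0  # in Wei
    storage_root: bytes = EMPTY_TRIE_ROOT
    code_hash: bytes = EMPTY_CODE_HASH

    def is_contract(self) -> bool:
        """An account counts as a contract if it has non-empty code."""
        return self.code_hash != EMPTY_CODE_HASH

    def balance_in_ether(self) -> float:
        """Balance converted to Ether; floating point, so display only."""
        return self.balance / WEI_PER_ETHER


# Simplified world state: address -> account state.
# Both addresses and the contract's code hash are made-up placeholders.
world_state = {
    "0x" + "aa" * 20: AccountState(nonce=3, balance=2 * WEI_PER_ETHER),
    "0x" + "bb" * 20: AccountState(nonce=1, code_hash=bytes.fromhex("12" * 32)),
}

for address, account in world_state.items():
    kind = "contract" if account.is_contract() else "externally-owned"
    print(address, kind, account.balance_in_ether())
```

Keeping the balance as an integer number of Wei avoids rounding errors; the conversion to Ether is only for printing.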
Conclusion
While there are many outstanding questions, I hope that this article provides a reasonable foundation for your learning. As previously mentioned, if anything is incorrect or worth adding to, I invite you to leave your input in the comments, so that this document can be improved. Since we’ve covered some high-level concepts on Ethereum’s fundamental entities, we’ll discuss how data is stored within the Ethereum network next.