Monero Part 2: How It Works

Understanding the Monero Blockchain

Philip Shen
Jul 17, 2018

Introduction

This article is a technical overview of how the Monero blockchain works. All things considered, Monero’s blockchain is pretty simple; this is because Monero’s complexity and novelty are in its privacy features, which are pretty much all application-level. If you’re familiar with Bitcoin, this should be a breeze.

This article incorporates the concepts I went over in Monero Part 1: Key Concepts.

Accounts

Key & Address Generation

Accounts in Monero have 2 keys: a spend key and a view key. As their names suggest, the spend key is used to spend funds of a Monero account while the view key is used to view funds of a Monero account. Here is how they are created:

  1. Create a random 256-bit private spend key b. This spend key is typically generated as an integer representation of a mnemonic seed (simply put, a random sequence of words). This is the key that will be used to generate all the other keys.
  2. Hash b to generate your private view key a. Monero uses Keccak-256 (pronounced “ketchack”) for hashing, which you may recognize from its use in Ethereum. The hash is reduced modulo the order of the base point so that a is a valid scalar.
  3. Generate public keys A and B from a and b through the equations A = aG and B = bG, where G is the ECC base point. For those of you who care, Monero uses EdDSA key pairs based on the Ed25519 curve, a type of twisted Edwards curve with the equation -x² + y² = 1 + dx²y².
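The three steps above can be sketched in plain Python. This is a minimal illustration, not wallet code: the helper names (`recover_x`, `add`, `mul`, `derive_keys`) are made up for this example, the curve arithmetic is slow affine math, and `hashlib.sha3_256` is only a stand-in for the hash, since Monero uses the original Keccak-256 padding, which the standard library does not ship. Real wallets will therefore produce a different view key from the same seed.

```python
import hashlib
import secrets

Q = 2**255 - 19                                        # field prime of Ed25519
L = 2**252 + 27742317777372353535851937790883648493    # order of the base point G
D = -121665 * pow(121666, Q - 2, Q) % Q                # curve constant d

def recover_x(y):
    """Solve -x^2 + y^2 = 1 + d*x^2*y^2 for x and return the even root."""
    x2 = (y * y - 1) * pow(D * y * y + 1, Q - 2, Q) % Q
    x = pow(x2, (Q + 3) // 8, Q)
    if (x * x - x2) % Q != 0:
        x = x * pow(2, (Q - 1) // 4, Q) % Q            # multiply by sqrt(-1)
    return x if x % 2 == 0 else Q - x

GY = 4 * pow(5, Q - 2, Q) % Q                          # base point has y = 4/5
G = (recover_x(GY), GY)

def add(p1, p2):
    """Twisted Edwards point addition in affine coordinates (a = -1)."""
    (x1, y1), (x2, y2) = p1, p2
    t = D * x1 * x2 * y1 * y2 % Q
    x3 = (x1 * y2 + y1 * x2) * pow(1 + t, Q - 2, Q) % Q
    y3 = (y1 * y2 + x1 * x2) * pow(1 - t, Q - 2, Q) % Q
    return (x3, y3)

def mul(k, point):
    """Double-and-add scalar multiplication; (0, 1) is the identity."""
    result = (0, 1)
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def hash32(data):
    # STAND-IN: Monero hashes with original Keccak-256, not SHA3-256.
    return hashlib.sha3_256(data).digest()

def derive_keys(seed=None):
    """Step 1: spend key b. Step 2: view key a = H(b). Step 3: A = aG, B = bG."""
    seed = seed or secrets.token_bytes(32)
    b = int.from_bytes(seed, "little") % L
    a = int.from_bytes(hash32(b.to_bytes(32, "little")), "little") % L
    return {"a": a, "A": mul(a, G), "b": b, "B": mul(b, G)}
```

Calling `derive_keys()` returns both key pairs; passing the same 32-byte seed twice gives the same keys, which is why the spend key alone is enough to restore an account.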

And that’s how you get your key pairs (a, A) and (b, B). To generate a Monero address (which is a stealth address) from (A, B), the following steps are taken:

  1. Obtain the Keccak-256 hash of (prefix + B + A), where prefix is just the single byte value 0x12 in standard Monero.
  2. Append the first 4 bytes of the result to (prefix + B + A) to serve as a checksum. This results in a 69-byte (1 + 32 + 32 + 4) sequence.
  3. Convert the result of step 2 to cnBase58, a variation of Base58 encoding that guarantees fixed-length results. In our case with 69 input bytes, the result will be 95 characters long. This is the Monero address.
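Here is a rough sketch of those three steps. Python’s standard library has no original Keccak-256, so the block carries a compact implementation of it; the function names (`keccak256`, `cn_base58`, `standard_address`) are chosen for this example, and the block-size table for cnBase58 is written from my understanding of the encoding, so treat it as an illustration rather than a reference implementation.

```python
MASK64 = (1 << 64) - 1
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ENCODED_SIZE = [0, 2, 3, 5, 6, 7, 9, 10, 11]   # characters used for a 0..8 byte block

def rol64(v, n):
    n %= 64
    return ((v << n) | (v >> (64 - n))) & MASK64

def keccak_f1600(lanes):
    """The 24-round Keccak permutation on a 5x5 array of 64-bit lanes."""
    r = 1
    for _ in range(24):
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]   # theta
        x, y = 1, 0
        cur = lanes[x][y]
        for t in range(24):                                                  # rho + pi
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], rol64(cur, (t + 1) * (t + 2) // 2)
        for y in range(5):                                                   # chi
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        for j in range(7):                                                   # iota
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def keccak256(data):
    """Original Keccak-256 (padding byte 0x01), as opposed to SHA3-256."""
    rate = 136
    padded = bytearray(data) + b"\x01" + bytes(-(len(data) + 1) % rate)
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))

def cn_base58(data):
    """Encode 8-byte blocks independently so the output length is fixed."""
    out = []
    for off in range(0, len(data), 8):
        block = data[off:off + 8]
        n = int.from_bytes(block, "big")
        chars = ""
        for _ in range(ENCODED_SIZE[len(block)]):
            n, rem = divmod(n, 58)
            chars = ALPHABET[rem] + chars
        out.append(chars)
    return "".join(out)

def standard_address(public_spend, public_view, prefix=0x12):
    body = bytes([prefix]) + public_spend + public_view      # step 1 input: 65 bytes
    checksum = keccak256(body)[:4]                           # step 2: 4-byte checksum
    return cn_base58(body + checksum)                        # step 3: 95 characters
```

`standard_address` expects the two public keys as 32-byte strings. Because the prefix byte sits at the front of the first 8-byte block, every address built with 0x12 starts with the character 4.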

An interesting property of Monero using stealth addresses is that there are no “accounts” on the blockchain; instead, there are many one-time public addresses and key images which the owners of private keys have access to. You can think of the blockchain like one of those Amazon lockers.

Integrated Addresses

In Monero, you also have the option to create an integrated address rather than the standard address. Integrated addresses are just a different way of representing a Monero address (after all, all you really need is a way to get A and B) and are different in the following ways:

  1. Different prefix byte (0x13 rather than 0x12 on the main network)
  2. They include an 8-byte payment id (PID) in addition to the public spend & view keys

The payment ID is just an ID used by exchanges or other Monero users that receive a bunch of transactions. It allows the receiver to determine who an incoming transaction came from so that, for example in the case of an exchange, they know which account to credit with funds.

Implementing payment IDs is simple & there are plenty of ways to do it; as long as (1) the PID is transmitted to the receiver and (2) the sender is still able to obtain A and B to send the funds, you’re in for a good time. What integrated addresses do is “integrate” the PID into the actual address to make things easy. You could also just find the PID somewhere else and send it manually. Whatever floats your boat.
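To make the layout concrete, here is a small sketch of what goes into an integrated address before it is checksummed and encoded. The function names are made up for this example, the prefix value 0x13 is the main-network value as I understand it, and the block-size table assumes the cnBase58 scheme used for standard addresses.

```python
ENCODED_SIZE = [0, 2, 3, 5, 6, 7, 9, 10, 11]   # cnBase58 characters for a 0..8 byte block

def encoded_length(n_bytes):
    """Number of cnBase58 characters needed for n_bytes of input."""
    full_blocks, remainder = divmod(n_bytes, 8)
    return full_blocks * 11 + ENCODED_SIZE[remainder]

def integrated_payload(public_spend, public_view, payment_id, prefix=0x13):
    """prefix + B + A + PID; the 4-byte checksum is appended afterwards."""
    if len(public_spend) != 32 or len(public_view) != 32 or len(payment_id) != 8:
        raise ValueError("expected 32-byte keys and an 8-byte payment ID")
    return bytes([prefix]) + public_spend + public_view + payment_id

def split_integrated_payload(payload):
    """What a sender's wallet recovers: both public keys plus the PID."""
    return {
        "prefix": payload[0],
        "B": payload[1:33],
        "A": payload[33:65],
        "payment_id": payload[65:73],
    }
```

With the checksum included, the payload grows from 69 to 77 bytes, so `encoded_length(77)` gives the length of an integrated address in characters, compared with `encoded_length(69)` for a standard one.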

Using the Spend/View Keys

Note: This section uses a lot of the math from what I wrote about stealth addresses in part 1. Re-read that or keep it open as a reference if necessary.

Now, the view key a can be used to view transactions received by a particular address. Referring to the math in the “stealth addresses” section, bearers of the view key can take the transaction public key R, compute H(a * R)G + B, and compare the result against a given transaction output P. If the two match, that is, if P = H(a * R)G + B, the output belongs to the owner of view key a.

The spend key b is required to actually spend the funds observed. Basically, because both a and b are required to produce a valid key image, only someone with both can spend funds at address P. Let’s look at the math:

Recall that the key image I is computed as such: I = x * H(P’), where P’ is the address of the transaction output being spent and x is the corresponding secret key for that transaction. Because x requires both a and b in order to be generated (via x = H(a * R) + b), producing a valid key image to spend the funds at P therefore requires the spend key b.
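The two equations from this section, P = H(a * R)G + B and x = H(a * R) + b, can be checked with a toy script. This is a sketch under loud assumptions: the curve code is slow affine arithmetic, the helper names are invented for the example, `hashlib.sha3_256` stands in for Monero’s Keccak-based hash-to-scalar, and the real protocol mixes extra data into that hash. The key image itself is left out because it needs Monero’s hash-to-point function, which is beyond a short example.

```python
import hashlib
import secrets

Q = 2**255 - 19                                        # field prime of Ed25519
L = 2**252 + 27742317777372353535851937790883648493    # order of the base point G
D = -121665 * pow(121666, Q - 2, Q) % Q                # curve constant d

def recover_x(y):
    x2 = (y * y - 1) * pow(D * y * y + 1, Q - 2, Q) % Q
    x = pow(x2, (Q + 3) // 8, Q)
    if (x * x - x2) % Q != 0:
        x = x * pow(2, (Q - 1) // 4, Q) % Q
    return x if x % 2 == 0 else Q - x

GY = 4 * pow(5, Q - 2, Q) % Q
G = (recover_x(GY), GY)

def add(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    t = D * x1 * x2 * y1 * y2 % Q
    return ((x1 * y2 + y1 * x2) * pow(1 + t, Q - 2, Q) % Q,
            (y1 * y2 + x1 * x2) * pow(1 - t, Q - 2, Q) % Q)

def mul(k, point):
    result = (0, 1)                                    # identity element
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def hash_to_scalar(point):
    # STAND-IN for H: hashes the compressed point with SHA3-256, not Keccak-256.
    x, y = point
    encoded = (y | ((x & 1) << 255)).to_bytes(32, "little")
    return int.from_bytes(hashlib.sha3_256(encoded).digest(), "little") % L

def send_to(A, B):
    """Sender side: pick random r, publish R = rG, pay to P = H(rA)G + B."""
    r = secrets.randbelow(L - 1) + 1
    return mul(r, G), add(mul(hash_to_scalar(mul(r, A)), G), B)

def is_mine(a, B, R, P_out):
    """View-key check: does P equal H(aR)G + B?"""
    return add(mul(hash_to_scalar(mul(a, R)), G), B) == P_out

def one_time_private_key(a, b, R):
    """Spend-key step: x = H(aR) + b, so that xG = P."""
    return (hash_to_scalar(mul(a, R)) + b) % L
```

The check works because rA and aR are the same point (both equal raG), so sender and receiver arrive at the same hash without ever sharing a secret. Someone holding only a and B can run `is_mine`, but computing x needs b as well.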

Transactions

Accounts are pretty confusing, especially without context. If you understood all that stuff about accounts, ring signatures, and stealth addresses, these transactions should be pretty simple; otherwise, hopefully this section will help clarify those concepts.

The first thing to understand is that Monero transaction outputs, stored at one-time-use public addresses that we call P, work in basically the same way as Bitcoin UTXOs: they are transaction outputs used in the inputs of other transactions, and it is easy to verify whether or not a transaction output has been spent. Additionally, just like in Bitcoin, the “change” created is sent back to the sender as another transaction output. I’ll borrow this image from my article on Bitcoin to illustrate this:

Transaction Rewards

Transaction rewards in Monero work in basically the same way as in Bitcoin, where the sender has the choice to increase the reward to make their transaction process faster.

Example

Now, let’s walk through what happens during a Monero transaction. Let’s say Derek is trying to send 10 XMR to Marshawn; to send the Monero, Derek will do the following:

  1. First, he uses Marshawn’s stealth address (A, B) to create a one-time public address P.
  2. He then collects some inputs from other one-time addresses P that he owns. This will be the money that he sends to Marshawn.
  3. For each of those transaction inputs, he collects n public keys (remember ring signatures?) P1…Pn from the blockchain. These will form his “ring” for the ring signature, ensuring that outside observers cannot determine which P the funds were actually sent from.
  4. For each input he uses, he also produces a key image I from that input’s one-time private key x, which only he can derive because it requires his private spend key. This ensures that Derek is not double spending anything.
  5. All Derek needs to do now is upload the transaction (uploading the input/output amounts in the form of Pedersen commitments, of course).

Simplified look at a Monero Transaction. Looking at this, you couldn’t tell it apart from Bitcoin.

Going by this diagram, the old transaction outputs P1, P2, and P3 have been “spent” — that is, their key images are known to have been used — and the new transaction outputs can be used by Marshawn or Derek. All that’s left for Marshawn to do now is:

  1. Scan the blockchain with his view key and discover that this new transaction output belongs to him
  2. Spend it with his spend key, in the same way that Derek did!
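The “spent” check mentioned above comes down to remembering key images. Here is a toy sketch of that bookkeeping; `ToyLedger` and its method are hypothetical names, key images are represented as opaque byte strings, and signature verification is skipped entirely.

```python
class ToyLedger:
    """Tracks which key images have already appeared in accepted transactions."""

    def __init__(self):
        self.spent_key_images = set()

    def submit(self, key_images):
        """Accept a transaction only if none of its key images was seen before."""
        unique = set(key_images)
        if len(unique) != len(key_images):
            return False                      # same input used twice in one transaction
        if unique & self.spent_key_images:
            return False                      # input already spent earlier
        self.spent_key_images |= unique
        return True


ledger = ToyLedger()
first = ledger.submit([b"image-1", b"image-2"])    # Derek's payment to Marshawn
second = ledger.submit([b"image-2"])               # an attempt to reuse one input
```

Since every output has exactly one key image, the network can reject the second transaction without learning which member of the ring was the real input.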

Blocks

Blocks are, just like transactions, essentially the same as Bitcoin (besides all the privacy stuff). They consist of 3 fields:

  1. The block header
  2. The base transaction body
  3. A list of transaction identifiers

Block Header

Here is what the block header looks like:

As you can see, there’s nothing really special here. The major_version specifies the Monero version (aka the “consensus version” or “hard fork number”). The minor_version is all but deprecated; all it does now is indicate the consensus version supported by the miner.

Base Transaction

The base transaction — which you may know better as the coinbase transaction or, as they’re sometimes called in Ring CT, the null transaction — is the reward given to the block miner. Here are the fields:

Again, all pretty straightforward. The complexity and novelty in Monero is in its privacy features, which are pretty much all application-level. The underlying blockchain is pretty generic.

Transaction Identifiers

Finally, for the sake of completeness, here is the structure of the transaction identifiers:

And just like Bitcoin, Monero uses Merkle roots to save space.

Transaction Fees

Transaction fees in Monero also work pretty much the same way as in Bitcoin. The only difference is that there is a fee calculator and a minimum fee, which varies based on the transaction size and overhead.

Transaction Scripts

Monero does not have scripting, unlike Bitcoin or Ethereum; instead the developers opted for a minimalistic, lightweight platform specifically for moving money around. You can read some more about it in section 6.3 of the whitepaper if you want.

Mining

Mining in Monero is much, much more dynamic than in Bitcoin. In Bitcoin, almost every value is static: mining difficulty is adjusted at regular intervals to maintain a constant time-to-mine, the block reward is kept constant until it is halved every few years, and the maximum block size cannot change without a fork. In Monero, these things are dynamic.

Note: if anyone’s counting, Monero blocks are appended to the blockchain around every 2 minutes.

Dynamic Block Sizes

One of the most commonly discussed issues with Bitcoin is its scalability. Because Bitcoin block sizes are hardcapped and the block-mining rate is kept constant, the rate of transactions added to the blockchain doesn’t change. Many are concerned that as the number of users inevitably increases, this hard cap will cause Bitcoin transactions to take forever to be uploaded, as the network will have a huge backlog of unvalidated transactions.

Monero’s solution to this issue is to allow the block size limit to fluctuate based on prior block sizes. For every new block to be mined, the median block size of the previous 100 blocks is calculated. The miner’s reward for the newly mined block will then vary quadratically based on how far the size of this block exceeds this median value: if the new block size is 10% larger, the reward will be 1% smaller; 50% larger, and the reward goes down by 25%; 100% larger, and the block reward goes down by 100%. In effect, this caps a block at twice the median size.
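The quadratic penalty is easy to check with a few lines of arithmetic. This is a simplified sketch: `penalized_reward` is a name made up for the example, it uses floating-point math, and it ignores details of a real node such as the minimum value applied to the median.

```python
from statistics import median

def penalized_reward(base_reward, block_size, recent_block_sizes):
    """Reduce the reward by (block_size / median - 1)^2 for oversized blocks."""
    m = median(recent_block_sizes)
    if block_size <= m:
        return float(base_reward)             # no penalty at or below the median
    if block_size > 2 * m:
        raise ValueError("block is more than twice the median size")
    overshoot = block_size / m - 1            # 0.10 for a block that is 10% larger
    return base_reward * (1 - overshoot ** 2)


recent = [100_000] * 100                      # last 100 blocks, all 100 kB
for size in (110_000, 150_000, 200_000):
    print(size, penalized_reward(10.0, size, recent))
```

A miner therefore only builds a bigger block when the extra fees are worth more than the reward given up, which is what lets the limit grow with real demand.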

Dynamic Block Reward

In addition to constantly fluctuating because of changing block sizes and transaction fees, the block rewards in Monero are also dynamic on their own. This is the equation for a block reward:

Which results in a graph that looks like this:

Got this from here

As you can see, the change in block reward is much smoother than Bitcoin’s halving. The reason for this is simple: when Bitcoin rewards halve, the response by users can be erratic, unpredictable, and generally chaotic. Not wanting this, the Monero creators decided to smooth out the decay.
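To see where the smoothness comes from, here is a small sketch of the emission rule as I understand it from the reference implementation: each block pays out a fixed fraction of the supply that has not been issued yet, with a small floor once that fraction gets tiny. The constants below (a shift of 19 for 2-minute blocks, a 0.6 XMR floor, 10^12 atomic units per XMR) are assumptions of this example and may not match the exact form of the equation shown in the original figure.

```python
MONEY_SUPPLY = 2**64 - 1           # total atomic units before the floor kicks in (assumed)
EMISSION_SPEED_FACTOR = 19         # assumed value for a 2-minute block target
TAIL_REWARD = 600_000_000_000      # assumed floor: 0.6 XMR in atomic units
ATOMIC_UNITS_PER_XMR = 10**12

def base_reward(already_generated):
    """Reward (in atomic units) for the next block, before any size penalty."""
    remaining = MONEY_SUPPLY - already_generated
    return max(remaining >> EMISSION_SPEED_FACTOR, TAIL_REWARD)

def simulate(blocks):
    """Mine `blocks` blocks from an empty supply and return the reward history."""
    generated, rewards = 0, []
    for _ in range(blocks):
        reward = base_reward(generated)
        rewards.append(reward)
        generated += reward
    return rewards
```

Every block shrinks the remaining supply a little, so the next reward is a little smaller, and there is never a cliff like a halving. `simulate(1000)` shows the reward dropping by a tiny amount with each block.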

If you follow the NBA, a helpful analogy for this is cap smoothing. You don’t want every block reward-halving ending up like the 2016 offseason.

CryptoNight

Monero is mined with the CryptoNight proof-of-work algorithm. It is made for running on CPUs, which makes it ideal for Monero’s goal of achieving maximum decentralization & parity.

I’m not really going to get into it at a low level. Here’s a helpful Quora post if you want to learn more.

Consensus

Monero achieves consensus the same way as Bitcoin: orphans are abandoned and the longest chain wins.

Summary

To summarize, aside from the way accounts work, the Monero blockchain is essentially the same as Bitcoin’s. Where it does differ from Bitcoin is in the areas where the developers specifically wanted to improve upon Bitcoin, which was, after all, the motivation for Monero in the first place. I’m going to go over those motivations now.

Scalability

Because Bitcoin block sizes are hardcapped and the block-mining rate is kept constant, the rate of transactions added to the blockchain doesn’t change. Many are concerned that as the number of users inevitably increases, this hard cap will cause Bitcoin transactions to take forever to be uploaded, as the network will have a huge backlog of unvalidated transactions. Monero resolves this by making block sizes dynamic.

Anonymity

Standard Bitcoin transactions are all but transparent, with the sender, receiver, and amount fully exposed. Besides making Bitcoin unsafe & exposing your financial activity to everybody, this also causes Bitcoin to not be fungible — that is, one Bitcoin may not be worth the same amount as another Bitcoin if one of them can be linked to known criminal accounts.

Monero resolves this by using ring signatures for untraceability (hiding the sender), stealth addresses for unlinkability (hiding the receiver), and RingCT/Pedersen commitments for hiding transaction amounts.

Proof-of-Work Parity

Because Bitcoin’s proof-of-work algorithm is easily parallelizable, not-so-approachable specialized mining hardware (ASICs) has come to dominate the Bitcoin mining scene and make its decentralization less than ideal. Here’s a picture:

Taken from here

Monero’s solution to this issue is to make mining much more accessible through an ASIC-resistant and CPU-friendly proof-of-work algorithm called CryptoNight. After all, everybody’s got a CPU.

Inefficient Scripting

The whitepaper also notes that Bitcoin scripts are very space-inefficient. For example, the very frequently used pay-to-public-key-hash script OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG, which only verifies that the correct account is claiming a UTXO, takes up 164 bytes. Monero’s scripting is smaller and a little more minimalistic.

Conclusion to the Conclusion

And that does it for my article on Monero. I might write a part 3 on its history or something. Hope you learned something.
