Writing an Ethereum Wallet for Fun

Up-front Expectation Management: this article is not meant as a coding tutorial, but rather as an exploration of wallet fundamentals. I will not be pasting a lot of code into this article, but I have uploaded the finished project to GitHub, and link to it at the end of the article.

I use Jaxx as my mobile Ethereum wallet, and I noticed it occasionally acting oddly.

It would show my balance correctly, but occasionally the transaction history would have missing transactions. Other times it would show duplicate transactions. Sometimes it would show the right transactions, but the overall balance was wrong. However, if I tried to send a transaction, it would suddenly know my real balance.

While I don’t have huge Ethereum reserves, it is still disconcerting when anything storing value appears to be unreliable.

I should mention that despite the user interface issues, Jaxx has been otherwise very reliable.

Let's take a peek behind the scenes, and try to write our own wallet. Can we make a wallet without such user-interface inconsistencies?

Prerequisites

I assume you know a bit about Ethereum (or cryptocurrencies in general). Specifically:

Asymmetric Encryption (private/public keys)

You should know what private/public keys are.

Types of Ethereum Accounts

Ethereum supports (at the time of writing) two types of accounts - externally owned, and contract accounts.

We will ignore contract accounts - smart contracts don't have, and don't need, wallets. The type of account every human Ethereum user has is the externally owned account.

Each externally owned account has a corresponding private key somewhere, with which the account's transactions are signed. An Ethereum wallet - at a fundamental level - is basically a user interface for using a secp256k1 private key to sign transactions and submit them to the network.

While you need a private key to send a transaction, not every account necessarily has a private key somewhere. It is possible to send ether to an account for which nobody has the private key. These funds will become unusable.

Transaction Types (Value Transfer vs Function Calls)

Ethereum supports a type of application called a smart contract. Each smart contract can have one or more functions, which accounts can call, and which perform some type of work.

Each smart contract function can have input parameters, and return a value, or emit events.

It gets complicated, and most of this functionality is not relevant for people who just want to hold and transfer ether. So we will focus on the simplest type of transaction - just transferring some amount of ether from person A to person B, and paying the fee.

This is the minimum functionality a wallet should support.

A more advanced wallet might support calling contract methods, specifying input data, manipulating results, etc. We will ignore this for now.

How Transaction Fees are Calculated

Creating an Ethereum transaction is not free. However, there is no fixed price either. Instead, there is a market-based mechanism for setting the price.

Every action on the Ethereum network consumes an intermediary resource called gas. Two identical transactions (under identical circumstances) will consume an identical amount of gas. For example, a pure value transfer between two externally owned accounts consumes 21000 gas units.

To determine the priority in which Ethereum miners will process these transactions, they will look at the fees each transaction sender is willing to pay, and process the higher paying transaction first.

Fees are determined by the transaction sender specifying a price for each gas unit (in ethers), and the maximum amount of gas that they are willing to pay for the transaction.

The transaction fee is thus gas amount times gas price.

To summarize:

  • every transaction consumes an intermediary gas value;
  • every transaction sender specifies a) how much gas they are willing to spend, and b) how much they are willing to pay per gas unit;
  • miners will prioritize transactions with the highest fees.

For a more detailed explanation see “What is Gas?”.

Let's Create a Wallet

Let's create a simple wallet to get a better grasp of how they operate. I'll be using .NET:

$ dotnet new console -o wallet
$ cd wallet

What now? What should be the first thing we implement?

First Things First

The core of the wallet is a private key. Ethereum uses elliptic-curve cryptography (ECC).

What are private keys like in ECC? Let's consult the standard SEC-1.

… an elliptic curve key pair (d, Q) [...] consists of an elliptic curve secret key d which is an integer in the interval [1, n − 1], and an elliptic curve public key Q = (xQ, yQ) which is the point Q = dG

So a keypair consists of two values - d and Q. d is the private key, which is simply a very large integer, and Q is the public key, which consists of two values.

However, it is apparently not enough to just have a very large integer for a private key. It must be in a specific range. But what is n? And what is G, which relates d to Q?

There are two interesting takeaways here:

  • the way ECC keys are interpreted and used depends on *domain parameters*, which must be agreed upon by both sides in a cryptographic transaction. If Alice signs a document with her private key using one set of domain parameters, and Bob tries to verify the signature using Alice's public key but using a different set of domain parameters, then the verification will fail.
  • the values n and G are part of the domain parameters, which are known. This means that if you have the value d, then you can always derive Q. You never need to store the public key.

Ethereum uses the elliptic curve called secp256k1. Domain parameters for this curve can be found in SEC-2. For example, n is:

FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141

This means that these are invalid private keys (they fall outside of [1, n − 1]):

FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364142
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

But these are valid private keys (they fall within [1, n − 1]):

FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364140
12121234 12121212 12121212 12121212 12121212 12121212 12121212 12121212
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001

Many other cryptocurrencies use secp256k1 keys. You could add e.g. Bitcoin support to your wallet, without having to generate different keys.
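The range rule is easy to check by hand. Here is a minimal standalone Python sketch (not the article's .NET code) that tests whether a hex string is a valid secp256k1 private key. The function name is made up for illustration, and the constant is the order n as published in SEC-2.

```python
# Order n of the secp256k1 base point G, as published in SEC-2.
N = int("FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141", 16)


def is_valid_private_key(hex_key: str) -> bool:
    """Return True if the hex string encodes an integer d with 1 <= d <= n - 1."""
    d = int(hex_key.replace(" ", ""), 16)
    return 1 <= d <= N - 1


# Hypothetical helper usage: zero and anything >= n are rejected.
print(is_valid_private_key("00" * 32))          # zero
print(is_valid_private_key("FF" * 32))          # larger than n
print(is_valid_private_key("00" * 31 + "01"))   # smallest valid key
```

A check like this only validates the range. It says nothing about how the key was generated, which is where the real security of a wallet comes from.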

Let's check if this works. A good library for working with Ethereum in .NET is Nethereum, so let's add this dependency (as well as its own dependency — BouncyCastle):

$ dotnet add package Nethereum.Portable
$ dotnet add package Portable.BouncyCastle
$ dotnet restore

Now let's see what happens if we try to load an invalid Ethereum key for use with Nethereum:

Let's run it:

Hmm. It worked…

I expected one of these methods to throw an exception, but they didn't. I suspect the math may work regardless of the d size, and nobody has noticed that the specification stipulates a range.

Interestingly, the private key output had added an extra "00" at the beginning of the hex string.

Let's see what happens with the key 0x0:

Exception when calculating the public key. Good!

I guess the math doesn't work so well in this case.

Now that we know a bit more about what the key consists of, we can delegate the actual generation to Nethereum, and move on with creating the wallet:

Sidenote: What is an Ethereum Address?

You may have noticed that we've mentioned only private and public keys so far. However, when talking about Ethereum transactions, nobody mentions public keys - transactions are sent to addresses.

So what is an address?

This is mentioned in the Ethereum yellow paper:

For a given private key [...] the Ethereum address [...] (a 160-bit value) to which it corresponds is defined as the right most 160-bits of the Keccak hash of the corresponding ECDSA public key

In other words:

  1. hash = keccak-256(Q) # Hash the public key using Keccak256
  2. address = hash[12:] # Skip the first 12 bytes and take the remaining 20

An Ethereum address is not synonymous with the corresponding ECC public key. However, for all intents and purposes it functions as a public key in the context of Ethereum transactions.
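The "take the rightmost 160 bits" step can be sketched in a few lines of standalone Python (not the article's .NET code). The hashing step itself is left out on purpose: Python's hashlib.sha3_256 implements the standardized SHA-3, which does not produce the same output as the Keccak-256 variant Ethereum uses, so the digest is assumed to come from a separate Keccak-256 implementation. The function name and the placeholder digest are made up for illustration.

```python
def address_from_digest(keccak_digest: bytes) -> str:
    """Keep the rightmost 20 bytes (160 bits) of a 32-byte Keccak-256 digest."""
    if len(keccak_digest) != 32:
        raise ValueError("expected a 32-byte digest")
    return "0x" + keccak_digest[12:].hex()


# Placeholder digest (bytes 0x00..0x1f), NOT the hash of any real public key.
digest = bytes(range(32))
print(address_from_digest(digest))
```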

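As for the rationale of using addresses - it is very likely done to save space. A 20-byte address is 12 bytes shorter than the full 32-byte hash, and 44 bytes shorter than the 64-byte public key (the x and y coordinates). Those bytes add up when you have a global network with tens of thousands of nodes, and a global community of people sending transactions.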

Here is an interesting Stack Overflow response on the topic.

How Will the Wallet Communicate with the Network?

The private key is used to sign some data, and then push this to the network. To communicate with the network, we need access to an Ethereum node.

We're not going to write a sophisticated wallet, so let's use a public Ethereum node, which exposes the Ethereum API RPC calls.

Infura offers such a service. Its Rinkeby endpoint is https://rinkeby.infura.io:443.

Rinkeby is an Ethereum test environment, which emulates the main Ethereum network, but has subtle differences. One of them is that ether is free on Rinkeby.

Every time we need to communicate with the Ethereum network, we will be sending JSON-RPC calls to this endpoint.

Here is an overview of the methods available via the API: Ethereum JSON RPC API.
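To give an idea of what goes over the wire, here is a standalone Python sketch (not the article's .NET code, which uses Nethereum for this) that builds a JSON-RPC 2.0 request body for eth_getBalance and decodes a hex-encoded result. The helper names and the address are made up for illustration, and nothing is actually sent to a node.

```python
import json


def rpc_request(method: str, params: list, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def parse_quantity(result: str) -> int:
    """Decode a hex-encoded quantity such as "0x16345785d8a0000"."""
    return int(result, 16)


# Hypothetical address, for illustration only.
address = "0x" + "ab" * 20
body = rpc_request("eth_getBalance", [address, "latest"])
print(body)

# A node would answer with something like {"jsonrpc": "2.0", "id": 1, "result": "0x..."}.
print(parse_quantity("0x16345785d8a0000"))
```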

Connecting to the Network

Now that we have generated a key for our wallet, we are ready to start receiving transactions. We need two things:

  • some ether in another account that we could send over to our freshly generated key and
  • to connect to the network, and validate the balance.

Acquiring Some Test Ether on Rinkeby

Since you don't need to pay for ether on Rinkeby, you can go to the Rinkeby Faucet and request some.

I created a test account on Rinkeby using MetaMask, and requested some ether using the faucet and a Google Plus post.

Connecting to the Node via Code

Nethereum exposes the Ethereum API via its Web3 class:

Receive Ether and Verify the Balance

Note: once you have generated an account in code, note the latest Rinkeby block number before you send any ether to it. This will come in handy soon.

When you receive your free ether to the MetaMask account, send a small transaction to the address generated by the wallet. Let's say - 0.1 ethers.

Let's add some code to display our balance:

We should see

Balance: 100000000000000000

I.e. the amount we sent earlier. You may have two questions:

  • what does the block parameter mean?
  • we sent 0.1 ETH. Why is the displayed balance so huge?

The Block Parameter

Each address has a history of values associated with it. At the beginning of the blockchain - the genesis block - your address likely had a balance of 0 (unless it was included in the genesis block). As new blocks are added, the balance can increase and decrease.

While a wallet normally cares only about the latest balance, the API supports querying the balance at any given moment in history. So in this case we specify that we want the balance as of the latest known block.

If the node you are connected to has not fully synced with the rest of the network, you may get an incorrect value. This is because "latest" is a relative term. If your node has only synced 5 blocks, then "latest" will be 5, whereas the longest chain might have millions of blocks already.

The Large Balance Value and Denominations of Ether

As for the huge balance value — it is correct, and just how it should be. It just is not given in terms of ether, but rather its smallest denomination - wei.

Programmatically, every API call expects and returns wei-denominated values.

There are multiple Ethereum denominations. Check out the Ethereum Unit Converter for a list of them all.

The most common ones in my experience are wei, gwei, and ether.

  • wei is the smallest denomination - 1 wei is equal to 1E-18 ethers, because Ethereum has a precision of 18 decimal places;
  • gwei is halfway between wei and ether. 1 gwei is equal to 1E-9 ethers.

So if we sent 0.1 ethers to our wallet, then we should have:

0.1 eth => 0.1 * 1E18 wei => 1E17 wei => 100000000000000000

Since a wallet is supposed to be user-facing, and wei is not a human-friendly denomination, we should convert this value:

Now we should see:

Balance: 0.1 ETH
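The conversion itself is plain decimal arithmetic. A standalone Python sketch (not the article's .NET code; the helper names are made up) using Decimal to avoid floating-point rounding:

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def wei_to_ether(wei: int) -> Decimal:
    """Convert an integer wei amount to ether without floating-point error."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


def ether_to_wei(ether: str) -> int:
    """Convert a decimal ether amount (given as a string) to integer wei."""
    return int(Decimal(ether) * WEI_PER_ETHER)


print(f"Balance: {wei_to_ether(100000000000000000)} ETH")
print(ether_to_wei("0.1"))
```

Keeping amounts as integers in wei internally, and converting only for display, avoids rounding surprises.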

So far so good. Let's add transaction history!

Listing Transactions

It is easy to get the transaction history for our address.

Just call the RPC method eth_getTr... just kidding!

Actually, there is no easy way to retrieve a list of transactions from/to a given address.

So what can we do? We must send an RPC request to load a batch of blocks, and scan every included transaction. Just call the RPC method eth_getBlockBat... just kidding!

There is no way to retrieve information about multiple blocks in one request.

So what we actually have to do is:

  • keep a pointer to the last processed block (e.g. starting at 0);
  • call eth_getBlockTransactionCountByNumber with our block index;
  • iterate over all transactions in the block using eth_getTransactionByBlockNumberAndIndex;
  • for each transaction, check its from and to fields. If either matches our address, then store information about this transaction (at the very least the transaction hash, so we can always look up the other information);
  • after processing the current block, increment the block index pointer;
  • repeat until you reach the latest blockchain block, whenever you need up-to-date information.
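The steps above can be sketched as a loop. This is a standalone Python sketch (not the article's .NET code): the two callables stand in for eth_getBlockTransactionCountByNumber and eth_getTransactionByBlockNumberAndIndex, and the tiny in-memory "chain" is made-up data so the loop can run without a node.

```python
from typing import Callable, List, Tuple


def scan_blocks(
    address: str,
    start_block: int,
    latest_block: int,
    tx_count: Callable[[int], int],
    get_tx: Callable[[int, int], dict],
) -> Tuple[List[str], int]:
    """Scan start_block..latest_block and collect hashes of relevant transactions.

    Returns the hashes found and the next block index to process.
    """
    address = address.lower()
    found = []
    block = start_block
    while block <= latest_block:
        for index in range(tx_count(block)):
            tx = get_tx(block, index)
            # "to" is null for contract-creation transactions.
            sender = (tx.get("from") or "").lower()
            recipient = (tx.get("to") or "").lower()
            if address in (sender, recipient):
                found.append(tx["hash"])
        block += 1
    return found, block


# Made-up data standing in for a node.
ME = "0x" + "11" * 20
OTHER = "0x" + "22" * 20
CHAIN = {
    0: [],
    1: [{"hash": "0xaa", "from": OTHER, "to": ME}],
    2: [
        {"hash": "0xbb", "from": OTHER, "to": OTHER},
        {"hash": "0xcc", "from": ME, "to": None},
    ],
}

hashes, next_block = scan_blocks(
    ME, 0, 2, lambda n: len(CHAIN[n]), lambda n, i: CHAIN[n][i]
)
print(hashes, next_block)
```

Persisting the returned block pointer between runs is what lets the wallet resume where it left off instead of rescanning from the start.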

As you can see, this is quite a few RPC calls per block. And there are a lot of blocks to process. Every 15 seconds or so (at the time of writing) a new block appears, meaning that when you stop continuously monitoring for transactions, your wallet immediately becomes out of date.

This is a major source of potential discrepancies between the balance (which we can get with a single RPC call to eth_getBalance) and the sum of all known transactions (which we must manually extract by traversing every block).

Just for fun, here is a scenario for maximum confusion:

  • suppose a blockchain has 100 blocks;
  • you connect to a node that knows about 50 blocks because it has not finished synchronizing with the rest of the network;
  • every 10 blocks you have received a transaction with 1 ether - at blocks 10, 20, 30, 40, 50, 60, 70, 80, 90, 100;
  • you request your balance for the "latest" block, but have finished iterating only through 25 of the 50 blocks your node knows about.

What happens?

  • You know you should have 10 ether in your account;
  • Your wallet shows you have 5 ether in your account;
  • The sum of your transaction history only accounts for 2 ether.

We cannot avoid the discrepancies if the node has not finished synchronizing. However, we can avoid part of the confusion by requesting the balance as of the last block we have scanned, instead of the “latest” known block. This way the balance and transaction list will be in sync.
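The scenario is small enough to simulate. A standalone Python sketch (not the article's .NET code), with the block heights and transfers taken from the example above and all names made up:

```python
# Incoming transfers of 1 ether at blocks 10, 20, ..., 100.
INCOMING = {block: 1 for block in range(10, 101, 10)}


def balance_at(block: int) -> int:
    """Balance in ether as of the given block."""
    return sum(value for b, value in INCOMING.items() if b <= block)


chain_head = 100    # what the whole network knows
node_head = 50      # what our node calls "latest"
last_scanned = 25   # how far the wallet has scanned

history = [value for b, value in INCOMING.items() if b <= last_scanned]

print("actual balance:       ", balance_at(chain_head))
print("balance at 'latest':  ", balance_at(node_head))
print("sum of known history: ", sum(history))
# Asking for the balance at the last scanned block keeps the two figures in agreement.
print("balance at last scan: ", balance_at(last_scanned))
```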

In Nethereum, the sync loop looks something like this:

Optimization note: we can save time by starting the scan at the block that was the latest when the wallet was created, instead of at block 0. There is a possibility that we might miss accidental transfers, which happened before we actually generated the private key, but this is astronomically unlikely.

This is why - before sending your first transaction - you had to note the latest block number: so that we would know where to start looking through blocks for relevant transactions.

Sending Transactions

The final piece of the puzzle is sending transactions. There are a couple of things we need to account for, in order to send a transaction. We will need to:

  • know the recipient address;
  • know the amount to send the recipient;
  • know the amount of gas we are willing to spend on this transaction;
  • know the amount we are willing to pay for each gas unit;
  • know the transaction nonce;
  • decide on the correct API calls (transaction vs raw transaction).

Let's unpack these.

The Straightforward Values

  • recipient - this is pretty straightforward - a 20 byte value in hex notation;
  • amount to send - you need to know how much to send. Keep in mind that you can not send more than you have in your account at the time of "latest" block. However - you CAN send a transaction through a node that thinks your balance is 0, if your actual balance is sufficient. The node will put the transaction in the pending transaction pool (despite thinking you can not afford it), and the miners will know the correct balance.

Gas Fees

Each transaction needs to be paid for using gas, and gas needs to be paid for using ether.

Remember: gas is not ether.

We could ask the user for both the gas allowance, and the price they are willing to spend on it. However, since our wallet will not support smart contract functions (where the gas requirements can vary wildly), we can simply set the gas allowance to 21000.

We will need to ask the user how much ether they are willing to spend per gas unit, however. This will be multiplied by the actual gas expenditure and deducted from the sender's account. This fee goes on top of the amount that is being sent.

Here is an example calculation of the total required balance when sending a transaction:

  • suppose you are sending 1 ether;
  • suppose you are willing to spend up to 21000 gas;
  • suppose you are willing to spend 25 gwei per gas;
  • this means the maximum fee will be 21000 gas multiplied by 25 gwei/gas;
  • this is the same as 21000 multiplied by 25E-9 ether, or
  • 0.000525 ether, or
  • roughly 30 cents (at a market price of $600/ether);
  • since this fee is added on TOP of the value being sent, then the minimum balance in your account must be 1.000525 ether. Any less, and you will not be able to send this transaction.
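The same calculation in integer wei, as a standalone Python sketch (not the article's .NET code; the helper names are made up):

```python
GWEI = 10**9
ETHER = 10**18


def max_fee_wei(gas_limit: int, gas_price_wei: int) -> int:
    """Upper bound on the fee: gas allowance times price per gas unit."""
    return gas_limit * gas_price_wei


def required_balance_wei(value_wei: int, gas_limit: int, gas_price_wei: int) -> int:
    """Minimum balance needed: the value being sent plus the maximum fee."""
    return value_wei + max_fee_wei(gas_limit, gas_price_wei)


fee = max_fee_wei(21000, 25 * GWEI)
total = required_balance_wei(1 * ETHER, 21000, 25 * GWEI)
print(fee)    # 525000000000000 wei, i.e. 0.000525 ether
print(total)  # 1000525000000000000 wei, i.e. 1.000525 ether
```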

What is the Nonce?

The nonce is a number that increases with every transaction. This prevents your transaction from being processed multiple times by malicious actors, or wallet bugs. Each new transaction must have a nonce that is larger than the previous transaction's nonce by 1 (starting with 0, if there are no previous transactions).

Use eth_getTransactionCount to calculate the nonce for a new transaction.

You need to specify either “latest” or “pending” for the block parameter. These choices have different effects:

  • “latest” — the function will count all successfully processed transactions. However, what happens if you have a pending transaction out there? The new transaction will have the same nonce as the pending transaction. Only one of these transactions will succeed (likely the one with the largest gas price);
  • “pending” — the function will count all successfully processed and pending transactions. Keep in mind, though, that the node you connect to will not necessarily know about your other pending transactions (it likely will, but this is not certain).

What will happen if you have a pending transaction in the transaction queue, and submit a new transaction with a higher nonce (by using the “pending” parameter to eth_getTransactionCount) and a higher gas price?

I previously mentioned that miners will always prioritize the transaction with the highest fee. In this case it will not happen — because it would violate Ethereum rules. The newest transaction, with the highest gas price, would NOT get mined first because transaction nonces must be larger than the previous nonces by exactly 1. The latest transaction will stick around for a while, and if the cheaper pending transaction succeeds, it will eventually be processed.
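A toy model makes the ordering rule concrete. This standalone Python sketch (not the article's .NET code) is a simplification: the function names, field names and pool representation are all made up, and real nodes track considerably more state.

```python
def next_nonce(confirmed: int, pending: int, block_param: str) -> int:
    """Mimic the count eth_getTransactionCount would report for "latest" vs "pending"."""
    if block_param == "latest":
        return confirmed
    if block_param == "pending":
        return confirmed + pending
    raise ValueError(block_param)


def minable_order(account_nonce: int, pool: list) -> list:
    """Return pooled transactions in the order they could be mined.

    A transaction is eligible only when its nonce equals the account's next
    nonce, so a gap blocks everything behind it regardless of gas price.
    """
    by_nonce = {tx["nonce"]: tx for tx in pool}
    ordered = []
    while account_nonce in by_nonce:
        ordered.append(by_nonce[account_nonce])
        account_nonce += 1
    return ordered


# Three confirmed transactions (nonces 0..2), one cheap pending one, one pricey new one.
cheap = {"hash": "0xcheap", "nonce": next_nonce(3, 0, "latest"), "gas_price_gwei": 1}
pricey = {"hash": "0xpricey", "nonce": next_nonce(3, 1, "pending"), "gas_price_gwei": 50}

print([tx["hash"] for tx in minable_order(3, [pricey, cheap])])
print([tx["hash"] for tx in minable_order(3, [pricey])])
```

In the second call the pricey transaction cannot be mined at all, because the transaction with nonce 3 is missing from the pool.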

Rescuing Stuck Transactions

A common use case for setting the nonce manually is when you submit a transaction, and it gets stuck.

For example, when the game CryptoKitties came out, transaction prices increased a lot. Transactions with standard gas prices would not get picked up by miners, as they were busy processing higher-paid transactions.

So in order to fix a stuck transaction with a low gas price, you could submit the same transaction again (with the same nonce), EXCEPT this time with a higher gas price. It would be processed, and the original, cheaper transaction would eventually shimmer out of existence, as no miners would include it in any blocks due to the now-invalid nonce.
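As a rough illustration, here is a standalone Python sketch (not the article's .NET code) of the "same nonce, higher price wins" idea. It is a deliberately simplified model with made-up names and data. Real clients apply their own replacement policies, which this does not attempt to reproduce.

```python
def best_per_nonce(pool: list) -> dict:
    """For each nonce, keep only the transaction offering the highest gas price."""
    best = {}
    for tx in pool:
        current = best.get(tx["nonce"])
        if current is None or tx["gas_price_gwei"] > current["gas_price_gwei"]:
            best[tx["nonce"]] = tx
    return best


# Made-up example: a stuck transaction and its higher-priced replacement.
stuck = {"hash": "0xstuck", "nonce": 7, "gas_price_gwei": 1}
rescue = {"hash": "0xrescue", "nonce": 7, "gas_price_gwei": 40}

winner = best_per_nonce([stuck, rescue])[7]
print(winner["hash"])
```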

Raw vs Regular Transactions

Finally, after determining all these other parameters, you must figure out how you are going to sign and send your transaction.

There are two RPC methods for sending transactions:

  • eth_sendTransaction;
  • eth_sendRawTransaction.

Every transaction in Ethereum must be signed by the account's corresponding private key. These methods differ in how the transaction is signed.

In case of the former method, the private key resides on the Ethereum node that is exposing the RPC API. We request a transaction from a specific account, and if the account is unlocked on the node, the node will create and sign the transaction for us.

This is a good use case for a node which is not publicly accessible. If you are creating e.g. a web app, which communicates with Ethereum, then instead of the application having to manage private keys, it just has to know which node to connect to, and which account to send from.

However, this obviously does not work well with public nodes. Public nodes allow calling APIs but they do not generate or manage private keys for users. This is where the latter RPC method comes in - it allows a transaction to be prepared outside an Ethereum node, and to be signed by any valid private key.

Which of these methods will work with a public node, and which one will not?

The eth_sendTransaction method will fail on public nodes, because exposing unlocked accounts to the public is useless. You could not hold any ether balance there, as it would simply be stolen. Infura nodes will return the error 405 Method Not Allowed.
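The difference shows up in the request bodies. A standalone Python sketch (not the article's .NET code) that only builds the two JSON-RPC payloads: the addresses and the "signed" hex string are placeholders rather than real data, and the signing step is left out entirely.

```python
def send_transaction_request(sender, recipient, value_wei, gas, gas_price_wei):
    """eth_sendTransaction: the node holds the key and signs on our behalf."""
    tx = {
        "from": sender,
        "to": recipient,
        "value": hex(value_wei),
        "gas": hex(gas),
        "gasPrice": hex(gas_price_wei),
    }
    return {"jsonrpc": "2.0", "id": 1, "method": "eth_sendTransaction", "params": [tx]}


def send_raw_transaction_request(signed_tx_hex):
    """eth_sendRawTransaction: we sign locally and submit only the signed bytes."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_sendRawTransaction",
        "params": [signed_tx_hex],
    }


# Placeholder values, for illustration only.
sender = "0x" + "11" * 20
recipient = "0x" + "22" * 20

regular = send_transaction_request(sender, recipient, 10**17, 21000, 25 * 10**9)
raw = send_raw_transaction_request("0xf86c")  # truncated placeholder, not a valid transaction

print(regular["params"][0])
print(raw["params"])
```

Note that the raw request carries no "from" field at all: the sender is recovered from the signature.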

A very important use case for offline transactions is to allow air-gapped Ethereum wallets. A wallet can be installed on a permanently offline computer, which is used to prepare transactions. These transactions can then be transferred to an online computer (e.g. via USB), and submitted as-is. This greatly reduces the risk of malware stealing your private keys.

See Using MyEtherWallet For Cold Storage for a practical example of setting up such an air-gapped environment.

Since we are using a public node, we will need to prepare a raw (offline) transaction:

And voila! We can query our balance, scan the network (continuously) for transactions, and send transactions. The hard (and fun) part now is to make the wallet safe and user-friendly.

These parts are, unfortunately, out of scope for this article.

Issues When Running on Mobile Devices

Circling back a bit to Jaxx now. By this point it seems clear why they would sometimes show the wrong balance, or show an incorrect transaction history.

There are many moving parts involved here, and it is certainly understandable when something goes amiss. In the case of Jaxx, they have even more moving parts - they have their own web backend, which must communicate with Ethereum nodes on your behalf.

So there are plenty of places where small errors can creep in.

And it makes sense that mobile apps do not communicate with Ethereum nodes directly - after all, to show valid transaction information, it is necessary to continuously monitor the network. This is impractical on Android, and on iOS, since there are no APIs for continuous background processing, downright impossible.

Or is it?

Is an iOS Ethereum Wallet With no Dependencies Possible?

New blocks on the main Ethereum network are generated (at the time of writing) at the rate of ~15 sec/block.

This is similar on Rinkeby, where it is exactly 15 sec/block, so I assume that tests against Rinkeby would play out similarly against the main network nodes.

Using a naive wallet implementation against the Infura Rinkeby node, I needed around 80 seconds of synchronisation time (in the wallet) per 1 hour's worth of new blocks.

While iOS does not permit continuous background processing, it offers a background fetch API for the periodic download of new content. Exactly what we need.

Playing around with background fetch, it appears an application can be woken up multiple times every hour (during good conditions). Each time it is woken up, the application is given 10 to 30 seconds to perform its work.

During this time we could initiate a background task, and get up to 3 minutes of processing time. If we could start a background task once every 2 hours, it should be enough to keep us more or less up to date.
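The back-of-the-envelope numbers behind this estimate, as a standalone Python sketch. It only uses the figures quoted above (15 sec/block, about 80 seconds of sync per hour of blocks, up to 3 minutes per background task), and the variable and function names are made up.

```python
SECONDS_PER_BLOCK = 15
SYNC_SECONDS_PER_HOUR_OF_BLOCKS = 80   # measured figure quoted above
BACKGROUND_TASK_SECONDS = 180          # "up to 3 minutes"

blocks_per_hour = 3600 // SECONDS_PER_BLOCK
sync_seconds_per_block = SYNC_SECONDS_PER_HOUR_OF_BLOCKS / blocks_per_hour
hours_covered_per_task = BACKGROUND_TASK_SECONDS / SYNC_SECONDS_PER_HOUR_OF_BLOCKS


def sync_seconds_needed(hours_behind: float) -> float:
    """Estimated time to catch up after being offline for the given number of hours."""
    return hours_behind * SYNC_SECONDS_PER_HOUR_OF_BLOCKS


print(blocks_per_hour)                   # blocks produced per hour
print(round(sync_seconds_per_block, 2))  # seconds of sync work per block
print(hours_covered_per_task)            # hours of blocks one background task can cover
print(sync_seconds_needed(2))            # seconds needed after 2 hours offline
```

Under these assumptions a single background task covers a little over two hours of blocks, which is where the "once every 2 hours" figure comes from.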

This is, of course, not deterministic. If iOS decides it's a bad time for doing background processing, then your app will not get a chance to synchronize in the background, and will fall further and further behind.

However, even if background fetch can not guarantee that our wallet will always be fully synchronized, it seems like it can reduce the amount of time we need to wait for synchronization, when we open it and want to use it.

So we can not rule out that this could be doable. Though — to see if it's feasible — we would have to actually write a wallet for iOS, and give this approach a try.

I haven’t actually tried monitoring blocks on iOS using the background fetch and background task APIs. There could be issues with this approach I haven't considered. Likewise, there could be different solutions.

Notes on Optimization

It is a bit of a pain having to synchronize with the blockchain every time you want to use the wallet, even if you're not using a mobile device.

A common optimization - as mentioned before in the case of Jaxx - is to have a server-side component, which would continuously monitor the state of the blockchain, your balance and transactions.

This approach has the added benefit of letting you run your own Ethereum node. It would be a lot faster because:

  • a local Ethereum node - accessible only by your application - could be made a lot faster than any public node;
  • local inter-process communication (IPC) would not have to take place over HTTP;
  • information could be batched for the client application. Instead of the client doing a new HTTP request for every block, the server could return all of the required data in one request.

This approach has the drawback that your wallet becomes dependent on a server component, and - personally - this feels wrong for a cryptocurrency wallet.

For a truly standalone wallet, you need to access the network directly, without a proprietary server-side component. Unfortunately, there are not many Ethereum nodes that allow public RPC access.

Besides — relying on public Ethereum nodes is still a dependency (albeit a more palatable one).

Is a Mobile Wallet With No Dependencies Possible?

Of course. Just like desktop wallets with no dependencies, a mobile wallet with no dependencies would just be a node in the network.

It would have to connect to the Ethereum boot nodes, get a list of peers, and communicate with them without relying on RPC.

It would probably ruin a phone's battery though.

Demo Project on GitHub

As promised, here is the barebones demo wallet I came up with while exploring this topic:

Have fun poking around in it. I hope it can be at least interesting, if not useful.

The End!