How to hack a vanity address generated with Profanity

Yurii Rebryk
6 min readOct 5, 2022

--

I sent a message to Alex’s new address from his hacked vanity address.

On September 15, 2022, I decided to generate a vanity address for my daily-use hot wallet. Something easily recognizable, like 0x0000…aced. A friend of mine, Alex, told me that he used Profanity, an Ethereum vanity address tool, to generate his one. When I opened their GitHub repo, I found a README file that says that a critical bug was found. Moreover, on the same day, an article was posted on 1inch’s blog with some ideas on exploiting the bug.

I had some free time, so I decided to dive deep into the topic and hack it by myself. In this blog post, I’ll explain how addresses generated with Profanity can be hacked and how I was able to brute-force my friend’s private key on my MacBook M1 Pro (16 Gb) in 26 minutes. Let’s go!

How to get a public key?

If you have your private key k (256-bit long random number), you need to apply some elliptic curve cryptography to derive your public key. More specifically, your public key K = k * G, where G is a generator point on the elliptic curve. Then you take 5 least significant bytes of keccak256 hash of the public key to attain the public address. As you can guess, you cannot reverse that process. So the only way to get a beautiful address with many leading zeros is to try many different private keys.

How does profanity work?

Profanity uses OpenCL kernels to run many GPU threads in parallel that generate millions of addresses per second using a simple algorithm:

  1. Profanity generates a random seed S that is used to sample a private key k.
  2. Then it creates about 4 million work items, where the i-th worker takes K[i, 0] = (k + i * 2¹⁹²) * G as the initial public key.
  3. After that, each worker increases its public key by G until the address derived from it meets the user’s requirements. When it is found, you can easily calculate the private key given the number of the worker that found it and how many iterations it did.
Profanity generation process.

So what is wrong with that?

The method described is simple and effective, but there is a small issue in its implementation that has significant implications. The random seed S that is used to generate a private key k is a 32-bit number instead of 64-bit. Notice that if you know S, that Profanity used during the generation process, you know all private keys that it generated.

We are going to exploit the fact that a set of potential seeds is only 2³².

Small bug to hack them all.

How to exploit the bug?

Algorithm Idea

If you have a public address that was generated using Profanity, you can find the corresponding private key by reversing the generation process:

  1. First, you need to restore the corresponding public key K_t from the address. You can do this if there is any transaction signed by the victim using their key. For example, you can easily look it up on Etherscan.
  2. If the public key K_t was generated given some initial public key K by adding the needed amount of G (+2¹⁹² * G for each row and +G for each column), you can initialize your workers in a way that the i-th worker starts with a public key K_t - i * 2¹⁹² * G, and then decrease it by G on each iteration until K is finally found by any of them.
  3. If you know what private key corresponds to K, you can calculate the target private key (again, because you know the row and the column where it was found).
Reverse process.

It looks simple! There is one challenge, though. We don’t know the initial public key.

Vulnerability

Yes, we don’t know what initial public key K was used during the generation. But we know that it was derived from one of the 2³² seeds (about 4 billion). Thus, we can iterate through all of them and precompute all corresponding private and public keys.

Now, when we run the generation process backward, we need to check on each iteration whether a new public key belongs to our precomputed set. If we find any key that is in the set, we are done!

If the seed is a 64-bit random number, we need to precompute 2⁶⁴ public keys. That’s a lot of keys! 4 billion times more!

Search Problem

This part of the algorithm is the trickiest one.

Once again, you need a way to check whether an element belongs to a set of 2³² public keys. Ideally, you want each worker to do it on GPU, so your CPU is not a bottleneck.

There are 2³² public keys. It indeed doesn’t look like a huge amount for modern computers. But each public key takes 32 bytes in compressed form. So if you decide to store 2³² of them on the disk, it will require 128GB of memory. Moreover, it is not likely that your computer has 128Gb of RAM to upload them into the memory.

Put simply, 128GB is too much. If we use only 12 bytes for each public key, it will require 48GB to store them, and the probability of collisions will be low. When I used only 8 bytes, collisions did occur.

Partitioning
If you have a GPU with at least 48GB of memory, you can upload all keys into the memory, so each worker can search them using a binary search or utilizing some other data structure.

I wanted to hack Alex using my MacBook Pro M1 with 16GB of memory, so I chose 8GM as a memory limit and tested a few approaches.

Binary Search
I divided the set of all public keys into 4 parts of 2³⁰ public keys each. Then I took only 8 bytes of each public key, sorted them, and uploaded into the memory. That’s exactly 8GB of GPU memory.

Now each GPU worker can run a binary search to check whether its public key belongs to the set.

As you can remember, I already told you that collisions occur when only 8 bytes are used. But it is not a big deal because they are rare, and when they happen, I check those public keys on the CPU using 12 bytes.

Hash Table
Binary search is a great algorithm, but is there a way to search for an element even faster? Sure. We can use a hash table instead. However, there is one caveat. You cannot use all slots allocated for it. Generally, typical load factors with most open addressing methods are 50%. Thus, if we have 8GB allocated for the hash table, we can only store 2²⁹ public keys (8 bytes each). So we need to split our set into 8 parts instead of 4.

Is it worth it? On the one hand, when we have more parts, we are wasting more computational resources iterating through the same keys repeatedly. On the other hand, the hash table requires O(1) lookups instead of O(log N) for binary search.

Comparison
It took 64 minutes to complete 10k iteratinons using the hash table, and almost 4 hours using a binary search. Even with extra work wasted on 8 parts instead of 4, the former approach performs much better.

Hack

The fun part! It took me 7.5 hours to precompute the public keys using my Macbook Pro, and 26 minutes to hack Alex’s private key. That’s insane!

Profanity is almost 5 years old, and that vulnerability was there waiting to be discovered. I wonder how many more small bugs and typos are hidden in the crypto world that can have significant implications.

What to say? Be careful, always test your code and subscribe to my Twitter account😉

Disclaimer

This post has been prepared solely for educational and entertainment purposes.

--

--

Yurii Rebryk

Founder at Fluently. Chasing unicorns and sharing learnings along the way.