Early Monday morning (June 12, 2017), I saw a post on reddit titled “How to generate a big password list for a brute force attack for my Ethereum wallet?”. Sweet.
Edit: Since it looks like the guy deleted the text content of the post (a pretty smart move…), I’ll summarize: “Hey! I’ve locked myself out of my ethereum wallet, I know the password is something like abcdefg1234567- somebody help me!”
Long story short, it was a rare opportunity to put my “programmer whiteboard interview skills” to work. Spoiler- everything worked out great! Here’s a quick overview of what I did, and why I think it’s interesting.
(If you are currently reading this in a desperate search for a way to crack your own ethereum wallet, read on for some free tools I made that might be able to help you out. If that doesn’t work, shoot me an email at phildo211 with google’s email service.)
Alright. So, step one was to generate a text file containing all similar passwords to abcdefg1234567 (<- not the actual password). This was the easy part. There’s a concept called Levenshtein Distance that defines “closeness” of two strings by the number of “steps” it takes to get from one word to another (a “step” is either a character deletion, a character substitution, or a character insertion).
This is an example of a kind of problem you can find yourself thinking way too hard about, to the detriment of actually solving the problem. It’s literally a couple nested for loops: iterate over all characters of the base string, at each character iterate over all possible operations. printf each time. Done.
Implementation aside, Levenshtein distance has a predictable property: the larger the distance, the bigger the possibility space.
Quick Math Break: If n = number of characters in the base password, m = number of characters in the list of plausible character substitutions/insertions, then possible deletions = n, possible substitutions = n*m, and possible insertions = (n+1)*m. So each additional Levenshtein step multiplies the space by (n+(n*m)+((n+1)*m)) = really big. We don’t take into account redundancies, because that accounts negligibly toward the possibility space (there you go thinking too hard about the problem again!).
All you need to know about that is- hope you’re only a Levenshtein distance of 1, maybe 2, from the actual password. Or get a lot of really fast computers.
Great. So the next problem is to find a way to crack this thing. There are a couple approaches one could take:
- Notice that just about every ethereum utility (wallets, etc…) is open source. Find a repository and dig up the bits about verifying a password against a wallet.
Uh, no way. Those repositories are huge, mature, and battle-safe (at least, hopefully…). I’m bad enough as it is reading other people’s code- now I’m supposed to dig through layers of UI, security, data flow, cross platform nonsense, just to get at a single snippet of crypto? And I have to bring along every library/plugin/IDE expected by the repository to do it? We have to acknowledge we only need a one-off quick solution!
- See if anyone else has already solved exactly the problem I currently have.
This was _so close_ to working. We found this, which almost got us there. The problem is it used a v1 spec’d ethereum wallet. Mr. Reddit man had a v3. The purpose of trying to use someone else’s code is to spare me the burden of thoroughly understanding the problem space myself. If I’m going to need to hack apart his solution and implement my own cracking anyways, now I have _two_ burdens: understanding the problem space _and_ understanding this random person’s codebase. The only thing I’d get out of it is a free for loop that iterates through a text file. Not worth it.
- Really thoroughly understand the crypto, and just do it yourself.
I’m not going to lie- I tried this. And failed. There’s a _lot_ about crypto I just don’t yet grok. And given I was literally racing with other reddit users trying to solve this guy’s problem before me, I didn’t have time for a week long course or two.
- Understand the problem space well enough to know what you know, and what minimal steps you can take to cover what you don’t.
Bingo. I ended up downloading two python modules: multikdf, and hashlib. The spec for v3 ethereum wallets was simple enough to understand, so long as you’re willing to blackbox things like scrypt and sha256. The final result is here. (As a bonus, it turns out you only need to get halfway through the spec to get to “verification”, which means a huge gain on perf that likely wouldn’t have been found had I just plugged it in to an API that opens wallets.)
Woof. Ok. So now we have our one-off password generator, and our one-off wallet verifier. Time to plug in the base password and wallet details, and let ‘er rip!
In the ensuing 9 hours it would take for the program to run, I’m going to break to talk about an interesting detail in the underlying cryptography.
The wallet is specified to use a kdf (key derivation function) of it’s choice, but a common option is to use “scrypt”. Scrypt is an algorithm that, by design, takes a bit of time to run. That time is variable on a specified “n”- or “iterations through a part of the algorithm”. The reason it is designed to take a long time to run is so it’s hard to, well, do precisely the kind of thing we’re trying to do: brute force it.
N can be whatever the wallet-generator wants. You could set n such that it would take 5 minutes to try any possible password combination. The problem with that decision is two-fold: 1. you will have to wait 5 minutes every time you use your password legitimately, and 2. if you ever find yourself in a situation where you need to brute force your own password, you’d render it impossible. But hey, no bad guys are getting in either!
This is where the selection of an “n” gets interesting. We talked about how huge the possibility space of even a small Levenshtein distance gets. Well, that space is nonsensically huge when dropped to the possibility space of “any possible password” (m^n. A much simpler equation, but do not underestimate the power of exponential growth. lol.). So, where an “n” causing 1 second delays at each attempt isn’t enough to stop someone who has an idea of what your password likely is, it’s _way more_ than enough to stop someone who’s trying to brute force it from scratch.
We were fortunate that Mr. Reddit man had a low n- it took approximately 0.2s at each attempt (_way_ more than enough to stop a from-scratch brute-forcer). I’ve since been contacted by someone with the same problem, who has a much higher n. ~2s per attempt. Ouch does that make this more difficult. As of the time of this writing: the program has been running for ~30 hours, and isn’t close to finishing…
It’s this point that is making me reconsider just cranking my n to the max every time I’m setting crypto parameters. But who knows, maybe that’s a terrible idea. (Do Not Take an Internet Rando’s Advice On Cryptography ¯\_(ツ)_/¯)
So, to wrap things up, the program eventually stopped, claiming the lost password. I gave Mr. Reddit his password, he rewarded me with some ethereum from said wallet, and we all went our own separate ways, the world just a little bit brighter.
If you need to generate passwords of your own, you can start here:
If you’ve got a list of passwords, and want to try them against your wallet, you can try here:
(Yes, I’m aware I’ve committed wallet details to this repository. The password is “thisisatestwallet”. Knock yourselves out.)
Let me know if you want some help with your own password:
phildo211 at google’s mail service.
Give me eth because you’re a nice person:
Thanks for reading!
Update (1/15/18): I’ve since worked with many people, to varying success/failure in recovering their passwords. I’ve also refined my methods (including the ability to work with presale wallets!), and have a really nice cracking rig! Let me know if you’d be interested in my help. :)
Here’s my PGP key if you’re so inclined (apologies for medium’s lack of fixed-width font rendering):
— — -BEGIN PGP PUBLIC KEY BLOCK — — -
— — -END PGP PUBLIC KEY BLOCK — — -