A Technical Breakdown of MySky Seeds

David Vorick
Published in The Sia Blog
8 min read · May 18, 2021

MySky uses seeds to authenticate users instead of a username and password. Seeds and passwords are very similar: both need to be difficult for an attacker to guess, and both provide a foundation for a user’s identity.

In the case of a typical password, the amount of randomness inside the password is left up to the user. The user chooses the phrase that they want as a password, and then the user and server both hope that the user was smart enough to pick a secure password. That password is then sent to the server, and the server has to be trusted to store the password safely.

In practice, it has been demonstrated that the average user cannot be trusted to create a secure password, and that the average server cannot be trusted to store the password securely. The standard advice from security experts is for users to use a completely different password for each service, and to have that password randomly generated by a computer.

We can do better by using seeds. A seed is a fixed amount of randomness, usually generated by a computer (though you can use techniques like coin flips to generate secure seeds as well). Enough randomness is used that no attacker could ever reasonably guess the seed, and then the seed is combined with cryptography to authenticate the user. A user can prove their identity to a server without ever sending the server their seed, which means the user can securely use the same seed for every website.

Security Thresholds

We measure how hard a seed is to guess in bits of “entropy”. The amount of entropy in a seed is the base-2 logarithm of the number of guesses an attacker would have to make to be certain of guessing your seed. A seed with a single bit of entropy takes at most 2 guesses to figure out. A seed with 8 bits of entropy takes up to 2⁸ guesses, or 256 guesses, to get the right answer.

We only consider a password secure if it has so much entropy that no realistic attacker could ever possibly guess it, even with extraordinary luck. In the cryptography community, 128 bits of entropy, which requires up to 2¹²⁸ guesses (roughly 340 billion billion billion billion), is considered secure. Note that this is exponential; a 128 bit password is twice as secure as a 127 bit password and is more than two hundred million times as secure as a 100 bit password.
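
The arithmetic behind these comparisons can be checked with a few lines of Python. This is only an illustrative sketch; the function names `entropy_bits` and `security_ratio` are made up for this example.

```python
import math

def entropy_bits(num_guesses: int) -> float:
    """Bits of entropy for a secret that takes at most num_guesses guesses."""
    return math.log2(num_guesses)

def security_ratio(bits_a: int, bits_b: int) -> int:
    """How many times more guesses a bits_a secret needs than a bits_b secret."""
    return 2 ** (bits_a - bits_b)

print(entropy_bits(256))         # 8.0
print(security_ratio(128, 127))  # 2
print(security_ratio(128, 100))  # 268435456
print(f"{2 ** 128:.2e}")         # 3.40e+38
```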

User generated passwords often struggle to get above 40 bits of entropy. The infamous password “correct horse battery staple” is only 44 bits of entropy, and even most passwords generated at random by utilities like LastPass have less than 80 bits of entropy. Cryptographic seeds on the other hand nearly always have 128 bits of entropy.

One Seed, Many Secrets

One seed can be securely used to create many independent identities. By using a hashing function, we can deterministically transform one seed into a completely new seed. We call the original seed “the root seed”, and we call the technique of hashing the seed with an identifier “salting”.

As an example, you can create a work identity by using the word “work” as a salt, and you can create a personal identity by using the word “personal” as a salt. Someone who looks at both your work identity and personal identity side by side will be unable to tell that these identities are derived from the same root seed — they enjoy full cryptographic independence, despite only requiring the user to keep one secure root seed.

And you can keep salting seeds. For example, you could salt your work seed with the words “client 1” and “client 2” to create two identities for dealing with different clients. These new seeds are associated only with your work seed, and allow you to for example share your work seed with your employer, who can then access all of your client identities but still has no access to your personal identity.
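
As a rough sketch of the idea, salting can be modelled as hashing the parent seed together with the salt. The choice of SHA-512, the simple concatenation, and the name `derive_seed` are all assumptions made for illustration; this article does not spell out the exact derivation MySky uses.

```python
import hashlib
import secrets

def derive_seed(parent_seed: bytes, salt: str) -> bytes:
    """Derive a 16-byte child seed from a parent seed and a salt.

    Illustrative only: hash(parent || salt), truncated to 128 bits.
    """
    digest = hashlib.sha512(parent_seed + salt.encode("utf-8")).digest()
    return digest[:16]

root_seed = secrets.token_bytes(16)  # 128 bits of randomness

# Independent identities derived from the same root seed.
work = derive_seed(root_seed, "work")
personal = derive_seed(root_seed, "personal")

# Salting can be repeated: these depend only on the work seed.
client_1 = derive_seed(work, "client 1")
client_2 = derive_seed(work, "client 2")
```

Anyone holding `work` can recompute `client_1` and `client_2`, but cannot work backwards to `root_seed` or sideways to `personal`, because the hash cannot be inverted.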

MySky makes heavy use of salting to keep the user’s data separated between applications, and also to give the user extra privacy when they want to create encrypted or hidden files.

Seeds as Entropy

MySky seeds are raw entropy. The seed phrase is not a set of words that we pass into a hashing function, but rather an encoding of the entropy itself. Going between the raw, full entropy of the seed and the seed phrase is a very simple process that can be done in just a few minutes by hand. There is no math, it’s just looking up words and copying down ones and zeros.

This is actually a departure from much of the cryptocurrency space. Most of the cryptocurrency space applies key stretching to the seed phrase before using it as entropy. This step both adds unnecessary complications to the process, and also reduces the amount of interoperability that seeds have with other systems, especially in resource-constrained contexts like secure hardware wallets.

The purpose of key stretching is to make a password more secure, but seed phrases already have a full 128 bits of entropy, which provides more than enough security all by itself.

Seed Dictionaries

Seeds are an encoding of a random sequence of bits. A MySky seed is just a sequence of 128 random 1’s and 0’s, represented using a set of words. The size of your dictionary determines how many bits you can derive from each word. To know the exact number of bits you get per word, you use the binary logarithm of the dictionary size. For example, a 256 word dictionary will give you 8 bits of entropy per word, and a 1024 word dictionary will give you 10 bits of entropy per word.

You can use a dictionary of any size, for example 1000 words or 12345 words, but the math gets rather complicated unless your dictionary size is a power of 2. For that reason, we use 1024 words in the MySky dictionary: exactly 10 bits per word, which makes the process for converting between binary and a seed phrase very simple.
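
To make the relationship between dictionary size and phrase length concrete, here is a small sketch. The helper names are hypothetical, and the dictionary sizes are just the ones mentioned in this section.

```python
import math

def bits_per_word(dictionary_size: int) -> float:
    """Entropy carried by one word drawn from a dictionary of this size."""
    return math.log2(dictionary_size)

def words_needed(entropy_bits: int, dictionary_size: int) -> int:
    """Smallest number of words that can carry at least entropy_bits."""
    return math.ceil(entropy_bits / math.log2(dictionary_size))

for size in (256, 1000, 1024, 2048, 4096):
    print(size, round(bits_per_word(size), 3), words_needed(128, size))

print(words_needed(128, 1024))  # 13
```

Note that a 1000 word dictionary yields a fractional number of bits per word, which is why converting by hand only stays simple when the size is a power of 2.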

We could use a bigger dictionary, but the MySky software needs to load the entire dictionary when logging a user in. Larger dictionaries mean longer load times, so we elected for 1024 words instead of 2048 or 4096.

In total, you need 13 entropy words to make a MySky seed. This gives us 130 bits total, though we actually only use the first 128 bits. For the final entropy word, the 13th word, the last two bits are treated as a version number. As of now, all seeds are “version 1”, which means the final two bits are both ‘0’. As a result, all MySky seeds only use the first 256 words of the dictionary for the 13th word of the seed phrase.

Seed Checksums

MySky seeds include 2 checksum words at the end. These 20 bits add no entropy; they exist for no reason other than to ensure that a seed was copied down correctly. The checksum is optional, but it is recommended.

Seed phrases in the cryptocurrency space most commonly have only 4 bits of checksum. This is nearly worthless: it has a high probability (more than 5%) of failing to detect that a seed phrase is wrong, and if you do happen to realize that your seed phrase is incorrect, you have almost no ability to figure out where your mistake is.
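
The “more than 5%” figure follows from a simple model: assuming a copying mistake produces an effectively random checksum, an n-bit checksum still matches by accident with probability 2⁻ⁿ. The sketch below works under that assumption, and the function name is made up for illustration.

```python
def miss_probability(checksum_bits: int) -> float:
    """Chance that a wrong phrase passes an n-bit checksum by accident,
    assuming the checksum of a wrong phrase is uniformly random."""
    return 2.0 ** -checksum_bits

print(miss_probability(4))           # 0.0625
print(1 / miss_probability(20))      # 1048576.0
```

So a 4 bit checksum misses about one error in 16, while a 20 bit checksum misses about one in a million.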

When the checksum is 20 bits, you can reliably figure out which word is incorrect, and what the correct word is supposed to be. The Sia community has dozens of success stories where a user incorrectly copied one of the words in their seed phrase, but was able to use the checksum to recover their real seed phrase.
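
One way such a recovery could work is a brute-force search: try every possible replacement for each entropy word and keep the phrases whose checksum matches. The sketch below works on word indices rather than words, and its bit layout (most significant bit first, 8 bits taken from the 13th word) is an assumption; the function names are hypothetical and this is not the MySky implementation.

```python
import hashlib

def checksum_indices(entropy_words):
    """Two checksum word indices for 13 entropy word indices (sketch)."""
    value = 0
    for idx in entropy_words[:12]:
        value = (value << 10) | idx           # 10 bits per word
    value = (value << 8) | entropy_words[12]  # 8 bits from the 13th word
    digest = hashlib.sha512(value.to_bytes(16, "big")).digest()
    checksum = int.from_bytes(digest[:3], "big") >> 4  # first 20 bits
    return [checksum >> 10, checksum & 0x3FF]

def repair_single_word(words):
    """Return every 15-index phrase that differs from `words` in exactly
    one entropy word and passes the checksum."""
    candidates = []
    for pos in range(13):
        limit = 256 if pos == 12 else 1024
        for replacement in range(limit):
            if replacement == words[pos]:
                continue
            trial = list(words)
            trial[pos] = replacement
            if checksum_indices(trial[:13]) == trial[13:]:
                candidates.append(trial)
    return candidates

# Build a valid phrase, damage one word, then search for repairs.
original = [5, 900, 17, 1023, 0, 256, 31, 700, 44, 1, 512, 999, 200]
original += checksum_indices(original)
damaged = list(original)
damaged[3] = 7
assert original in repair_single_word(damaged)
```

The search tries roughly 12,500 phrases, and each wrong one passes a 20 bit checksum with a probability of about one in a million, so the correct phrase is usually the only candidate. The sketch only covers a mistake in an entropy word, not in the checksum words themselves.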

We keep the checksum optional so that people producing seed phrases by hand do not need to compute the checksum themselves to get a valid seed phrase.

Word Uniqueness

To minimize the chance that a seed is copied down incorrectly, the MySky dictionary ensures that every single word has a different first three characters. The software also only looks at the first three characters of a word when loading the seed phrase. So for example if the user writes down the word “babies” instead of the word “baby” when copying down the seed, the seed will still be correct because only the letters “bab” are important.
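
A minimal sketch of this lookup, using a hypothetical four-word excerpt in place of the real 1024 word dictionary (only “baby” is taken from the example above):

```python
# Hypothetical excerpt; every word has a unique three-letter prefix.
WORDS = ["abbey", "baby", "cactus", "dawn"]
PREFIX_TO_INDEX = {word[:3]: index for index, word in enumerate(WORDS)}

def word_to_index(written: str) -> int:
    """Look a written word up by its first three characters only."""
    prefix = written.strip().lower()[:3]
    if prefix not in PREFIX_TO_INDEX:
        raise ValueError(f"no dictionary word starts with {prefix!r}")
    return PREFIX_TO_INDEX[prefix]

print(word_to_index("baby"))    # 1
print(word_to_index("babies"))  # 1
```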

MySky Seed Specification

The MySky seed is 15 words. 13 of the words are used for entropy, and 2 of the words are used as a checksum. There is only one dictionary for MySky, which is an English wordlist, and can be found here. There are 1024 words in the dictionary, which means that each word encodes 10 bits. Only the first 3 letters of a word are considered when decoding a seed, which gives users some flexibility to tweak their seeds.

MySky seeds are 128 bits of entropy. The first 12 words each provide 10 bits of entropy, and the 13th word provides 8 bits of entropy. The last two bits of the final entropy word are reserved as version bits. As of writing, the only valid version is “version 1”, which means the bits must both be set to ‘0’. Therefore all valid MySky seeds today only use the first 256 words of the dictionary for the 13th word of the seed phrase.

The checksum is computed by taking the sha512 of the seed bits. Note, you take the sha512 of the encoded bits themselves, not of the words or seed phrase. The first 20 bits of the sha512 checksum are used as the 20 checksum bits, and get converted into the final 2 words of the seed phrase.
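
Putting the specification together, the conversion between 128 seed bits and 15 words might look like the sketch below. It works on word indices (0 to 1023) rather than on the wordlist itself. The bit order (most significant bit first) and the treatment of the 13th word as a plain 8 bit index below 256 are assumptions chosen to match the description above; the function names are hypothetical and this is not the reference implementation.

```python
import hashlib
import secrets

SEED_BYTES = 16  # 128 bits of entropy

def seed_to_word_indices(seed: bytes) -> list:
    """Encode a 16-byte seed as 13 entropy indices plus 2 checksum indices."""
    if len(seed) != SEED_BYTES:
        raise ValueError("seed must be exactly 16 bytes")
    value = int.from_bytes(seed, "big")
    # First 12 words: 10 bits each, taken from the top 120 bits.
    indices = [(value >> (128 - 10 * (i + 1))) & 0x3FF for i in range(12)]
    # 13th word: the remaining 8 bits, so its index is always below 256.
    indices.append(value & 0xFF)
    # Checksum: first 20 bits of SHA-512 over the raw seed bits.
    digest = hashlib.sha512(seed).digest()
    checksum = int.from_bytes(digest[:3], "big") >> 4
    indices += [checksum >> 10, checksum & 0x3FF]
    return indices

def word_indices_to_seed(indices) -> bytes:
    """Decode 15 word indices back into the 16-byte seed, checking validity."""
    if len(indices) != 15 or any(not 0 <= i < 1024 for i in indices):
        raise ValueError("expected 15 indices in the range 0..1023")
    if indices[12] >= 256:
        raise ValueError("13th word must be one of the first 256 words")
    value = 0
    for idx in indices[:12]:
        value = (value << 10) | idx
    value = (value << 8) | indices[12]
    seed = value.to_bytes(SEED_BYTES, "big")
    if seed_to_word_indices(seed)[13:] != list(indices[13:]):
        raise ValueError("checksum mismatch")
    return seed

seed = secrets.token_bytes(SEED_BYTES)
indices = seed_to_word_indices(seed)
assert word_indices_to_seed(indices) == seed
print(len(indices))  # 15
```

Mapping each index to a word is then a plain lookup in the 1024 word dictionary.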

And that’s it. Though there’s a lot of deliberation that went into the exact design choices, simple is usually better, and the MySky seed specification is one of the simplest on the market.

Final Thoughts

In designing this seed specification, we looked at how the rest of the ecosystem constructs seeds, and we consulted numerous outside experts. And because MySky sits on top of a fully decentralized storage system, we actually get to take a few shortcuts.

For example, a common suggestion is to encode a birthday into the seed value. By encoding a birthday, you can often save yourself time because you know that you don’t need to look for events that happened earlier than the seed’s birthday. This is particularly useful for blockchains, where a seed recovery process otherwise involves scanning the entire blockchain. With MySky, we can store the birthday of the seed in the cloud, and avoid needing to keep it encoded into the seed itself.

The decision to make the seed phrase the raw entropy also allows us to have maximum flexibility in adding layers on top. For example, MySky does not do any key stretching itself, but a layer on top could use a larger dictionary and key stretching to reduce the number of words in the seed phrase. A seed provider with an 8192 word dictionary and 16 million iterations of a key stretching routine only needs 8 words to be secure. You can also use this layer on top to make a seed provider that is compatible with things like BIP39 or Metamask.
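
The 8 word figure can be checked with the same logarithm arithmetic used for entropy. The sketch assumes that each key stretching iteration multiplies the attacker’s cost per guess, so N iterations add log₂(N) bits of effective work, and that “16 million” means 2²⁴. The function name is made up for illustration.

```python
import math

def effective_bits(words: int, dictionary_size: int, stretch_iterations: int) -> float:
    """Effective security in bits: phrase entropy plus key stretching work."""
    phrase_bits = words * math.log2(dictionary_size)
    stretch_bits = math.log2(stretch_iterations)
    return phrase_bits + stretch_bits

# 8 words * 13 bits per word + 24 bits of stretching.
print(effective_bits(8, 8192, 2 ** 24))  # 128.0
```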

Overall, MySky seeds combine a large number of considerations to create a simple protocol that we believe substantially satisfies the needs of Skynet.
