BIP-39 basics. From randomness to mnemonic words.

Have you ever wondered how your Bitcoin wallet seed words (mnemonic words) guard access to your wallet funds and what makes such a setup secure? In this article we’ll dive into the basics of BIP-39, which describes what seed words are and how we can use them to back up our wallet keys in a recoverable way. Let’s go!

In the early days of Bitcoin there were no mnemonic words or easy ways to back up your wallet keys. The default wallet implementation would randomly create private keys which were stored inside a wallet file, and you were responsible for backing up that file frequently (by default every 100 transactions). It wasn’t ideal from a user experience point of view, to say the least. People would often forget to back up their wallet file, or they did back it up but then the backup got lost because it was stored on electronic devices which would fail sooner or later. This resulted in a lot of bitcoins being lost forever.

With time Bitcoin developers came up with better ways to back up private keys. One of the ideas, proposed by Pieter Wuille in 2012 in BIP-32 and later widely adopted, was called “Hierarchical Deterministic Wallets”. It described a way to generate an unlimited number of private and public keys in a deterministic fashion, such that given the same seed (a random list of bytes of a certain length) the same list of keys would be generated. This solved the problem of having to back up a list of private keys every 100 transactions. A backup of a single seed was enough to cover all the private/public keys one would ever need.

Another idea that further simplified the backup of the wallet seed was proposed in BIP-39 by Marek Palatinus, Pavol Rusnak, Aaron Voisine and Sean Bowe. BIP-39 described a method to encode a random list of bytes (a seed) as a list of words that is easy to remember or write down. Compared to raw binary or hexadecimal representations of the seed (which still required electronic devices to store it), a human-readable representation is much easier for people to handle. From this point forward the seed could be written on paper or spoken over the telephone, and this opened up new, physical ways of backing up the seed (multiple paper copies in different locations, durable copies on metal plates with extra protection from fire/flood etc.).

Below we’ll go step by step through the process of transforming a random list of bytes (entropy) into a mnemonic sequence of words according to the BIP-39 specification.

Step 1 – Entropy

First we need a good source of randomness. We can flip a coin or roll a die. A computer (or a hardware wallet) has a built-in random number generator which can act as a source of randomness. To keep things simple we’re going to flip a coin. BIP-39 specifies the entropy length to be between 128 and 256 bits and a multiple of 32 bits. Each fair coin flip gives 1 bit of entropy. We want a 24-word seed so let’s toss the coin 256 times and write heads as “0” and tails as “1”.

00110010100001010111110100001011111111111010000010010000010010101101
00010101111001001011000100111100011110001001111011110111011010010100
11001100111011100110001011101101001010110101001111010010011010111111
0001100101011001000110100010000110110001100101110001
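
If you would rather let a computer flip the coins, the same kind of bit string can be produced in software. The snippet below is only a minimal sketch, assuming Python 3 and its standard secrets module (which draws from the operating system's cryptographic random source). The names ENT_BITS, entropy and entropy_bits are made up for this illustration.

```python
import secrets

# Entropy length in bits. 256 gives a 24-word mnemonic.
# BIP-39 allows 128, 160, 192, 224 or 256.
ENT_BITS = 256

# Ask the operating system's cryptographic random source for the bytes.
entropy = secrets.token_bytes(ENT_BITS // 8)

# Render the bytes as "0"/"1" characters, like the coin-flip record above.
entropy_bits = "".join(f"{byte:08b}" for byte in entropy)

print(entropy_bits)
```

Every run prints a different 256-character string, so the output will not match the coin flips used in this article.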

The following table describes the relation between the initial entropy length (ENT), the checksum length (CS) and the length of the generated mnemonic sentence (MS) in words. ENT and CS are given in bits. So if we wanted a 12-word seed we’d generate 128 bits of entropy.

CS = ENT / 32
MS = (ENT + CS) / 11

|  ENT  | CS | ENT+CS |  MS  |
+-------+----+--------+------+
|  128  |  4 |   132  |  12  |
|  160  |  5 |   165  |  15  |
|  192  |  6 |   198  |  18  |
|  224  |  7 |   231  |  21  |
|  256  |  8 |   264  |  24  |
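
The two formulas are enough to rebuild the table. Here is a small Python sketch of that arithmetic. The function name mnemonic_layout is hypothetical and not part of any library.

```python
def mnemonic_layout(ent_bits: int) -> tuple[int, int, int, int]:
    """Return (ENT, CS, ENT+CS, MS) for an entropy length given in bits."""
    if ent_bits % 32 != 0 or not 128 <= ent_bits <= 256:
        raise ValueError("ENT must be a multiple of 32 between 128 and 256")
    cs = ent_bits // 32           # CS = ENT / 32
    ms = (ent_bits + cs) // 11    # MS = (ENT + CS) / 11
    return ent_bits, cs, ent_bits + cs, ms


for ent in range(128, 257, 32):
    print(mnemonic_layout(ent))
```

For 128 bits this gives (128, 4, 132, 12) and for 256 bits (256, 8, 264, 24), matching the first and last rows of the table.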

Step 2 – Split entropy into groups

Next we split the entropy bits into groups of 11. We end up with 23 full groups of 11 bits each (253 bits) and a 24th group holding just the 3 leftover bits:

00110010100 00101011111 01000010111 11111111010 00001001000
00100101011 01000101011 11001001011 00010011110 00111100010
01111011110 11101101001 01001100110 01110111001 10001011101
10100101011 01010011110 10010011010 11111100011 00101011001
00011010001 00001101100 01100101110 001
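
The split is plain string slicing. The following is a rough Python sketch using the coin-flip bits from Step 1. The helper name split_into_groups is an assumption made for this example.

```python
# The 256 coin-flip bits from Step 1.
ENTROPY_BITS = (
    "00110010100001010111110100001011111111111010000010010000010010101101"
    "00010101111001001011000100111100011110001001111011110111011010010100"
    "11001100111011100110001011101101001010110101001111010010011010111111"
    "0001100101011001000110100010000110110001100101110001"
)


def split_into_groups(bits: str, size: int = 11) -> list[str]:
    """Cut a bit string into chunks of `size` bits; the last chunk may be shorter."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


groups = split_into_groups(ENTROPY_BITS)
print(len(groups))   # 24
print(groups[0])     # 00110010100
print(groups[-1])    # 001
```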

Step 3 – Encode

Each group (except for the 24th group, which only has 3 bits so far) contains an 11-bit number (0-2047 in decimal) and this number is an index into the BIP-39 wordlist, which has exactly 2048 words.

The first binary number is 00110010100. Converted to decimal it is 404 (256 + 128 + 16 + 4). We can convert the rest of the binary sequence above into decimal the same way (you can use a calculator, a web tool or do it by hand on paper if you have time).

404 351 535 2042 72
299 555 1611 158 482
990 1897 614 953 1117
1323 670 1178 2019 345
209 108 814
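
If you have a Python interpreter at hand, the conversion is a one-liner with the built-in int function. This is just a sketch covering the first five groups from Step 2; the variable names are arbitrary.

```python
# First five 11-bit groups from Step 2.
groups = ["00110010100", "00101011111", "01000010111", "11111111010", "00001001000"]

# int(text, 2) parses a string of binary digits.
indices = [int(group, 2) for group in groups]

print(indices)  # [404, 351, 535, 2042, 72]
```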

The words in the wordlist are 0-indexed meaning you start counting from 0. The number 404 corresponds to the word “crater”. Converting the full list of decimal numbers to words gives us:

crater cloud drill young animal
century earth siren because detail
knock unfold error jaguar merry
pistol fatigue nation wise clinic
boss assault grape
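
The lookup itself is simple list indexing. Below is a minimal Python sketch, assuming the wordlist is available as a local text file with one word per line and 2048 lines in total. The function names load_wordlist and indices_to_words are hypothetical.

```python
from pathlib import Path


def load_wordlist(path: str) -> list[str]:
    """Read a wordlist file (one word per line) and check it has 2048 entries."""
    words = Path(path).read_text(encoding="utf-8").split()
    if len(words) != 2048:
        raise ValueError(f"expected 2048 words, got {len(words)}")
    return words


def indices_to_words(indices: list[int], wordlist: list[str]) -> list[str]:
    """Map 0-based indices to words. Index 0 is the first line of the file."""
    return [wordlist[i] for i in indices]
```

With the English wordlist from the BIP-39 specification saved locally (the path is up to you), calling indices_to_words([404, 351, 535], load_wordlist(path)) should return the first three words listed above.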

Step 4 – Checksum (24th word)

The last step of building the word list is to calculate a checksum. The purpose of a checksum is to quickly verify whether the list of words is valid. It catches most errors, such as using a wrong word, missing a word or having words in the wrong position. With only 8 checksum bits, a random mistake can still slip through about 1 time in 256.

To calculate the checksum we take all the 256 entropy bits we started with in Step 1 and calculate a SHA-256 digest from them.

$ echo 0011001010000101011111010000101111111111101000001001000001001010110100010101111001001011000100111100011110001001111011110111011010010100110011001110111001100010111011010010101101010011110100100110101111110001100101011001000110100010000110110001100101110001 | shasum -a 256 -0

f3f06d74b794b20645460aa0b17d4e7a77eaaea283ee55344adbfcece4a63432

NB: shasum calculates the SHA digest from the input. Option -a 256 selects the SHA-256 algorithm, -0 means reading the input in BITS mode where each ASCII '0' is interpreted as a 0-bit and each ASCII '1' is interpreted as a 1-bit.

This is a number in hexadecimal format. We need its 8 leftmost bits (the first byte, i.e. the first two hex digits). Any hex to binary converter will do:

f3 (hex) -> 1111 0011 (binary)

Next we append these 8 bits to the 3 leftover bits of the 24th group from Step 2 and end up with a full 11-bit group:

00111110011

This is the last word (499 – dinosaur)
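
The whole checksum step can also be done without shasum. What follows is a sketch only, assuming Python 3 and the standard hashlib module. It hashes the 32 entropy bytes (not the ASCII characters '0' and '1'), which is the same thing the BITS mode of shasum does. All variable names are made up for this example.

```python
import hashlib

# The 256 coin-flip bits from Step 1.
ENTROPY_BITS = (
    "00110010100001010111110100001011111111111010000010010000010010101101"
    "00010101111001001011000100111100011110001001111011110111011010010100"
    "11001100111011100110001011101101001010110101001111010010011010111111"
    "0001100101011001000110100010000110110001100101110001"
)

# Pack the bit string into 32 raw bytes and hash them.
entropy = int(ENTROPY_BITS, 2).to_bytes(len(ENTROPY_BITS) // 8, "big")
digest = hashlib.sha256(entropy).digest()

# The checksum is the first ENT / 32 bits of the digest (8 bits here).
cs_len = len(ENTROPY_BITS) // 32
digest_bits = "".join(f"{byte:08b}" for byte in digest)
checksum_bits = digest_bits[:cs_len]

# Entropy plus checksum is 264 bits; the final 11 bits form the 24th group.
last_group = (ENTROPY_BITS + checksum_bits)[-11:]
last_index = int(last_group, 2)

print(digest.hex())
print(checksum_bits, last_group, last_index)
```

The printed digest, checksum bits and index can be compared with the shasum output and the values worked out by hand above.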

The full 24-word list is now:

crater cloud drill young animal
century earth siren because detail
knock unfold error jaguar merry
pistol fatigue nation wise clinic
boss assault grape dinosaur

We can verify that this is indeed a valid BIP-39 mnemonic using the excellent BIP-39 tool created by Ian Coleman.

Step 5 – The seed

The final step of BIP-39 is creating the actual binary seed, which is then used to derive the master key of a BIP-32 deterministic wallet (or fed into a similar method). We’re not going to dive into the details of what this step involves but only quote from the BIP-39 spec:

To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string “mnemonic” + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
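
The quoted paragraph maps almost word for word onto the PBKDF2 function in Python's standard library. Treat the following as a sketch under that assumption rather than a reference implementation. The function name mnemonic_to_seed is hypothetical, and the empty default passphrase is a choice made for this example.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte binary seed from a mnemonic sentence."""
    # Password: the mnemonic sentence in UTF-8 NFKD.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # Salt: the string "mnemonic" + passphrase, again in UTF-8 NFKD.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 iterations of HMAC-SHA512, 512-bit (64-byte) derived key.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


mnemonic = (
    "crater cloud drill young animal century earth siren because detail "
    "knock unfold error jaguar merry pistol fatigue nation wise clinic "
    "boss assault grape dinosaur"
)
seed = mnemonic_to_seed(mnemonic)
print(seed.hex())
```

Note that this function does not check the checksum. Any string will produce a seed, and a different passphrase produces a completely different one.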

You can read more about the PBKDF2 function in the context of cracking the passphrase here.

BONUS:

If you followed the steps above you should now be able to create BIP-39 mnemonics and verify their correctness yourself (with minimal assistance from tools like binary, hex and decimal converters).

In the video below you can watch the seed stamping process where I punch all the 24 words onto a 4mm thick stainless steel Coldbit Steel plate using a 1.5kg hammer and an A-Z letter stamping set:

You should be able to check whether the last word of the seed in the video above is correct. You have to assume the first 256 bits of entropy are correct and calculate the missing 8 bits. The first person who comments on this post with a correct answer (the last word) and a short description of how they did it can receive 1x Coldbit Steel + 1x Coldbit Passphrase + a Stamping Set + 1.5kg (3 lbs) hammer for free and create a long-lasting, corrosion- and fire-resistant backup of their BIP-39 seed.
