In a recent post about BIP-39 I described how mnemonic sentences in the context of Bitcoin work and what makes them secure. 128 or 256 bit cryptography is considered secure and unbreakable now and for the foreseeable future. It’s the basis on which Bitcoin wallet security depends on. Can the same thing be said about the optional BIP-39 passphrase, the so called 25th word ? Let’s find out.
Here’s a typical scenario: You have your 24 seed words backed up somewhere, ideally stamped on a piece of fireproof, acid-proof and pressure-proof stainless steel. You then decide to enable BIP-39 passphrase support in your hardware wallet and stamp the passphrase(s) on a hexagonal stainless steel rod and store the rod in a separate geographical location to your 24-word seed backup in order to:
- make it more difficult to steal your funds in case your seed words get compromised
- have multiple accounts in your wallet (each passphrase corresponding to a separate account)
- pass on the Bitcoins to the relatives after your death
Then I stumble upon your BIP-39 mnemonic backup stashed in your office drawer, import it into a new Electrum wallet, check the wallet balance but none of the addresses have ever been used. I know you have some Bitcoins though because I also happen to know your Twitter handle, check your tweets and see many references to “HODL” and “TO THE MOON”. I assume you must be using a BIP-39 passphrase.
The question then arises: “How easy is it to crack this extra passphrase?” and “What makes a good, hard to crack passphrase?“
While looking for answers I did some research and found some answers although they seemed a bit vague:
- Here’s a page from Trezor’s wiki describing multi-passphrase encryption, the $5 wrench attack and example passphrases: “lonelypumpkins” and “funnyspirit“. – how good are such passphrases (2 English dictionary words)?
- Here’s Andreas Antonopoulos in Sep 2018 saying that “a relatively complex passphrase will keep you safe for several weeks to a month. A very strong, complex passphrase will keep you safe for months.“
- By the way Andreas’ recommended word-length for pass-phrases changed from:
- 8–10 words, in July ‘18
- 6–8 words, in Sept ‘18
- 4–6 words, in Feb ‘19
What does it mean a complex passphrase exactly? We can do some research ourselves by looking at the BIP-39 spec but this time we start where we finished last time:
The last step of BIP-39 is creating the actual binary seed which is then used as a master key in BIP-32 deterministic wallet or using other methods.
To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string “mnemonic” + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
The passphrase is used in the final step of the BIP-39 key derivation process. It uses Password-Based Key Derivation Function 2 to turn the mnemonic sentence (the 12 or 24 seed words) + an optional passphrase into a binary seed. Below is a quote from Wikipedia describing the input parameters and the operation of this function (scroll down for a simpler explanation of PBKDF2):
PBKDF2 applies a pseudorandom function, such as hash-based message authentication code (HMAC), to the input password or passphrase along with a salt value and repeats the process many times to produce a derived key, which can then be used as a cryptographic key in subsequent operations. The added computational work makes password cracking much more difficult, and is known as key stretching.
Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables) for attacks, and means that multiple passwords have to be tested individually, not all at once. The standard recommends a salt length of at least 64 bits.[6] The US National Institute of Standards and Technology recommends a salt length of 128 bits.
The PBKDF2 key derivation function has five input parameters:
DK = PBKDF2(PRF, Password, Salt, c, dkLen)
where:
- PRF is a pseudorandom function of two parameters with output length hLen (HMAC-SHA512 in BIP-39)
- Password is the master password from which a derived key is generated (mnemonic words)
- Salt is a sequence of bits, known as a cryptographic salt (BIP-39 passphrase)
- c is the number of iterations desired (2048 iterations in BIP-39)
- dkLen is the desired bit-length of the derived key (512 bits)
- DK is the generated derived key
A simpler explanation of PBKDF2 is this: It’s a cryptographic function designed to turn some password and an optional passphrase (salt) into a cryptographic hash. This function is designed to run slowly (think 2048 times slower than a single invocation of HMAC-SHA512 in the context of Bitcoin). In the case of BIP-39 final key derivation this is the gist of what PBKDF2 does: it runs HMAC-SHA512 2048 times. This is the only thing that we need to keep in mind for now.
BIP-32 (Hierarchical deterministic wallet)
Once we run PBKDF2 on our mnemonic sentence + a passphrase we end up with a key. This key is then used as a seed to BIP-32 private/public key generation. I’m not going to dive into the details of BIP-32 here, suffice to say that the cost of generating private/public keys and associated addresses and checking them against a database of addresses ever used is negligible compared to the PBKDF2 function above.
Classes of attack
Let’s define a few classes of attack. Each class represents the level of attack on BIP-39 passphrase and requires more and more expensive hardware.
Class A (10,000 hashes/s): My laptop can run about 500 PBKDF2-HMAC-SHA512 hashes per second on a single CPU core. Let’s assume I can run it on all 8 cores and get 4000 hashes/s and I have 2 laptops.
Class B (1,000,000 hashes/s): GeForce GTX 1080 can run approx. 240,000 hashes/s. We have a few gfx cards at our disposal for this kind of attack.
Class C (100,000,000 hashes/s): Hypothetically we can assume cracking BIP-39 passphrases will be profitable in the near future and some company will create a specialised ASIC just for this purpose. Let’s assume it will be 10x faster than the GFX card above so roughly 2,4M hashes/s. A typical attacker will own a few such specialised devices.
Class D: (1,000,000,000 hashes/s): A supercluster of ASIC (like a mining pool but for cracking PBKDF2-HMAC-SHA512)
It’s worth mentioning that no devices of Class C or Class D exist today. It’s possible that such devices might appear one day though so we need to keep this in mind when deciding about the length of our passphrase we want to use.
BIP-39 passphrase cracking times
Let’s consider a few typical dictionaries and charsets people use when generating passwords in general and BIP-39 passphrases in particular:
- BIP-39 wordlist (2048 word dictionary)
- EFF diceware long wordlist for use with five dice (7776 words)
- EFF diceware short wordlist for use with four dice (1296 words)
- Numerals (0-9) – 10 characters
- Alphabet (either upper or lower case letters) – 26 characters
- Alphanum (either upper or lower case letters + numbers) – 36 characters
Let’s bear the following in mind:
- An attacker doesn’t know what dictionary/charset you used for your passphrase. He/she may start with the easiest dictionaries/combinations (for example 1 or 2 word dictionary lookups) and then move on to more difficult strategies. The actual time it takes to crack your passphrase might be longer than the time given in the tables below.
- On the other hand on average it’ll take half the time given in the tables below for an attacker to obtain the correct passphrase
- A general formula for calculating passphrase entropy in bits is log2(no_of_combinations). So for example log2(2048^2) = 22 bits of entropy. Recollect that the PKCS standard recommends a salt length of at least 64 bits.
PBKDF2 weakness (optimisation)
Andrea Visconti, Simone Bossi, Hany Ragab an Alexandro Calò published a paper called “On the weakness of PBKDF2“. The gist of it is that there are a few optimisations in PBKDF2 that can be implemented and the whole scheme gets optimised. The authors mention 50% optimisation in the first case. So we can bear this in mind when reading the cracking times below. (Eg. We can assume the actual optimisation will in fact be 50% in which case instead of 2 milleniums the cracking time will be 1 millenium etc.).
BIP-39 wordlist (2048 words)
BIP-39 wordlist | abandon,ability,…,zoo | |||||
Password | Class of Attack | |||||
Words | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 2048^2 | 22 Bits | 7 Mins | 4 Secs | Instant | Instant |
3 | 2048^3 | 33 Bits | 10 Days | 2 Hours | 85 Secs | 8 Secs |
4 | 2048^4 | 44 Bits | 55 Years | 203 Days | 2 Days | 4 Hours |
5 | 2048^5 | 55 Bits | 114 Milleniums | 1 Millenium | 11 Years | 1.5 Years |
6 | 2048^6 | 66 Bits | ∞ | ∞ | ∞ | 2 Milleniums |
7 | 2048^7 | 77 Bits | ∞ | ∞ | ∞ | ∞ |
Long Diceware Wordlist (7776 words)
Diceware wordlist | abacus,…,zoom | |||||
Password | Class of Attack | |||||
Words | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 7776^2 | 26 Bits | 100 Mins | 1 Minute | Instant | Instant |
3 | 7776^3 | 39 Bits | 1.5 Years | 130 Hours | 1.5 Hours | 7 Mins |
4 | 7776^4 | 52 Bits | 11 Milleniums | 115 Years | 1.5 Years | 42 Days |
5 | 7776^5 | 65 Bits | ∞ | ∞ | 9 Milleniums | 901 Years |
6 | 7776^6 | 76 Bits | ∞ | ∞ | ∞ | ∞ |
7 | 7776^7 | 90 Bits | ∞ | ∞ | ∞ | ∞ |
Short Diceware Wordlist (1296 words)
Diceware wordlist | acid,…,zoom | |||||
Password | Class of Attack | |||||
Words | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 1296^2 | 21 Bits | 3 Mins | 1 Sec | Instant | Instant |
3 | 1296^3 | 31 Bits | 60 Hours | 36 Mins | 21 Secs | Instant |
4 | 1296^4 | 41 Bits | 3 Milleniums | 32 Days | 8 Hours | 47 Mins |
5 | 1296^5 | 52 Bits | ∞ | 4 Milleniums | 1.5 Years | 42 Days |
6 | 1296^6 | 62 Bits | ∞ | ∞ | 1.5 Milleniums | 150 Years |
7 | 1296^7 | 72 Bits | ∞ | ∞ | ∞ | ∞ |
Numerals
Numerals | 0123456789 | |||||
Password | Class of Attack | |||||
Length | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 100 | 7 Bits | Instant | Instant | Instant | Instant |
3 | 1000 | 10 Bits | Instant | Instant | Instant | Instant |
4 | 10,000 | 13 Bits | Instant | Instant | Instant | Instant |
5 | 100,000 | 17 Bits | 10 Secs | Instant | Instant | Instant |
6 | 1 Million | 20 Bits | 2 Mins | 1 Sec | Instant | Instant |
7 | 10 Million | 23 Bits | 20 Mins | 10 Sec | Instant | Instant |
8 | 100 Million | 27 Bits | 3.5 Hours | 1.5 Mins | Instant | Instant |
10 | 10 Billion | 33 Bits | 15 Days | 3.5 Hours | 2 Mins | 1 Sec |
12 | 1 Trillion | 40 Bits | 3 Years | 11 Days | 3 Hours | 17 Mins |
20 | 10^20 | 66 Bits | ∞ | ∞ | ∞ | 3 Milleniums |
Alphabet – 26 characters
Upper Case Alpha | ABCDEFGHIJKLMNOPQRSTUVWXYZ | Lower Case Alpha | abcdefghijklmnopqrstuvwxyz | |||
Password | Class of Attack | |||||
Length | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 676 | 9 Bits | Instant | Instant | Instant | Instant |
3 | 17,576 | 14 Bits | 2 Secs | Instant | Instant | Instant |
4 | 456,976 | 19 Bits | 46 Secs | Instant | Instant | Instant |
5 | 11.8 Million | 23 Bits | 20 Mins | 12 Secs | Instant | Instant |
6 | 308.9 Million | 28 Bits | 8.5 Hours | 5 Mins | 3 Secs | Instant |
7 | 8 Billion | 33 Bits | 9 Days | 2 Hours | 1 Mins | 8 Secs |
8 | 200 Billion | 38 Bits | 242 Days | 2.5 Days | 35 Mins | 3.5 Mins |
9 | 5.4 Trillion | 42 Bits | 17 Years | 63 Days | 15 Hours | 1.5 Hours |
10 | 141 Trillion | 47 Bits | 447 Years | 4.5 Years | 16 Days | 39 Hours |
12 | 95 Quadrillion | 56 Bits | 302 Milleniums | 3 Milleniums | 30 Years | 3 Years |
15 | 1.6 Sextillion | 70 Bits | ∞ | ∞ | 531 Years | 53 Years |
20 | 19.9 Octillion | 94 Bits | ∞ | ∞ | ∞ | ∞ |
Alphanum – 36 characters
Upper Case Alpha | ABCDEFGHIJKLMNOPQRSTUVWXYZ | or Lower Case Alpha | abcdefghijklmnopqrstuvwxyz | and Numerals | 0123456789 | |
Password | Class of Attack | |||||
Length | Combinations | Entropy | Class A | Class B | Class C | Class D |
2 | 1,296 | 10 Bits | Instant | Instant | Instant | Instant |
4 | 1.6 Million | 21 Bits | 2.5 Mins | 1.5 Secs | Instant | Instant |
6 | 2 Billion | 31 Bits | 2.5 Days | 36 Mins | 21 Secs | Instant |
8 | 36^8 | 41 Bits | 9 Years | 32 Days | 7 Hours | 47 Mins |
10 | 36^10 | 52 Bits | 11 Milleniums | 115 Years | 423 Days | 42 Days |
12 | 36^12 | 62 Bits | ∞ | 150 Milleniums | 1500 Years | 150 Years |
15 | 36^15 | 77 Bits | ∞ | ∞ | ∞ | ∞ |
20 | 36^20 | 103 Bits | ∞ | ∞ | ∞ | ∞ |
Is my passphrase SAFU ?
- If your passphrase is a 1 word from a dictionary (aka the 25th word) it is NOT SAFE. Change it to at least a 4 word passphrase as soon as possible
- lonelypumpkins is a better passphrase than hodl but it’s not a passphrase than can withstand a few hours of cracking on regular hardware (too short)
- For the Long Diceware dictionary – use at least 4 words
- For the Short Diceware dictionary – use at least 6 words
- If you want to use BIP-39 wordlist as your dictionary use at least 6 words.
- You don’t have to limit yourself to dictionary words. The more uncommon word you use the more difficult it is to crack and the harder to remember.
- Don’t make the common mistakes people make when coming up with a passwords. Don’t use your birthday, people’s names, pets’ names and favourite places as your passphrases.
- When in doubt roll the dice and pick 6 random words from the short diceware list.
- Don’t rely on your memory or paper or electronic devices to store your passphrases. Order Coldbit Passphrase coupled with Coldbit Steel to have a durable, fireproof, waterproof and pressure-proof backup of your seed words and passphrases.
BTCRecover
You can see a practical example of using BTCRecover to recover a forgotten password below. In these examples the public address is known and there is no need to look it up in the blockchain (which will slow the process down).
If you want to learn more about Bitcoin wallet/key security and best practices for safe storage sign up to our newsletter below:
How I checked over 1 trillion mnemonics in 30 hours to win a bitcoin
1 year later after publishing this article John Cantrell brute-forced the missing 4 words from a 12-word mnemonic and swept 1 BTC using a rented farm of powerful gfx cards. We estimated a Class C attack (a farm of powerful gfx cards or ASICs) to take ~ 2 days to crack 4 missing BIP-39 words. Very close to John’s result.
[…] You can read more about the PBKDF2 function in the context of cracking the passphrase here. […]
[…] https://coldbit.com/can-bip-39-passphrase-be-cracked/ via /r/Bitcoin https://www.reddit.com/r/Bitcoin/comments/bzncnp/what_does_it_take_to_crack_bip39_passphrase/?utm_source=ifttt […]
Great! work. Thank you. Here’s my question. How difficult would it be to crack a simple easy to remember portion of a password, appended to a known high entropy 40 character portion of a bip39 passphrase. Essentially creating a 3 Factor Authentication.
Why? Let say you want to leave some BTC to a loved one. But only after you’re gone. So you give them a copy of your recovery mnemonic. You then give a copy of your high entropy password to a trusted family or friend. Plus copies placed in your safety deposit box.
You tell your loved one that no one should ask them for their copy of your mnemonic recovery. And that a copy of your high entropy password is stored in your lock box. Plus a trusted friend is also holding a copy for you.
Then you add the secret sauce. You write a simple password that you and they can easily remember on a piece of paper. Then burn it. You tell them that they’re never to write that password down or share with anyone. You tell them that they will need all three factors to recover your Bitcoin.
Unfortunately somehow someone gets a copy of your mnemonic and you’re 40 character high entropy password. But only you and your loved one knows about the simple password appended to the 40 character high entropy portion of your password.
So with the mnemonic and the 40 character high-entropy password in hand, how hard will it be for a malicious attacker to crack the salt by cracking the appended memorable simple password. Does the complexity of the mnemonic plus the high entropy portion of the 40 character password significantly increase the complexity and difficulty of cracking the appended simple portion of the password alone. Thereby preventing the theft of their Bitcoins.
Asking for a friend..?
If the attacker knows that there’s an appended password to crack then it’ll be easy.
Same cracking times apply (the longer the password the better).
It’s as if there was no 40-char high-entropy password to begin with.
If the attacker doesn’t know about this scheme and doesn’t know about the second password then the whole scheme has a bit more security at the cost of complexity and reliance on you and your loved one’s good memory. Personally I wouldn’t rely on anyone’s memory.
I think proper key management is not yet a solved problem in Bitcoin. I also think this is the biggest obstacle on the road to mass adoption. We’ll see many more schemes coming up (as the valuation goes up) and many more sophisticated attacks. The existing techniques will get replaced by better ones (eg. Shamir Secret Sharing used by default in all HW wallets). Hopefully the free market economy will solve this problem (as it solved many more difficult problems).
“If the attacker knows that there’s an appended password to crack then it’ll be easy.”
I was afraid of that, thanks..?
Still, I think bip39 is excellent. Especially when combined with a HW, and Keepass/Keepass2Android. If properly configured and utilized.
It would be nice to have the option of an additional bip39 protocol with Argon2 + variable iterations/rounds. Ranging from difficult to Impossible. Based on the character length of the password or passphrase you choose.
I wouldn’t care if it took my laptop 30 minutes to unlock/restore my private keys from deep freeze. Essentially you would have three levels. Hot wallet, cold storage and Deep Freeze..?
I’ll try and suggest this to the developers and see what they think. Thank you for the very clear explanation..??
[…] você estiver interessado em saber mais sobre os requisitos de tempo, então há uma extensa análise por Coldbit. Para dar um exemplo, a Coldbit estima que se um atacante fosse projetar e construir um […]
This is fantastic information! I have a request – can you please remove the reference to the Bitcoin Inheritance Protocol blog entry? That writer has published some absolutely horrible security advice. That page doesn’t even describe a protocol of any kind – just some random “ideas”. It proposes the terrible idea of splitting your seed in half and giving out the halves – even tho this guy knows about shamir’s secret sharing algorithm! Its very irresponsible. Please remove reference to it.
This post was written when SLIP39 wasn’t ready yet. I agree seed naive seed splitting should be discouraged now.
Seed splitting should never have been encouraged. Regardless of SLIP39’s status, naive seed splitting has never been secure to do.
[…] improve your security vs a hacker with physical access, it’s a good idea to set a passphrase (5 or 6 word passphrase chosen from the BIP39 list will likely be strong […]
[…] Aquí dejo info acerca de la manera idónea de crear Passphrases. Y aquí un tutorial de cómo hacerlo. […]
This article needs an update now that this happened. https://medium.com/@johncantrell97/how-i-checked-over-1-trillion-mnemonics-in-30-hours-to-win-a-bitcoin-635fe051a752
Thanks for sending the link. Hats off to John Cantrell for brute-forcing the missing 4 mnemonic words and open sourcing his OpenCL work. There’s no better way to learn Bitcoin than trying to break it.
With regards to our article above – we estimated breaking 4 words from the BIP-39 wordlist using Class C attack (you can consider his 80 gfx cards to qualify to class C) to take ~ 2 days which is very close to how long it actually took. We should expect work like this to progress and more sophisticated mnemonic/passphrase breaking solutions to be developed. We believe the basic assumptions of this article are still correct.
Why? It took him 30 hours to crack 4 words which the article states would take 2 days with PBKDF2 (which is slower/more secure than the hashing algorithm normal Bitcoin mnemonic seeds use.) Everything said in this article is still true.
[…] Passphrases have historically been referred to as the “25th word”, but in reality if you want to make your passphrase more secure, you will need more than just 1 word. To make a passphrase more secure against a physical seed extraction attack (possible on Trezor wallets), you’d probably have to make it 6 or 7 words from the BIP39 word list. […]
Thanks for your suggestions! 🙂
Does the hacker know how long your passphrase is when trying to discover it?
Would a simple sentence suffice using either the words or first letter of each?
For example would a simple sentence, “On A Clear Day You Can See Forever” be sufficiently difficult as the 25th word – either the “oacdycsf” or the eight words?
Thanks. 🙂
Many users would not choose a randomly generated passphrase (based on a secure RNG). They will come up with the passphrase themselves or use a popular phrase (from lyrics, movies, etc.).
Such passphrase can be easily bruteforced (dictionary attack) since the key stretching with the “seed extension” (also known as “extension word” and “passphrase”) in the BIP39 specification is very weak (PBKDF2 using 2048 iterations of HMAC-SHA512).
I know about WarpWallet (from 2014), it’s better than almost nothing (PBKDF2 using 2048 iterations of HMAC-SHA512). But there is a new better solution – Doubleslow keystretcher (find it on Github).
Doubleslow keystretcher uses modern CPU and memory intensive key derivation functions (Argon2, Scrypt). It can be configured to make the bruteforcing impossible with modern hardware by using not very strong passphrases (i.e. if configured with 2GiB RAM and thousands of iterations one bruteforce iteration will take hours on a modern PC instead of microseconds).
A passphrase that is hard to bruteforce with PBKDF2 using 2048 iterations of HMAC-SHA512 will be impossible to bruteforce if Doubleslow keystretcher is being used for additional keystretching.
It’s hard to make an ASIC to beat the Doubleslow keystretcher bacause it is RAM intensive (based on Argon2 and Scrypt).