In examining what we have left to do for multisig support, it is best to learn what we have done so far, and where we started. Originally, Bitcoin started with Pay-to-Pubkey addresses.
Bitcoin uses a UTXO (unspent transaction output) model. There are no accounts with balances; instead, there are discrete pieces of bitcoin (UTXOs) that are locked by conditions specifying how they may be spent. In fact, Bitcoin does not, at a low level, have any concept of addresses - only individual coins. No user can 'hold' any bitcoin, and the bitcoin are never 'stored' on a wallet or piece of paper - the wallets or paper merely hold the information necessary to satisfy the locking condition(s) required to spend the coins. It is the blockchain that stores the bitcoin. When a user attempts to spend a UTXO, nodes verify that the condition(s) are truly satisfied.
However, let's be a little less pedantic for the purposes of this post.
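To make the UTXO idea concrete, here is a minimal Python sketch. It is a toy model, not how any node is implemented: the `Utxo` class, the `spend` function, and the string 'witness' are all hypothetical names, and the lock is just a Python function standing in for a real locking script.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Outpoint = Tuple[str, int]  # (transaction id, output index)


@dataclass(frozen=True)
class Utxo:
    amount_sats: int
    # Hypothetical stand-in for a locking script: True if the witness unlocks the coin.
    lock: Callable[[str], bool]


def spend(utxo_set: Dict[Outpoint, Utxo], outpoint: Outpoint, witness: str) -> int:
    """Remove a coin from the set if the witness satisfies its lock; return its amount."""
    utxo = utxo_set.get(outpoint)
    if utxo is None:
        raise KeyError("no such unspent output (already spent or never existed)")
    if not utxo.lock(witness):
        raise ValueError("locking condition not satisfied")
    del utxo_set[outpoint]  # a spent coin leaves the UTXO set for good
    return utxo.amount_sats


# One coin, locked by a toy condition. There is no account and no balance anywhere.
utxo_set = {("aa" * 32, 0): Utxo(50_000, lambda witness: witness == "correct witness")}
```

Note that the 'wallet' never appears in this model. It only needs to remember whatever produces a satisfying witness.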
One of the first types of addresses was Pay-to-Public-Key-Hash (P2PKH). User A wishes to send bitcoin to User B. User B hashes their public key (which is actually not public at the moment), and the hash is made public as the bitcoin address. User A sends his bitcoin to User B by sending the coins to this address. When User B wants to send to User C, User B must include in the transaction their public key (the hash of which needs to match the address that User A sent to), and a valid signature over the data. Click here if you don't understand this terminology.
The downside of this type of address is that it only allows for a simple single-signature transaction. No timelocks, no multisignature, no extra conditions, etc.
The next major address type was Pay-to-Script-Hash (P2SH). This set out to fix the above issues. With this type of address, the receiver can generate a set of locking conditions, and the sender does not need to know these conditions.
How is this done? Well, User B generates a set of locking conditions, hashes them, and encodes them as an address. Then User A sends the coins to this address. When User B decides to spend, they must supply the full set of spending conditions (the hash of which must match the address previously sent to), and a signature.
This allows for multisignature wallets. For example, User B could set the spending conditions for a wallet to 3-of-5 signatures, hash that script, encode the hash as an address, and give the address to someone for payment. So cool! There is still a long way to go though.
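The commit-then-reveal flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the public keys are dummy bytes, and the commitment uses a single SHA-256 as a stand-in, because real P2SH commits with RIPEMD-160 over SHA-256 and RIPEMD-160 is not available in every `hashlib` build. The opcode values are the standard Script ones.

```python
import hashlib

OP_3, OP_5, OP_CHECKMULTISIG = 0x53, 0x55, 0xAE


def multisig_script(m_opcode: int, pubkeys: list, n_opcode: int) -> bytes:
    """Serialize '<m> <pubkey>... <n> OP_CHECKMULTISIG' as raw script bytes."""
    script = bytes([m_opcode])
    for pubkey in pubkeys:
        if len(pubkey) != 33:
            raise ValueError("expected 33-byte compressed public keys")
        script += bytes([len(pubkey)]) + pubkey  # length-prefixed data push
    return script + bytes([n_opcode, OP_CHECKMULTISIG])


# Dummy keys for illustration only; they are not valid curve points.
dummy_pubkeys = [bytes([0x02]) + bytes([i]) * 32 for i in range(1, 6)]
redeem_script = multisig_script(OP_3, dummy_pubkeys, OP_5)

# User B publishes only this commitment (encoded as an address).
commitment = hashlib.sha256(redeem_script).digest()  # stand-in for HASH160


def reveal_matches(revealed_script: bytes, committed_hash: bytes) -> bool:
    """What a node checks at spend time, before evaluating the script itself."""
    return hashlib.sha256(revealed_script).digest() == committed_hash
```

User A never sees `redeem_script`. Only the commitment is needed to pay, and the full script appears on chain only when User B spends.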
P2SH was a soft fork that took effect on April 1st, 2012. The next step towards usable multisig was Hierarchical Deterministic (HD) Wallets. The full HD protocol was standardized on April 24th, 2014. This was partially implemented in Bitcoin Core v0.13.0, which was released on August 23rd, 2016, and fully implemented in v0.21.0, which was released on January 14th, 2021. Before HD wallets, the Bitcoin client used randomly generated keys to generate addresses. By default, 100 keys were cached. This meant that if you backed up your wallet immediately, then received bitcoins on addresses 1, 2, ... 99, 100, 101, 102, 103, etc, lost your wallet, and then recovered your wallet from the backup, you would not recover the bitcoin on any addresses past 100. You had to keep backing up your wallet at least every 100 addresses.
HD wallets solved this by deriving all keys from a single starting point, called a seed. This wallet therefore does not need frequent backups, because a user only needs to back up the seed to restore all addresses derived from it. It also allowed users to calculate public keys (receiving addresses) without needing the private keys. This is incredibly important for multisignature accounts, where you usually want to store all the private keys offline on airgapped devices. For example, the traditional setup today is to have a bitcoin node online with your master public key. This node verifies transactions sent to you, and generates a new receiving address each time you need to receive payments for goods/services. Your node does this without access to private keys. You have (for example) 5 private keys that are distributed geographically around the world. When you go to spend funds, you need to access three of them (in this 3-of-5 example). Typically, public keys are generated from private keys. Without HD wallets, you would have to access those geographically distributed private keys every time someone wanted to pay you (to generate a receiving address), in addition to every time you wanted to spend (to sign for the funds)! It would be a huge waste of time, and also a security risk to access those keys so often.
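The claim that an online node can derive receiving keys without any private key can be checked with a short Python sketch of non-hardened BIP32 child derivation. This is a simplified illustration, not a wallet: it omits the edge-case checks in BIP32, hardened derivation, and xpub serialization, the seed is a throwaway value, and the function names are my own.

```python
import hashlib
import hmac

# secp256k1 parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time; illustration only)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def compress(pt):
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")


def child_tweak(chain_code, parent_pub, index):
    """HMAC-SHA512 over the parent PUBLIC key, so no secret is needed to compute it."""
    data = compress(parent_pub) + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def derive_child_priv(parent_priv, chain_code, index):
    tweak, child_chain = child_tweak(chain_code, point_mul(parent_priv, G), index)
    return (tweak + parent_priv) % N, child_chain


def derive_child_pub(parent_pub, chain_code, index):
    tweak, child_chain = child_tweak(chain_code, parent_pub, index)
    return point_add(point_mul(tweak, G), parent_pub), child_chain


seed = bytes(range(16))  # throwaway seed, never use for real funds
master = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
master_priv, master_chain = int.from_bytes(master[:32], "big"), master[32:]
master_pub = point_mul(master_priv, G)

# The online node derives from public data only; the offline signer derives from the secret.
online_pub, _ = derive_child_pub(master_pub, master_chain, 0)
offline_priv, _ = derive_child_priv(master_priv, master_chain, 0)
assert online_pub == point_mul(offline_priv, G)
```

The final assertion is the whole point: both sides arrive at the same child key, yet only the offline side ever touched a private key.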
Hardware Wallets were devices that came to market just a few months after the full HD Wallet protocol was standardized (Trezor was released in August 2014). They utilized HD wallets by deriving private keys on the airgapped device, and deriving public keys on their internet connected wallet/node. Trezor was the first Hardware Wallet released, and later Ledger Nano was released in late 2014 (taking a different approach with a proprietary secure element). Ultimately, many Hardware Wallets would be released, all claiming to be secure by allowing the user to securely sign transactions from the device, while being plugged into an infected computer. This is not true, and HWWs should not be used. However, they are still widely used today due to their ease of use.
On August 16th, 2016, Jonas Schnelli posted to the bitcoin-dev mailing list a post titled 'Hardware Wallet Standard'. With multiple hardware wallets came multiple signers, and partially signed data coming and going between these signers. This resulted in vendors adding proprietary and non-standard plugins, all attempting to interact with each other and other coordinators. It was one /giant/ mess! Jonas proposed to create a BIP to standardize it all. (I actually believe that Pavol Rusnak discussed this issue even further back ~2014, but I could not find this post. If you are a bitcoin historian, please contact me!) This proposal would actually be lost in the mountain of work that the devs would take on, and wouldn't be re-surfaced until 2018, with developer Andrew Chow taking the reins. Being assigned the BIP number BIP 174, Partially Signed Bitcoin Transaction Format, PSBT standardized how unsigned or partially signed transactions can be passed around to multiple signers, even without needing the UTXO set (this is very important, because offline signers do not have access to the UTXO set). This is a crucial step in any multisignature wallet, and was implemented in Bitcoin Core v0.17.0, which was released on October 3rd 2018.
On August 24th, 2017, Segregated Witness (SegWit) was activated on the network. SegWit was a protocol upgrade that changed the structure of bitcoin transactions. It had many effects on many different parts of Bitcoin, but for the purpose of this post, we will discuss the direct effect it had on multisig. Before SegWit activated, multisig was done via P2SH, which was secured by hashing the spending conditions first with SHA256, and then with RIPEMD-160. This means that it ultimately was secured with a 160 bit hash. This also means that to steal all of the bitcoin, a participant of the multisig only had to find a collision between a valid address of the multisig script and a script that paid the attacker all of the funds. This would only take on average 80 bits of work (half of 160 bits of protection) to steal. This is not enough protection, and SegWit increased this to 256 bits of security (128 bits on average to steal) in the new SegWit multisig script address type, Pay-to-Witness-Script-Hash (P2WSH).
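The difference between the two commitments is easy to see in the output scripts themselves. The Python sketch below builds both forms and applies the 'half the hash length' birthday estimate used above. The helper names are mine, and the P2SH function takes an already computed 20-byte hash rather than computing RIPEMD-160 itself.

```python
import hashlib

OP_0, OP_EQUAL, OP_HASH160 = 0x00, 0x87, 0xA9


def p2sh_script_pubkey(script_hash160: bytes) -> bytes:
    """OP_HASH160 <20-byte hash> OP_EQUAL"""
    if len(script_hash160) != 20:
        raise ValueError("P2SH commits to a 20-byte (160-bit) hash")
    return bytes([OP_HASH160, 20]) + script_hash160 + bytes([OP_EQUAL])


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """OP_0 <32-byte SHA-256 of the witness script>"""
    return bytes([OP_0, 32]) + hashlib.sha256(witness_script).digest()


def collision_work_bits(hash_bits: int) -> int:
    """Birthday bound: finding any colliding pair costs roughly 2^(bits/2) hashes."""
    return hash_bits // 2


placeholder_script = b"\x51"  # a one-byte placeholder script, for size comparison only
p2sh_output = p2sh_script_pubkey(bytes(20))
p2wsh_output = p2wsh_script_pubkey(placeholder_script)
```

The P2WSH output is 11 bytes longer (34 bytes versus 23), which is the price of raising the collision cost from roughly 2^80 to roughly 2^128 hash evaluations.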
Due to the new address formats created by Segwit, different vendors were creating different standards for deriving addresses from master public keys (Electrum had the xpub/ypub/zpub 'standard' for example, and websites had to be set up to keep track of all the different derivation paths used by various companies). It was a complete mess. This was both a layer violation (key generation and script type should not be made dependent on each other), and also was not scalable with future script types that would be created. Because of this, Bitcoin Core created Output Descriptors. In July 2018, Pieter Wuille wrote the first post on the output descriptor concept, and created the first pull request that built support for it in Bitcoin Core's RPC.
So what are descriptors? A descriptor is a human readable string that represents how the bitcoin are locked, and everything needed to unlock it. Most importantly, descriptors are explicit about the public keys, their derivation paths used, and the addresses that they are producing. This specificity solves the lack of standardness described previously.
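As a rough illustration of what such a string looks like for multisig, the Python sketch below assembles a `wsh(sortedmulti(...))` descriptor. Everything key-related is a placeholder: the fingerprints and the `xpub_PLACEHOLDER_...` strings are invented for readability and are not valid keys, and the checksum suffix that Bitcoin Core appends after a `#` is omitted.

```python
def multisig_descriptor(threshold: int, key_expressions: list) -> str:
    """Build a P2WSH sorted-multisig descriptor string (without checksum)."""
    if not 1 <= threshold <= len(key_expressions):
        raise ValueError("threshold must be between 1 and the number of keys")
    return f"wsh(sortedmulti({threshold},{','.join(key_expressions)}))"


def key_expression(fingerprint: str, origin_path: str, xpub: str, suffix: str) -> str:
    """[master fingerprint/origin path]extended key/derivation suffix"""
    return f"[{fingerprint}/{origin_path}]{xpub}/{suffix}"


# Placeholder signers: hypothetical fingerprints and stand-in xpub strings.
signers = [
    key_expression("aaaaaaaa", "87h/0h/0h", "xpub_PLACEHOLDER_A", "0/*"),
    key_expression("bbbbbbbb", "87h/0h/0h", "xpub_PLACEHOLDER_B", "0/*"),
    key_expression("cccccccc", "87h/0h/0h", "xpub_PLACEHOLDER_C", "0/*"),
]
descriptor = multisig_descriptor(2, signers)
```

The single string states the script type (`wsh`), the policy (2-of-3, keys sorted), where each key came from, and how child keys are derived, which is exactly the information that used to be implied by conventions such as ypub/zpub.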
To fully support descriptors, Bitcoin Core would have to start creating Descriptor Wallets. But the wallet needed to be redesigned for this. In the old model, everything centered around a bunch of keys. The wallet would take a key and convert it to a locking condition/address. But this resulted in one key having multiple addresses, depending on what you wanted to do with it.
Around April 2019, discussions began on how to best deal with this issue. The goal was to separate locking condition management, and key generation, signing, and address ownership logic into a separate 'box' called `ScriptPubKeyManager`, which would provide a standard interface for that logic. (ScriptPubKey is the name for the locking script). There is initially one ScriptPubKeyManager tracked in the wallet, called LegacyScriptPubKeyMan, for the old wallet process. Later, a new ScriptPubKeyManager called `DescriptorScriptPubKeyMan`, for the new descriptor wallets was coded. This new setup is called Wallet Boxes, and was merged into Bitcoin Core on April 26th, 2020.
Essentially, the wallet has now 'flipped' from taking a key and creating a locking condition, to taking a locking condition, and asking what is needed in order to sign for it. This backwards incompatible change was the perfect time to make another backwards incompatible change in the database backend of the wallet - Berkeley DB (BDB). The version of Berkeley DB used in Bitcoin Core was more than 10 years old at the time because of concerns around consensus forks that would be introduced in the network if newer versions were used - this in fact already happened on March 11th, 2013 with Bitcoin 0.8. Due to differences in the number of Berkeley DB locks, this caused a fork that was fortunately solved rather quickly thanks to the early age of the network, the still existing alert system, and the known, centralized mining pool owners. However, it did cause a reorg of 24 blocks and a doublespend. No one wanted this to happen a second time.
With Descriptor Wallets, the backend was replaced with SQLite due to its thorough testing, wide use, and improved compatibility. SQLite makes very few format changes over time - for example, although the descriptor wallets use the newest version of SQLite, Bitcoin Core is still compatible with a version of SQLite from 2013 (quite unlike BDB). This makes Bitcoin Core's wallet more up to date and resilient.
Around this time, two new standards, or BIPs, were being worked on. The first was BIP87: Hierarchy for Deterministic Multisig Wallets. I was actually the author of this BIP, and the adoption of descriptor wallets was one of the biggest motivations for it. Many wallets were still using derivation standards that were script-dependent, which did not make sense for descriptor wallets. Some even had a `cosigner_index` level in the BIP32 path. To solve this, we defined the following 5 levels:
m / purpose' / coin_type' / account' / change / address_index
with the purpose being set to the constant 87'
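A small Python helper makes the five levels concrete. The function name and the range checks are my own choices for illustration; the level names and the constant purpose come from the path shown above.

```python
BIP87_PURPOSE = 87
HARDENED_LIMIT = 2**31  # indexes below this value can carry the hardened marker


def bip87_path(coin_type: int, account: int, change: int, address_index: int) -> str:
    """Format m / purpose' / coin_type' / account' / change / address_index."""
    if change not in (0, 1):
        raise ValueError("change is 0 for receiving addresses, 1 for change addresses")
    for value in (coin_type, account, address_index):
        if not 0 <= value < HARDENED_LIMIT:
            raise ValueError("index out of range")
    return f"m/{BIP87_PURPOSE}'/{coin_type}'/{account}'/{change}/{address_index}"


first_receive = bip87_path(coin_type=0, account=0, change=0, address_index=0)
sixth_change = bip87_path(coin_type=0, account=0, change=1, address_index=5)
```

Notice that nothing in the path says which script type is in use. That information lives in the descriptor, which is the separation of concerns the BIP was written to achieve.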
The other BIP that was being worked on was BIP 129: Bitcoin Secure Multisig Setup (BSMS). While PSBTs standardized the format for partially signed transactions, there was no standardization for how to secure this data, to make sure it had not been tampered with, leaked, etc. This had been a problem for a while, and had resulted in many hacks. There is much data that has to be verified - the multisig configuration, signer membership, script type, derivation paths and number of signatures required; and this has to be correct, not tampered with, not leaked, and persist in a manner that other signers can understand. Otherwise this can result in the theft or ransom of funds, or loss of privacy.
BIP 87 was merged on March 11th, 2020. Very soon after, BIP 129 was merged on November 10th, 2020.
However, there are still many problems left to solve. One of the major issues with multisignature transactions is that when spent, they stand out on the blockchain. Here I am copying from one of the first paragraphs in this post, which I wrote about P2SH (how multisignature transactions were first widely used):
"The next major address type was Pay-to-Script-Hash (P2SH)...With this type of address, the receiver can generate a set of locking conditions...User B generates a set of locking conditions, hashes them, and encodes them as an address. Then User A sends the coins to this address. When User B decides to spend, they must supply the full set of spending conditions...and a signature"
You can view a history of P2SH outputs in the following picture:
In the picture above, you see that the largest portion of the graph (green), is marked as unspent. These are addresses that have received bitcoin but not spent them yet. You are unable to see what type (m-of-n multisignature) those addresses are, because they have only been 'received to' ("User B generates a set of locking conditions, hashes them, and encodes them as an address"), and not 'sent from' yet ("When User B decides to spend, they must supply the full set of spending conditions"). As you read in my GPG explanation - you cannot track a hash backwards to find its full data. The 'received to' bitcoin are User B's hashed set of locking conditions.
However, when those bitcoins are spent, "they must supply the full set of spending conditions". That is why on the graph you can see the history of 2-of-3, 2-of-2, 3-of-5, 3-of-6, and so on multisignature addresses (they are outputs that have been spent). Even worse, note that I said, "they must supply the /full/ set of spending conditions".
So let's say that through more complex scripting, User B set their locking conditions to: [(2 of Sig X, Sig Y, Sig Z) OR (Sig Y + Sig Z + 90 Days) OR (Sig Z + Password)]. This is possible with bitcoin - it is programmable money after all. It doesn't matter which condition ended up being the unlocking script; the entire set of conditions must be placed publicly on the blockchain. This means the whole world would be able to see how you are securing your coins, and would likely be able to track your UTXOs through the blockchain (I doubt there are many other people with that exact set of locking scripts).
Taproot, through a number of combined technologies, set out to fix this issue (among others). Taproot was first proposed on the bitcoin-dev ML by Greg Maxwell on January 23rd 2018, was locked in at block 687284, and activated on November 14th, 2021 (enforced by Bitcoin Core 0.21.1). MAST (Merkelized Alternative Script Trees) is one of these technologies that makes up the Taproot soft fork.
A Merkle Tree is a mathematical structure that hashes different sets of data (here, it will be P2SH scripts), into a single, compact hash known as the Merkle root. If any of the data in the Merkle tree is known, the Merkle root can be used to verify that that specific data is really somewhere in the Merkle tree, even if not all data in the tree is known. This solves the issue of having to include the entire set of locking conditions in the final spending transaction.
For example, look at the below picture:
As with normal P2SH, the full hash is already known - this is the address (and in this example, is Hash(1,2,3) ). When going to spend, rather than revealing all possible scripts, all that is necessary is the spending condition used, and its location in the tree (its Merkle Path). So if the 2 of 2 Schnorr multi-signature is used to spend the bitcoin, the spender will reveal that script, Hash(1,2) and signature over the data. None of the other spending possibilities are revealed. Nodes on the network merely have to calculate the Hash of the script used (2 of 2 Schnorr multi-signature) -> Hash(3), hash it with the rest of the hidden data (Hash (1,2) ) -> (Hash (1,2,3) ), and ensure that it equals the merkle root/address originally presented (in this scenario, it has). If so, the signature should be valid as well, and the bitcoin can be spent.
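The verification step described above fits in a few lines of Python. This is a simplified sketch that mirrors the three-script example: it uses plain SHA-256 and a fixed left/right order, whereas Taproot itself uses tagged hashes and its own ordering rules, and the script contents are placeholder strings.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Placeholder spending conditions; script 3 is the one that gets used.
script_1 = b"placeholder script 1"
script_2 = b"placeholder script 2"
script_3 = b"2 of 2 Schnorr multi-signature (placeholder)"

# Committing: only merkle_root ends up in the address.
hash_12 = h(h(script_1) + h(script_2))   # Hash(1,2)
merkle_root = h(hash_12 + h(script_3))   # Hash(1,2,3)


def verify_spend(revealed_script: bytes, sibling_hash: bytes, root: bytes) -> bool:
    """Recompute the root from the revealed script and its Merkle path."""
    return h(sibling_hash + h(revealed_script)) == root
```

A spender who reveals `script_3` and `hash_12` passes `verify_spend`, and at no point do `script_1` or `script_2` need to be disclosed. In a deeper tree the Merkle path would contain one sibling hash per level instead of a single one.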
This is revolutionary, because bitcoin can now be locked up in complex trees of spending scripts, but stored in short addresses that all other nodes don't have to know or understand - and when spent, only the spending condition used has to be revealed!
In the previous paragraph, I introduced a topic that has yet to be discussed - Schnorr signatures. That is another technology that makes up the Taproot update. Bitcoin requires the use of a digital signature algorithm, and has used ECDSA (Elliptic Curve Digital Signature Algorithm) in its entire history up to Taproot. Schnorr signatures offer a number of improvements over ECDSA, such as lack of malleability, linearity, improved verification speed, and the existence of a more formal security proof. Because of Schnorr's linearity, multiple keys can be 'added' together. For example, in a multisignature transaction (with 1 input - this does not yet work cross-inputs), Alice, Bob, and Charlie can all add up their public keys to make a 'David' public key; and when signing, each of them produces a partial signature with their own private key, and these partial signatures are combined into a single 'David' signature - and the public viewing the blockchain has no way of knowing whether an individual named 'David' himself just signed a normal transaction, or Alice, Bob, and Charlie made a 3-of-3 transaction. (The cryptographic protocol that allows this is called MuSig(2)).
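The linearity can be demonstrated with a toy Python sketch on secp256k1. Please read it as an illustration of the algebra only: it adds keys and nonces naively, which is insecure against rogue-key attacks, it uses tiny fixed secrets, and its challenge hash is not the BIP340 one. MuSig2 adds key coefficients and a careful nonce protocol on top of this idea.

```python
import hashlib

# secp256k1 parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, pt):
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def challenge(nonce_point, pubkey, message: bytes) -> int:
    """Toy Fiat-Shamir challenge over x-coordinates (not the BIP340 tagged hash)."""
    data = nonce_point[0].to_bytes(32, "big") + pubkey[0].to_bytes(32, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


# Toy secrets and nonces, far too small and predictable for real use.
private_keys = {"alice": 11, "bob": 22, "charlie": 33}
secret_nonces = {"alice": 101, "bob": 202, "charlie": 303}

david_pub = None
david_nonce = None
for name in private_keys:
    david_pub = point_add(david_pub, point_mul(private_keys[name], G))
    david_nonce = point_add(david_nonce, point_mul(secret_nonces[name], G))

message = b"spend the 3-of-3 output"
e = challenge(david_nonce, david_pub, message)

# Each signer computes a partial signature with their OWN key; only the sum is published.
partial_sigs = [(secret_nonces[n] + e * private_keys[n]) % N for n in private_keys]
david_sig = sum(partial_sigs) % N

# An observer verifies one ordinary-looking signature against one ordinary-looking key.
assert point_mul(david_sig, G) == point_add(david_nonce, point_mul(e, david_pub))
```

Nobody ever learns the sum of the private keys. The verification equation holds because the sum of the partial signatures is a valid signature for the sum of the public keys.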
With Schnorr, a key can also be 'multiplied'. So for example: x(David's public key) can be signed by x(David's signature) - it doesn't matter what x is! It all looks the exact same. Very cool. So if you replace 'x' with the Merkle Root discussed in the previous paragraphs on MAST, this allows the combination of both technologies. Let's make an example:
You have Alice, Bob, Charlie in a multisig transaction. The spending policies are: [(Alice + Bob + Charlie) OR (2 of Alice, Bob, Charlie) OR (Bob + Charlie + 90 Days) OR (Charlie + Password)]. When Alice+Bob+Charlie combine their signatures into one, we will call that 'David'.
So if all three individuals decide to sign cooperatively, essentially, their public keys are combined into one and their signatures are combined into one, and it will look like 'David' signing a regular transaction. If that cooperative close never occurs, you also have a backup MAST structure of spending policies; these are the remaining spending conditions: [(2 of Alice, Bob, Charlie) OR (Bob + Charlie + 90 Days) OR (Charlie + Password)]. This entire set of conditions is hashed into a tree, upwards until a Merkle root is reached, just like in the MAST example. That root is then used to multiply 'David's' public key and signature, just like in the [ x(David) ] example. This has no effect on the signing when Alice, Bob, and Charlie all agree to sign the funds in a 3 of 3 transaction (because x(David) looks the same as David signing transactions, which also looks the same as Alice, Bob, and Charlie signing transactions together).
After the Merkle root is used to multiply 'David's' public key and signature, this means that when any one of the three spending conditions in the tree needs to be used, the following need to be placed on the blockchain to activate the MAST structure: David, the Merkle Root, and the spending script used. For example, all that needs to be revealed to spend with Charlie+Password is (David+Merkle_Root+'Charlie_Password_Script'). The combination of MAST and Schnorr allows many scripts to be set up such that, in the best case scenario, the spend looks like a normal transaction, and in the 'worst case' scenario, only the one script used needs to be revealed. Compare this to P2SH/P2WSH, which had to reveal /all/ scripts!
Everything above is a good summary of all multisig related technologies up to the current date. So what is being worked on now? The technology most likely to be merged next is called Miniscript. Bitcoin uses a scripting system for transactions, which is procedural, stack based, and processed from left to right, with no loops. But that does not make it easy - we are nowhere near close to realizing Script's full potential, and even writing scripts for what may seem like easy applications can become very complex. It is hard to verify their correctness and security. A lot of the time, it takes a lot of ad-hoc development work, which results in brittle template-matching for applications and lots of custom code for each specific use case. Miniscript solves this! Miniscript allows wallets to be much more dynamic about the scripts they use. It is a language that allows for more comprehensive and compatible use of Bitcoin Scripts, which enables generic signing, composition, and analysis. Miniscript was invented in the Summer of 2018, and is now split into 3 separate PRs in Bitcoin's repo, one of which is already merged.
Here is an example:
A user and a 2FA service need to sign off, but after 90 days the user alone is enough.
<key_user> OP_CHECKSIGVERIFY <key_service> OP_CHECKSIG OP_IFDUP
As you can see, miniscript is much easier to read.
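One detail that raw Script hides and a higher-level notation makes visible is the 90-day delay itself. Relative timelocks of this kind are usually counted in blocks, so '90 days' has to be converted and then encoded as a Script number. The Python sketch below shows that arithmetic, assuming the usual 10-minute block target. In Miniscript the same delay would simply be written with the `older(...)` fragment. The helper names are mine.

```python
BLOCKS_PER_DAY = 24 * 60 // 10  # 144 blocks per day at one block every 10 minutes


def days_to_blocks(days: int) -> int:
    return days * BLOCKS_PER_DAY


def encode_script_number(value: int) -> bytes:
    """Minimal little-endian encoding of a positive integer, as pushed in Script."""
    if value <= 0:
        raise ValueError("this sketch only handles positive numbers")
    encoded = bytearray()
    while value:
        encoded.append(value & 0xFF)
        value >>= 8
    if encoded[-1] & 0x80:
        encoded.append(0x00)  # keep the sign bit clear so the number stays positive
    return bytes(encoded)


ninety_days_in_blocks = days_to_blocks(90)             # 12960
timelock_push = encode_script_number(ninety_days_in_blocks)
print(ninety_days_in_blocks, timelock_push.hex())      # 12960 a032
```

Reading `older(12960)` is a lot friendlier than spotting the bytes `a032` in front of an opcode and working out by hand that they mean roughly 90 days.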
There is then essentially a laundry list of PRs and issues that need to be fixed for the 'perfect' multisignature wallet to be able to be set up through Bitcoin Core's GUI. They can be tracked through this issue on Github: https://github.com/bitcoin/bitcoin/issues/24861
These are essentially:
*Once miniscript is merged, taproot support will have to be added in.
*BIP87 will need to be merged into Bitcoin Core (support for 1-of-N keys also).
*Descriptors support for both change and receive addresses in a single string
*QR Code Scanner
*Point out errors in bitcoin address
*Warn when sending to an already used address
*MuSig2 (the protocol that enabled Alice,Bob,Charlie = David) needs to be added to PSBT
*MuSig2 needs to be added to Bitcoin Core
*GUI Support for everything in Bitcoin Core
I have a $10,000 bounty to help implement all of this. A signed statement of mine can be viewed here.
I want to thank all of the many hardworking and dedicated Bitcoin developers who have made all of this possible.
Copyright © 2022 Robert Spigler - All Rights Reserved.