I have 7 Hufflepuff on my account and you have 3 Hufflepuff on your account. Then I send you 1 Hufflepuff.
You’re already familiar with the account model, this is how the banking system works, this is how Ethereum works, but this isn’t how Bitcoin works. Bitcoin uses UTXO model instead. Why? That’s an interesting topic, but it isn’t relevant for this article, so just take it as is.
I have 7 bitcoins in my wallet and you have 3 bitcoins in your wallet. Then I send you 1 BTC.
I built a transaction that has one input: 7 BTC, and two outputs: 1 BTC, which is yours and 6 BTC, which is a change coming back to me.
Now you have two coins: 3 BTC and 1 BTC. Let’s call these coins Transaction Outputs — TXOs. In fact these coins are unspent, so we may be more specific and call them Unspent Transaction Outputs: UTXOs.
Coinjoins are collaborative Bitcoin transactions. Multiple people are participating with inputs in a transaction and that’s what we call a coinjoin. Let’s make one!
But there’s a problem. It’s trivial to tell the sub-transactions by correlating the amounts:
6 -> 6
3,1 -> 4
Chaumian Coinjoins, specified and implemented by the author of this article defend against amount correlation by pre-agreeing upon a common denomination: 3 BTC.
This way nobody can tell where the 3 BTC UTXOs are coming from. Although we have privacy now, this is still not perfect. This suffers from three main problems.
- Change Creation: With amount correlations, passive observers can tell that the 1 BTC change output comes from the 1 BTC input.
- Blockspace Efficiency: Chaumian Coinjoin algorithms create a lot of outputs.
- Limited Transaction Structure: Because of design limitations, the coordinating party of Chaumian Coinjoins can also tell that the 3 BTC and 1 BTC inputs were owned by the same user.
To tackle the coordination issue, we came up with a new research, called WabiSabi, which enables us to create any kind of coinjoins without privacy leaks. We can now add as many inputs and create as many outputs as we want to and we don’t even need to agree upon common denominations. At least not for coordination reasons.
This is far from trivial. Two years of bloody cryptographic research went into solving this problem: how to create trustless coinjoins with arbitrary amounts? In the beginning I didn’t think it’s even possible, but one thing followed another and viola, it’s solved! Now we have complete flexibility regarding coinjoin amount organization, so what do we do with such great power?
So far we worked on well-defined problems. Centralized mixers were stealing user funds and can deanonymize them...so here comes coinjoins to the rescue. Coinjoins are deanonymizable by amount correlations...so here comes Chaumian Coinjoins to the rescue. Chaumian Coinjoins are cryptographically limiting the coinjoin transaction structure...so here comes WabiSabi to the rescue.
And now we bumped into the unfathomable complexity of the real world. Countless variables lead to ill-defined problems. There was no logical way to progress forward from the bottom up, so a strategy change was needed. We decided to apply the oldest tool humans have: intuition, to come up with many intuitive algorithms to structure coinjoins and see which one yields the best results.
Dynadenomination Coinjoins utilize not only multiple denominations, but also multiple denomination systems, randomly and dynamically. I expect this scheme to be improved until Wasabi 2.0 release, ideas and feedback is very much encouraged! I think this already has all the techniques which we’ll apply, so it’s developed enough to talk about. First let me describe the algorithm, then proceed further with analyzing it.
We created a simulation of our dynadenomination coinjoins in the following way:
- First we took a few months of real world Wasabi Wallet fresh coinjoin inputs. Fresh signifies these have never been mixed before. These are the amounts users come into mixes in the real world.
- We continued our preparation by randomly grouping these fresh amounts together and called these groups users, then executed our mixing algorithm.
- Then we did the same grouping again, but only 70% of the simulated inputs came from fresh amounts, the rest came from the result of the first coinjoin simulation. This is how remixes were simulated. 30% is also based on real world Wasabi Wallet remix statistics.
- Finally executed the mix and wrote out the results.
We can see, in our simulation 100 user participated with 300 inputs altogether. Some user participated with less than 3 inputs, some did with more. In the real world we cannot tell the number of users, but this is a simulation, the simulation created these groupings, so we know exactly the links because of that. This coinjoin resulted in 404 outputs, with the exception of two outputs, all of them gained a mathematically verifiable anonymity set. In average an output has 5 anonymity set. This would be already cool by itself, but the privacy guarantees don’t stop with naïve anonymity set calculations. But first, let me describe the above algorithm.
Imagine you’re a user. You know the group of input amounts you’re participating in the coinjoin and you know the input amounts of others.
You also create a big list of denominations from multiple denomination systems. In this case we’re working with powers of 2 (
1, 2, 4…), powers of 3 (
1, 3, 9…), powers of 3x2 (
2, 6, 18…), powers of 10 (
1, 10, 100…) and preferred value series (
1, 2, 5, 10, 20, 50…)
To finish the setup, let’s create a big list to be our possible denominations:
In order to decide what outputs you want to create, first create a frequency table by greedily breaking down each and every input amount with our multidenomination system. For example, the inputs
29 would be broken down into
27, 1 and
27, 2 respectively. In that case our frequency table holds the following elements:
27:2 . Let’s agree that the current coinjoin round’s denomination will be those denominations that have occurred at least twice in the breakdown. In our example it’d be only
27, which not only shows that the example is too simple, but also that this algorithm doesn’t work for very few inputs, it only starts to make sense from about 50 inputs.
Finally, take the sum of your inputs and break it down into the selected frequent denominations, but this time not greedily, but rather somewhat randomly. For example if we found frequent denominations
1, 4, 6 and the sum of our inputs is
8 then greedily you’d break it down to
6, 1, 1. But with some optimization, you could find that breaking down your sum into
4, 4 would be better, because it creates less outputs and that’s highly desirable, because blockspace is scarce. However finding optimal breakdowns is an exponentially complex problem, so we can only use heuristics here and these heuristics should be spiced with some randomness for privacy reasons.
Privacy Guarantees Of Wasabi Wallet 2.0
Wasabi 2.0 inherits the security and privacy guarantees of Wasabi 1.0, which is: nobody can steal your coins, the coordinator cannot correlate your inputs with your outputs and the naïve anonymity set guarantee, which is what you can see on the blockchain, with your own eyes, you can count the number of equal outputs that nobody can tell apart.
Sub-Transaction Privacy Guarantee
However Wasabi 2.0 seems to weaken the anonymity set privacy guarantee by creating smaller equal amount outputs. Wasabi 1.0 ideally creates 100 equal outputs, Wasabi 2.0 does much less. At first it may seem like a privacy tradeoff to avoid change creation and inflating outputs, which it was intended to be that at first. But upon a deeper look we observed the privacy guarantees are much larger, because of the huge number of sub-transactions. Take an output from the following output group:
There are 7 occurrences of 0.01 BTC output.
You may think the anonymity set of the chosen output is 7, but in reality it’s just a minimum estimation. It’s much higher than that! It could potentially come from many different inputs, not just 7 of them.
To get a sense of this, take the following transaction:
1, 1, 2, 4 -> 2, 2, 4
4 only appears once on the output side. It's naïve anonymity set is 1. But it could also be that
1, 1, 2 came from the same user. Therefore its anonymity set is 2, because there are 2 different groupings of the inputs where the output could have come from.
The Dynadenomination Coinjoin algorithm started by breaking down each and every input into denominations and took the most frequent ones. This was not only helpful in achieving probabilistic equalities on the output side, but these are also great numbers to increase their combinations such as they add up to many input combinations. In other words the final outputs end up adding up to more valid sub-mappings (sub-transactions) than if they were to be chosen randomly. Furthermore I’m talking in exponentials here. The combinations of inputs and the combinations of outputs are huge numbers.
Computational Privacy “Guarantee”
Fully analyzing Wasabi 2.0 coinjoins is computationally hard and will probably be impossible for decades to come because a combinatorial complexity explosion is happening when we try to find all the sub-transactions of a Wasabi 2.0 coinjoin.
A brute force algorithm would have to
- Take all the possible combinations of how inputs can be grouped together:
2^number of inputs.
- Take all the possible combinations of how outputs can be grouped together:
2^number of outputs.
- Finally check for equality of every combination to find a sub-transaction. This would take
2^number of inputs * 2^number of outputsiterations.
In the case of our example transaction: 100 inputs, 300 outputs, it’d be:
In a 2017 research paper, Anonymous CoinJoin Transactions with Arbitrary Value, the authors measured the processing time it took them to find all valid mappings of coinjoin transactions: forever. Later, but independently, the guys working on CashFusion observed the same.
That’s all I note here, because our cryptographers will kick my ass if I try to make any claims on Wasabi relying on computational privacy. Nevertheless I still consider this a spicy addition.
Wasabi 2.0 provides decently verifiable privacy based on naïve anonymity set calculations and a huge probabilistic privacy based on the inflation of sub-transactions. But this is unanalyzable because of the spicy third guarantee: computational privacy, which is theoretically unsound, but practically relevant.