On predictive UTXO management

A few years back, the predecessor to what is now the Simplexum Payment Engine was simply called ‘btc-wallet’, and one of the problems it faced was that, with a coin selection algorithm that preferred to build cheaper transactions, small UTXO tended to accumulate.

This meant that when you needed to send a bigger payment, the resulting transactions were large, as they had to bundle all that small UTXO, and you might even have to send several transactions to satisfy the requested amount, due to transaction size limitations.

Sending a big transaction means that you have to pay a larger fee to miners, and as you usually have to send it right now, waiting for the fee to decrease was often not an option.

Organic consolidation

An obvious solution was to perform periodic UTXO sweeps at a time when the network fee is low, but this would have to be done manually, and the swept amount would be ‘frozen’ until the sweep transaction confirms, unless you send your payments using zero-confirmation coins, which may also be undesirable.

To address this, I devised and implemented a method to perform UTXO consolidation organically: when the network fee is low, newly created transactions tend to use more UTXO to send the same amount, and when the network fee is high, transactions are kept as small as possible, to conserve the fee.

This has shown good results over the years, and you only need to monitor the median fee in the network and adjust your ‘fee per full coin’ setting periodically, when the median fee level shifts for a prolonged period.

I recently published our method in a post¹ on reddit.

Predictive management

The response from the community was positive, so I decided to do a Google search to see if the link to the post was shared anywhere else on the internet. I discovered that BitGo seems to have implemented what looks like a similar method — they call it ‘Predictive UTXO management’ in their press release:

Predictive UTXO Management reduces the overall cost by minimizing transaction sizes at high fee rates, while automatically sweeping up and processing many small fragments of coins when fees are low. As a result, more coin fragments are spent at lower fee rates, reducing transaction fees overall

That’s great, as this means more wallets will have their UTXO set optimized, and the global UTXO set will be healthier. I encourage more wallets and wallet providers to implement these types of optimizations.

Details of the method BitGo announced are not available, and it may be more sophisticated than ours. I wouldn’t call a method that merely deals with UTXO consolidation ‘Predictive’, hence I chose ‘Organic’.

With the formula we use:

optimal_tx_vsize = utxo_sum * fee_per_full_coin / fee_per_kb

You may base your choice of fee_per_full_coin on historical data for the median fee, or try to predict the future median fee, but you will be doing that prediction as a separate action, one that is not inherent in the formula. And such a prediction will be about future UTXO consolidation performance, not directly about the fitness of your UTXO set for future payments.
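As an illustration, here is a minimal sketch of the formula in Python. The unit reading — utxo_sum in whole coins, fee_per_full_coin in satoshi per coin spent, fee_per_kb in satoshi per kvB, result in kvB — is my assumption, not something fixed by the formula itself:

```python
def optimal_tx_vsize(utxo_sum, fee_per_full_coin, fee_per_kb):
    # Direct transcription of the formula above. With utxo_sum in whole
    # coins, fee_per_full_coin in satoshi per coin, and fee_per_kb in
    # satoshi per kvB, the result is in kvB (assumed unit reading).
    return utxo_sum * fee_per_full_coin / fee_per_kb

# When the network fee drops, the target size grows, so coin selection
# may consume more small UTXO to send the same amount:
low_fee_target = optimal_tx_vsize(1.5, 10_000, 2_000)    # larger target
high_fee_target = optimal_tx_vsize(1.5, 10_000, 20_000)  # smaller target
```

Note how the target shrinks tenfold when fee_per_kb rises tenfold — exactly the organic consolidation behavior described earlier.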


What would be ‘Predictive’?

Here’s my view on how a system for UTXO set optimization might work, which I would call ‘Predictive’.

This will not contain definitive answers, but I will try to reason about the ways this problem can be addressed, to lay out directions of further research for our team², and to incite a discussion about the topic.

Context

The presented view will be in the context of a service that accepts and sends bitcoin payments, and wants to minimize the network fee paid in the course of its operation.

The payments this particular service handles will most likely follow a certain pattern. It will depend on the type of the service, on the behavior of its customers, on its payment sending policies, and other factors.

The number and value distribution of payments will vary depending on the season, the day of the week, the time of day, and the market situation. Market situations are hard to predict, but payment activity characteristics that depend on time periods can be modeled with historical data, which each service almost certainly collects, and predictions can be based on that model.
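A sketch of how such a time-based model could look, assuming payment records of the form (timestamp, value); bucketing by weekday and hour is one possible choice, not a prescription:

```python
from collections import defaultdict
from datetime import datetime

def build_activity_model(payments):
    """Bucket historical payments by (weekday, hour).

    `payments` is an iterable of (datetime, value) tuples; this record
    format is an assumption made for the sake of the example.
    """
    buckets = defaultdict(list)
    for ts, value in payments:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return buckets

def expected_for(buckets, when):
    """Rough estimate for a moment in time: (payment count, mean value)."""
    values = buckets.get((when.weekday(), when.hour), [])
    if not values:
        return 0, 0.0
    return len(values), sum(values) / len(values)
```

A real model would also weight recent data more heavily and account for seasonality, but the shape of the idea is the same.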

UTXO sources

The sources of UTXO for the wallet of a service that both accepts and sends payments are:

  • Incoming payments from their customers
  • Change outputs from outgoing payments
  • Top-up of hot wallets from cold wallets

You generally cannot control what payments your customers send to you — but you can try to predict the pattern of upcoming deposits: the number of incoming coins, their values, and their timing. And you can build a probability distribution from that.

You can completely control the transactions from the cold wallet, but the nature of the cold wallet is such that you want to touch it as rarely as possible, and accessing it just for potential fee savings is probably overkill.

The most suitable moment for adjusting the UTXO set value distribution is when you build a transaction for an outgoing payment — you have some degree of freedom in which UTXO you will consume to collect the required sum, and in the value of the new coin you will send back to your wallet as a change output.

Optimization

The most economical way to spend bitcoin is to use the least acceptable number of inputs and either generate no change output, or generate a change output that will not be expensive to spend in the future. The larger a UTXO's value is, the cheaper it is to spend relative to that value. On the other hand, generating big change outputs means that a bigger portion of your balance will be in a zero-confirmation state.

Once you have worked out the probability distribution for upcoming outgoing payments, you can choose the most probable value and check whether there are UTXO in your set that fit that out-payment. If the next payment will be sent soon, it is likely that the network fee will not move much, and you can even predict the exact value of the UTXO to create as a change output, so that the next payment can be sent with a transaction of optimal size.
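One way to apply this: when the fee is not expected to move, pick the change value so that the next predicted payment can be sent as a minimal one-input, two-output transaction. A sketch, where the 141-vbyte figure is the approximate vsize of a 1-in/2-out P2WPKH transaction and is an assumption about the wallet's output types:

```python
# Approximate vsize of a 1-input, 2-output P2WPKH transaction (assumed).
VSIZE_1IN_2OUT = 141

def planned_change_value(predicted_payment, fee_rate_sat_per_vb):
    """Change value (in satoshi) that lets the change output alone fund
    the predicted payment plus the fee of an optimally small transaction."""
    fee = int(VSIZE_1IN_2OUT * fee_rate_sat_per_vb)
    return predicted_payment + fee
```

For a predicted 500,000-satoshi payment at 10 sat/vB, this asks for a 501,410-satoshi change output.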

That change output will likely have to be spent before it confirms, as the next payment may come soon, and in that case you have to watch out for ‘too-long-mempool-chain’ errors from your bitcoind.

You will not always hit the predicted conditions, but if the probabilities for the target UTXO sizes are high enough, you may get good fits often enough to save on fees.

You can create a probability distribution for the next N out-payments. N will depend on your UTXO churn — how fast the UTXO set in your hot wallet completely changes. For payments that are not immediate, you can afford to wait for your change output to confirm, but your predictions may not be as good. This is because you need to predict the network fee further in the future, and new UTXO from incoming payments will also be a factor.

This uncertainty means you will have to store a range of slightly different UTXO that may fit your most probable out-payments, depending on the future situation.

UTXO Stacks

You can go with an approach of having ‘stacks’ of different-sized UTXO for the most probable out-payments. The UTXO in each ‘stack’ can all be slightly different to account for fee variation. The values of UTXO can also be the same within each ‘stack’, but then you will have to use some ‘free-standing’ UTXO in your wallet to make up for fee variations.

Same-sized UTXO are convenient in that you can even top up these ‘stacks’ when you send funds from your cold wallet, and maybe, if you analyze your coin selection results over time, you can decide which UTXO sizes you most certainly want to have in your set.
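Bookkeeping for such ‘stacks’ can be as simple as maintaining target counts per fixed size; the sizes and counts below are placeholders, not recommendations:

```python
from collections import Counter

TARGETS = {  # UTXO value in satoshi -> desired count in the 'stack'
    100_000: 20,
    1_000_000: 10,
    10_000_000: 4,
}

def topup_plan(current_utxo_values):
    """How many UTXO of each target size the next cold-wallet top-up
    (or change-output planning) should create to refill the 'stacks'."""
    have = Counter(current_utxo_values)
    return {value: max(0, want - have[value])
            for value, want in TARGETS.items()}
```

The same deficit report can feed either a cold-wallet top-up transaction or the change-output planner, whichever touches funds less.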

The approach of analyzing coin selection results is interesting in that there is much less uncertainty, because you are working with existing data rather than making direct predictions.

You can ask the question, “Historically, when we built the most economical transactions, which UTXO sizes turned out to be the most convenient?”, and after an analysis, you could say, “For best results, based on our historical usage patterns, we need to maintain this number of UTXO of these sizes”.
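That analysis can be sketched as a pass over a coin selection log, keeping only the ‘best hits’ — here taken to mean transactions that produced no change output, which is one possible definition — and counting which rounded input values made them possible. The record format is an assumption:

```python
from collections import Counter

def useful_sizes(selection_log, round_to=10_000):
    """`selection_log`: iterable of (input_values, had_change) tuples,
    one per historical transaction (assumed record format)."""
    hits = Counter()
    for input_values, had_change in selection_log:
        if not had_change:  # 'best hit': no change output was needed
            for value in input_values:
                hits[round(value / round_to) * round_to] += 1
    return hits.most_common()
```

The top of the returned list is a candidate set of sizes worth maintaining in the wallet.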

Potential risks

Storing extra UTXO in hot wallets to make building optimal transactions easier is a viable approach, but it also increases risk: you are storing more bitcoin in your hot wallet than necessary. Even if you implement spending limits for your hot wallet and secure it to the max — it is still an online wallet. So you need to consider your risk tolerance when going with this strategy. You also need to make sure your coin selection algorithm treats these ‘special’ UTXO differently from others; otherwise it may spend them before the payments they are destined for are due, and the potential fee savings will not materialize.


Summary

In my view, the approaches to predictive UTXO management may be:

  • Build probability map for future payments, and make your change outputs fit that map. This is the most iffy and uncertain approach.
  • Build a ‘most useful’ UTXO size map based on your coin selection’s ‘best hits’, and try to maintain the presence of these useful UTXO in your set, with change outputs or periodic top-ups.
  • Maintain ‘stacks’ of fixed UTXO sizes for most frequent payment amounts, and use a variety of free-standing UTXO in your wallet to cover the difference in network fee since the time you calculated these fixed sizes.

Consider your risk tolerance for an increased hot wallet balance due to the ‘special’ UTXO in it, and try not to spend them prematurely.


Footnotes:

  1. ^ Organic UTXO consolidation (in the course of regular payments): https://www.reddit.com/r/Bitcoin/comments/9kxn01/organic_utxo_consolidation_in_the_course_of/
  2. ^ The main focus for Simplexum Payment Engine has been the elimination or mitigation of the risks of theft (by hackers or malicious employees), flexibility for different business use cases, and building abstractions that allow developers who use its API not to bother about underlying cryptocurrency protocol details. As we grow, we will certainly have to increase our efforts on improving coin selection efficiency, and that would probably mean assigning a dedicated team to this research.