Rootstock (RSK)’s Peg-out efficiency improvement — Segwit (part 2/3)

Published in

RootstockLabs: Research & Technology

11 min readOct 18, 2022

Co-authors: Ramsès Fernàndez-València and Nicolás Vescovo

In the previous post on our RSK’s Peg-out efficiency improvement series, I described the main actors of the architecture, the limitations of the current design and a brief introduction to Segwit versions.

In this post, I go even more in-depth and describe the proposals studied: Segwit v0 and Segwit v1/Taproot (FROST, ICE-FROST and MuSig2). And I also explain the modifications required to implement each scheme.

In the next and last post, I will wrap up with the most interesting part: a comparison of the proposals and a conclusion.

Studied Proposals

Segwit v0

This proposal only requires minor changes to the current RSK Peg implementation. That is why we will not do any specific analysis in this section as it won’t require any significant change in the signature scheme. It is still a regular “m of n” multi-signature scheme in which a group of m participants signs the transaction, and the data published on-chain is n public keys and m signatures.

The implementation of this proposal involves moving the redeem script and the signatures to the witness data section, which drastically reduces the fee cost of the transaction.

The main advantages and drawbacks of this mechanism will be discussed in detail when compared with the rest of the proposals.

MuSig2 (Segwit v1)

Musig2 is a multi-signature protocol for Schnorr signatures. The proposal MuSig2 cannot be considered a naive multi-signature scheme as it shares some similarities with aggregate signature proposals. Although MuSig2 is not a threshold, thanks to Taproot MAST (Merkelized Alternative Script Tree) feature, it could be adapted to be one

Amongst the highlights of the scheme, one finds:

It’s secure under parallel signing sessions.
The possibility of key aggregation.
It outputs ordinary Schnorr signatures.
Reduces the communication rounds to two, in contrast to the earlier proposal MuSig, which requires three rounds.
Has signer complexity similar to ordinary Schnorr signatures.

In MuSig2 each participant has a pair of keys, one private and one public key. They are generated by taking x ∈ ℤ at random and computing X = gx for g a generator of a cyclic group G.

The scheme uses two signing rounds to produce m intermediate signatures for the m users, and a final combination of these signatures creates a single final signature. This signature is verified with a public key generated by the combination of the public keys of the users.

When this scheme is used to authenticate a blockchain transaction, the only data published on-chain is one public key and one signature, which is indistinguishable from a usual Schnorr signature, therefore the costs are much lower than a multi-signature and equal to those generated by a threshold mechanism. Furthermore, the key generation in MuSig2 is simpler than the key generation in threshold schemes.

Like I mentioned at the beginning, although MuSig2 is not a threshold, thanks to Taproot MAST (Merkelized Alternative Script Tree) feature, it could be adapted to be one. So instead of one public key generated by the combination of the public keys of the users, we could build a Merkle tree of keys with the combination of the signers grouped by m .

The idea of including a Merkle tree with the groups of admissible signers can be used with MuSig2, BLS, or any other multi-signature scheme. This feature is included in Taproot under the hood of Merkelized abstract syntax trees (MAST), which are a method of using Merkle trees to store the various user-selected conditions that must be fulfilled for the encumbered bitcoins to be spent.

When we compare MuSig2 threshold variant with other threshold signature schemes, we can identify a negative aspect against it which requires exactly the chosen m signers to participate (m = n) and makes MuSig2 a rigid solution. If one of the signers fails to sign, then another group of m signers must be chosen and start the process all over again.

FROST (Segwit v1)

The threshold proposal FROST uses a semi-trusted signature aggregator (SA) whose role allows signers to reduce the network overhead. The aggregator is used for coordination and it doesn’t have access to privileged information. The SA role can be played either by a participant in the protocol or by an external 3rd party. The SA is trusted to identify misbehaving participants and is also responsible for publishing the group’s signature at the end of the protocol. In case of a misbehaving SA, the protocol remains secure against adaptive CCA. A malicious SA has the power to perform DDoS attacks, or falsely report malicious participants, but cannot learn the private key or cause improper messages to be signed.

Key generation in FROST is done in 2 rounds and relies on Pedersen’s distributed key generation algorithm, whereas the singing algorithm builds upon additive secret sharing.

Furthermore, signing operations make use of binding techniques to avoid forgery attacks. The signing process consists of two phases: a pre-processing stage and a single-round signing phase, but one may combine them to get a single-round signing phase.

In the FROST scheme the only thing a malicious actor or signer could achieve is the invalidation of the complete process by sending an invalid signature in the last step. It cannot cheat other participants nor the SA.

ICE–FROST (Segwit v1)

To improve the robustness of FROST, an Identifiable Cheating Entity (ICE) mechanism is introduced that allows the identification and exclusion of malicious participants in FROST.

The ICE mechanism uses symmetric encryption algorithms at the cost of more interaction between users. In the proposal, the shares are encrypted using a shared secret key derived from Diffie-Hellman key exchange between sender and receiver.

In the ICE mechanism, each participant issues a complaint against a malicious action instead of aborting the protocol. The complaint must be validated by honest users.

In practice, the sender establishes a secure channel by encrypting shares and broadcasting the encrypted values. One way to prove that the sender of a share has cheated is for the receiver to reveal the received encrypted value to every participant by revealing the decryption key. This approach requires the decryption key to satisfy the properties below:

The decryption key associated with the communication of each pair of users should be unique.
The decryption key should be verifiable by other participants.

Remarks on the Proposals

Depending on the scheme chosen, the situation changes:

Choosing FROST avoids the requirement of using a Merkle tree containing the information of admissible signing groups since it is based on secret sharing: all the possible signers (n) can participate in the secret-sharing phase, and then the system requires enough pegnatories (m) to perform the signature.
It is well known that FROST is vulnerable to malicious participants (that would force abortions in the process) but detecting them can be done at the early stages of the protocol: in the key generation stage, the end of the first round requires a sequential checking that would report an error and the author of that error. Similarly, in the signing process, the third step of the algorithm requires a similar sequential checking. In the worst case, when the signature aggregator comes into the game, it will perform one further check, at the end of the signing process.
The possible main drawback of FROST is the key generation phase, which relies on secret sharing. This stage may present difficulties at the implementation level, although the underlying idea is not hard. In particular, the interaction of several participants may be hard to manage.
Choosing MuSig2 will require using a MAST, containing the admissible signing groups. Computational complexity implications of turning MuSig2 into a threshold mechanism follow:

Let n be the number of participants and m the threshold.

The number of elements contained in the above Merkle tree is:

The space complexity of this tree is exponential on the threshold:

Searching, traversing, inserting or deleting an element in a Merkle tree has time complexity:

Implementing FROST, ICE-FROST and MuSig2 schemes

FROST

Implementation

The critical part of implementing this scheme in RSK is concerning the PowHSM, which is why we will need to focus on it.

The FROST first round of signing is about generating what we call the random R key par. For that to happen, each of the pegnatories has to generate one random key pair (r, R) for each of the UTXOs in the transaction (because what they sign is each of the UTXOs). RSK’s new version (HOP 4.0.0) includes “peg-out batching” (RSKIP-271) which helps reduce some of this effort by grouping multiple peg-out requests into a single transaction.

The random R is shared between the rest of the pegnatories by publishing it to the Bridge. Once all the required actors publish their proper R to the bridge, the signing process or the second round begins.

In the second round, every pegnatory will receive the addition of all R’s, named “R+”. Its PowHSM will take this R+ related to the specific UTXO, recover the (r, R) generated in the previous step, and sign the UTXO. Finally, the pegnatory will be in charge of publishing that signature to the Bridge, which will ultimately build the final signature after validating them.

Failure Flow

During this process, it could happen that one of the actors (pegnatories/PowHSM) doesn’t generate proper output. As a consequence, the whole process is interrupted. The malicious actor could be easily identified and probably banned (i.e.: 1000 blocks penalty, where it cannot process any transaction).

Setup

The initial setup of this schema is interactive, which means that the participants have to exchange data through the Bridge. It is important to remark that this should be on chain interaction, since the pegnatories are not connected between each other, and the only way to interact is through the Bridge.

The participants generate a shared secret whose input is the number “m” of the threshold (m of n). Once all the participants have shared their secrets, the Bridge can generate the final PK, which will be part of the bitcoin script (from which the taproot address of the threshold is derived).

Feasibility

We concluded that while it is feasible to implement the whole scheme, there are a few concerns to be cleared if we desire to implement a POC:

The random number has to be a secure random number (not possible to generate the same twice)
Once we know what data structure PowHSM needs to validate, we have to check the feasibility of implementing new types of parsers.

ICE — FROST

This schema variation improves the robustness of the previously explained schema. In terms of implementation, it is slightly different but not significantly different. It means that if we can prove that it is feasible (by some PoC) to implement the previous schema, it is most probable that this other schema is also doable.

One of the variations resides after the first round. There is an extra step that checks the validity of R+. If a participant did not generate a valid R, that part of the process is automatically cancelled and restarted from scratch.

The second and primary difference is during the second round. If a participant tries to cheat or generates a corrupted signature, the whole process doesn’t necessarily have to restart entirely but just from the second round.

That makes the entire process more robust without critical changes needed. It adds some new checks in the Bridge and a way to restart the second round within the Bridge, which we consider feasible and not so complex to implement.

MuSig2

Implementation

Musig2 schema is much easier to implement than FROST but less effective and less performant than the latter.

For PowHSM it becomes much more straightforward. It doesn’t need any random generation or extra data structure to relate key pairs with UTXOs. It only receives the UTXOs and the revealing path (from the MAST path correlated to the group that is going to sign) and returns the UTXO signed. Which makes it simpler and has fewer steps.

The complexity resides in the Merkle Tree with the combination of all possible threshold groups from all the pegnatories. That logic has to be implemented in the Bridge contract along with the group selection once the pegnatories express they want to sign.

Because of having to reveal the path to the group and the signature to spend each UTXO, the revealing data becomes heavier than FROST.

Using the same structure of rounds as we explain FROST, we can say that the first round of Musig2 is only for showing purpose from the pegnatories to be selected as one of the signers by the Bridge.

Once the group is selected (the Bridge could build it pseudo-randomly based on the PoW of the previous block), each of the pegnatories chosen takes the transaction and the path to the selected group inside the Merkle Tree of possible groups and passes it to their PowHSM. Each pegnatory’s PowHSM signs all the UTXOs and returns each signature to the pegnatory to be published again to the Bridge.

Finally, the Bridge builds the final transaction with each pegnatories’ signatures and leaves it ready to be published into the Bitcoin network.

Failure Flow

During this process, it could happen that one of the actors (pegnatories/PowHSM) doesn’t generate a valid output and, as a result, the Bridge could decide to abort the whole process.

Also, The malicious actor could be easily identified and probably banned (i.e.: 1000 blocks penalty, where it cannot process any transaction).

The need for a restart is due to how Taproot script path signing is defined. The signature commits to the script selected and the MAST proof. It means that the signature only works for the selected group. If we need to change the group, we must generate a new signature again.

As an alternative, a way of improving the process could be to anticipate a possible failure and select a backup group. Although it could reduce the risk, it would still make the process more complex and less performant in terms of computation, communication and size of the messages. And there is a big chance that the malicious actor is part of both groups.

Setup

The initial setup for building the Merkle tree does not need to be interactive. Each participant publishes their PK. Then the Bridge builds the tree and derives the taproot PK and pay-to-taproot address.

Building the tree means combining each of the participants in all possible ways of the threshold: