What The Heck Is Schnorr
* History of Schnorr
* ECDSA Deep Dive
* Schnorr in Detail
* Benefits of Schnorr
“Bitcoin is nothing more than a chain of digital signatures”
- Citadelian Proverb
The Bitcoin system, or in general any cryptographic currency has to revolve around the concept of digital signatures. The Blockchain ledger encodes the ownership of some kind of digital assets. Unlike the traditional notion of ownership, where it is enforced through legal documents and physical signatures, in digital currency, ownership is proved by a chunk of digital data. The mathematical magic of cryptography is used to create these digital “proof of ownership” mechanisms, aka Digital Signatures.
There are many varieties of Digital Signature algorithms available in the literature, with their advantages and disadvantages. Bitcoin uses the algorithm known as ECDSA (Elliptic Curve Digital Signature Algorithm) which was chosen by Satoshi Nakamoto while designing. Currently, the biggest proposed protocol update being discussed in the community is named Taproot, which is a collection of many upgrades. Among them, Schnorr is a new digital signature algorithm that is being proposed. It has many added advantages that ECDSA lacks. Here in this piece we will explore the idea of digital signatures in-depth, look inside both ECDSA and Schnorr, and judge on our own the special merit of adopting Schnorr signatures.
Back in the day when Satoshi was coding up Bitcoin from scratch, one big component of the design was the choice of Digital Signature. Satoshi probably chooses ECDSA because of existing standardization. Moreover, ECDSA was already natively supported by OpenSSL, a crypto library on which Bitcoin codebase was heavily dependent. Also, ECDSA is more compact in size than other methods like RSA, which is a very important consideration when designing blockchain systems. But Schnorr signatures are much more elegant and simple, and it has one more magical property; linearity. Because of this property, more cool stuff can be done with schnorr signatures, than ECDSA cant.
Schnorr signature was invented by Claus-Peter Schnorr back in the 1980s. His digital signature implementation was much simpler than contemporary other algorithms. But he did the worst thing that can ever happen to any kind of knowledge. He patented it before he published the paper. Because of his patent, the Schnorr signature algorithm did not see any widespread use for decades. In fact, ECDSA’s predecessor, DSA was created specifically to bypass the Schnorr patent.
In 2008, nearly two decades after the publication, the patent expired. But because of this horrible law of intellectual property, the algorithm was never standardized or widely used. Thus in the same year, when Satoshi was designing Bitcoin, he had to opt for a more standardized and broadly implemented open source signature algorithm, thus ECDSA.
Six more years later, in 2014 the first talk of implementing Schnorr signature on Bitcoin protocol came up in the bitcoin-talk forum. Four more years later Blockstream engineer Pieter Wuille made the first draft proposal for how a schnorr signature scheme in Bitcoin might look like in 2018. In Jan 2020 it finally got the BIP number as BIP340, which describes the full technical definition of Schnorr Signature in Bitcoin. Note that schnorr is still not standardized, and the specific version of the algorithm proposed in BIP340 is tailor-made for Bitcoin’s requirement. Reviews and changes are still going on as the proposal is still pretty much work in progress. Schnorr signature along with Taproot and Tapscript is proposed as a combined upgrade package for the SegWit Version 1 Bitcoin protocol (if you are new to the SegWit versioning system, check out my SegWit article).
Here we will attempt to look under the hood of Schnorr signatures to better understand its promised benefits and to start with, we first delve into the current signature algorithm of Bitcoin, ECDSA.
ECDSA Deep Dive
The purpose of any signature is to prove the “authenticity of intent”. The intent can be a transfer of money, transfer of property ownership, attestation of a statement or any such variety. In the real-world, signatures are usually a unique inscription of individuals that adequately identify their personal intent for a certain action. We try to make our signatures hard to copy, and it usually takes some practice to make a usable physical signature.
In the digital world, digital signatures are different. They are mathematical proof of “knowledge of a number”. A digital signature aims to prove completely to a verifier that whoever has created this signature, has possession of a particular number. As you can guess, this number has to be unique, and anyone who finds out or guess the number can create the corresponding digital signatures. This number, in canonical terms of cryptography, is known as a “Private Key”. Just like in the physical world, where your hand gesture produces a unique signature, in the digital world, the Private Key is used to generate a digital signature that proves the owner of the key has approved the intent. This Private Key has to be kept secret, and it’s owner’s responsibility to keep it hidden and secure. In Bitcoin, for all practical purposes, you should think of your private keys as your bitcoins. Not your key, not your coins.
The idea of digital ownership (among many other things) has started with the concept of Private-Public key cryptography, proposed by Whitfield Diffie and Martin Hellman, in 1976 in this groundbreaking paper. The concept has remained the same, only the algorithm to produce different kinds of magic like encryption-decryption, digital signature, etc has evolved over the years. Bitcoin uses the same concept and every bitcoiners should know about their Private-Public keypairs. Next, we delve in to understand these two items further, before we move into the ECDSA algorithm.
A private key is nothing more than a big random number. Big and random, because we want it to be unique, and we want it to be “hard to guess”. In Bitcoin, a private key is a 256bit integer generated randomly (computer-generated randomness, or toss a coin 256 times). Let’s say the generated random 256-bit private key is
Corresponding to each private key, a public key is computed by multiplying
k with a public elliptic curve point. There's a ton of resources on the internet to understand Elliptic Curves, but for our purpose think of them as a point in a 2D plane. Just like ordinary points, they have two coordinates
y. These coordinates are simple integers, and together they represent a point. Elliptic Curve points have the following important properties:
- if two curve points
Bis added together, it gives a new curve point
A + B = C
- if a curve point
Ais multiplied by the number
n, it gives a new curve point
n*A = B
Throughout the article, we will denote elliptic curve points as upper case letters, and regular numbers as lower case letters. So
A is a curve point and
a is a regular number.
Bitcoin uses the elliptic curve named secp256k1, which defines a publicly known curve point called a generator,
The above represent the actual
y coordinate of the generator point
G for the secp256k1 elliptic curve in hexadecimal representation. It's not important to know what exactly these coordinates are, but enough to know that they are huge numbers.
In Bitcoin, a public key is another curve point
P and is calculated by multiplying the private key
k with the generator point
G. (Curve point property #2)
P = k * G
So anyone, who knows the private key can always derive the corresponding public key. Public keys are used as “account numbers” (publicly known) and private keys are used as “account password” (privately held and kept secret).
Because of the property of Discrete Log Problem, it’s infeasible to compute the private key
k from a public key
P. Specifically divisions like
k = G/P for elliptic curves are computationally infeasible. Thus revealing the public key to the world does not reveal the corresponding private key. This was one crucial innovation of Private-Public key pair cryptography.
With this knowledge at hand, we can move forward to understand Digital Signature.
A digital signature is a mathematical proof of possession of a private key
k without revealing to the world the actual key itself. A signature consists of two numbers
(r,s). Together with the public key
P , anyone can verify that “the signature was created by someone who posses the private key
k corresponding to this public key
P". And the process of verification does not reveal the private key itself.
But just proving possession is not always enough, with a signature we generally also attest to a message/action (signature in a cheque denotes attestation of the action of transferring money). In the case of Bitcoin, this message/action is the transaction itself. So a digital signature also includes a message, which is the hash (double sha256) of the transaction, known as a SigHash. As the transaction is hashed, it reduces to a 256bit number, and we denote it as
the ECDSA algorithm to generate the digital signature is stated as below:
Private key: k (integer)
Public key: P = k * G (curve point)
Message: m (integer)Choose a random number: z (integer)
Calculate: R = z * G (curve point)
X-cordinate of R: r (integer)Calculate: s = (m + r*k)/z (integer)Signature: (r,s) (integer, integer)
the combination of these two numbers
(r, s) together denotes a digital signature. The algorithm stated above is known as ECDSA. Note that the private key
k is required for generating the signature.
In general, a signature function looks as below :
Sig(private key, message) = (r,s)
Thus by changing the private key or the message or both, the signature also changes.
Once the signature is generated, anyone can verify that it was created with possession of the private key
k, corresponding to the public key
P, for attestation of the message
m. The verification process follows as below:
Obtain the public key: P (curve point)
Obtain signature: (r,s) (integer, integer)
Obtain massage: m (integer)Calculate: u = m/s (integer)
Calculate: v = r/s (integer)
Calculate: N = u*G + v*P (curve point)Assert: X-cordinate of N is equal to r
A signature is valid if the above assertion returns
True, else it's invalid. From this simple verification computation, anyone can independently verify the authenticity of the signature and the message being signed. Note, the verifier does not need information regarding the private key in order to perform verification. Thus a successful verification of a signature corresponding to a public key and a message, proves the possession of the private key, without revealing the key itself. This is a very powerful and crucial building block of decentralized financing.
Proof of Correctness
It can be demonstrated easily why the verification algorithm will return
True for a valid signature. This is termed as proof of correctness, and for ECDSA it's pretty straight forward. All the terms have their meaning as per Signing and Verification algorithm above
In signature generation process we have
s = (m + r*k)/z
and signatre = (r,s)where: m = message, k = priavte key, z = random number
r = X-cordinate of R (R = z*G)In verfication process we have
u = m/s v = r/su*G + v*p
= (m/s)*G + (r/s)*P
= (m/s)*G + (r/s)*k*G [as P = k*G]
= ((m + r*K)/s) * G
= z * G
= RThus (u*G+v*P) will always result into R. This can be assertained by comparing the X-cordinate of the result with r value from signature.
This process is what runs inside every bitcoin node for every single transaction. The signature data is recorded in
ScriptSig structure, the message is the full
Tx structure itself, and the public key is encoded in
ScriptPubkey of the previous transaction output. If you are unfamiliar with these structures and their semantic meaning, check out the SegWit article where we discussed this in detail.
This concludes the ECDSA signing and verification algorithm. Even though the process is simple, there are few limitations of ECDSA like non-linearity, signature malleability, etc. These issues do not exist in the case of Schnorr signatures. Schnorr is inherently non-malleable and is linear, which opens up the door of a lot of cool new cryptographic tools in Bitcoin like MuSig, Adopter Signature, Cross-Input signature aggregation, etc. In the next section, we delve deep into the signing and verification algorithm of Schnorr signatures.
Like ECDSA, Schnorr also uses the same private-public key pairs. The only difference is in the signing and verification algorithm, which happens to be a lot simpler than ECDSA.
The signature creation algorithm is shown below:
Private key : k (integer)
Public key : P = k*G (curve point)
Message hash: m (integer)Generate a random number: z (integer)
Calculate: R = z*G (curve point)Calculate: s = z + Hash(r||P||m)*k (integer)where: r = X-coordinate of curve point R
and || denotes binary concatenationSignature = (r, s) (integer, integer)
Verification is also straight forward
obtain the signature: (r,s)
Obtain public key : P
Obtain message: mCalculate the random point R from rVerify: s*G = R + Hash(r||P||m)*P
For a valid signature
s*G will be equal to
R + hash(r||P||m)*P. The proof of correctness for this verification is almost trivial, and left as an exercise for the reader.
The only caveat in the process is
Calculate the random point R from r. As
r is only the x-coordinate of the curve point
R a special consideration is taken into the Schnorr signature proposal to unambiguously obtain any curve point, given only the x-coordinate.
For any elliptic curve point, for a given x-coordinate there exist two valid points on the elliptic curve. The difference between these two points is
- y-coordinates differ in parity (one is even, another is odd)
- y-coordinates differ in the quadratic residue (one has a square root, other does not)
BIP 340 defines a curve point as valid if it has y-coordinate with quadratic residue (square root exists for y). So for a given an x-coordinate, the validator can always unambiguously determine the correct y-coordinate, thus the right curve point. The complete rationale for using such type of implicit y-coordinate is documented in the BIP.
But apart from this small caveat, the overall representation of Schnorr signature is much simpler than ECDSA, and we will see next how simplicity is actually more powerful when it comes to the magical world of cryptography.
Unlike ECDSA, where the signature is calculated as
s = (m + r*k)/z, in schnorr its calculated as
s = z + Hash(r||P||m)*k. It's noticeable that for schnorr there is no division involved in the calculation. This small change make Schnorr signatures linear in nature, and of the generic algebraic form
y = a + b*x. This neat property allows us to add two Schnorr signature, which will produce another valid schnorr signature. In short, now we can do algebra with signatures. This was not possible with ECDSA and it opens a door to more cool cryptographic tricks.
One of the most obvious and easy one is Multi-signatures with Schnorr, which are indistinguishable from regular signatures, known as MuSig.
P = P1 + P2 is the addition of two public keys (remember curve points can be added together),
z = z1 + z2 is the addition of two random numbers chosen by users,
m is the message being signed, then
s = s1 + s2 is a valid signature for aggregate pubic key
P and the common message
Schnorr Multi Sig creation:s1 = z1 + Hashh(r||P||m)*k1s2 = z2 + hash(r||P||m)*k2s = s1 + s2
= (z1 + z2) + hash(r||P||m)*(k1 + k2)Schnorr Multi Sig verification:s*G = (z1 + z2)*G + hash(r||P||m)*(k1 + k2)*G
= z*G + hash(r||P||m)*(P1 + P2) [as P1 = k1*G, P2 = k2*G]
= z*G + hash(r||P||m)*P
Thus the sum of two individual signatures is also valid for the sum of the individual public keys. Harnessing this property, two parties, Alice and Bob can collaborate and add up their public keys to create one aggregate key. Coins from this public key can then be spent by an aggregate signature. Unlike traditional multi-signature which is a smart contract (P2SH), MuSigs are indistinguishable from plain old pay to public key (P2PK) transactions. Only thing that gets recorded onchain is one single public key, and one single signature. Which gives a huge fungibility boost for Bitcoin.
Taproot also uses this property of Schnorr signatures to “add” Tapscripts into regular P2PK types transactions. So with Taproot, there will be no way to tell whether a transaction is simple P2PK or Multisig or some other exotic smart contracts. They will all look the same.
In general this simple property of linearity, which allows us to add “stuff” into the signature is very powerful. Using this we can create Adapter Signatures, which can be used from Cross-Chain Atomic Swaps to lightning/Sidechain channel creation, scriptless scripts and many more.
They all deserve their own explainer blog posts.
We can summarize all the provided benefit of Schnorr signatures as below:
Because of such obvious benefits Schnorr signature is no doubt the most anticipated protocol upgrade in Bitcoin. A seemingly simpler version of doing digital signatures can add a lot of capability and privacy simultaneously in the Bitcoin machine. It took six years for the Bitcoin developer community to come up with a concrete proposal since the first theorization. The work is still very much in progress.
If you feel like contributing to the development work or just to explore these ideas further, check out this Taproot workshop. A few months back a community-wide Taproot review session was arranged, where people pitched in their feedbacks. You can also review the three BIPS (340, 341 and 342)and provide documentation and conceptual feedback.
Ultimately Bitcoin is developed by us, the users, explorers, and conspirators of this amazing and weird idea. So it’s really up to us how and when these protocol upgrades get activated in the network. And when it does, with all the might of Schnorr signature (along with many other crazy ideas), Bitcoin will never be the same anymore.
Even in this bleak reality of the rouge virus and impending economic catastrophe, the future outlook of Bitcoin, as always, remains extremely bullish.
Off to the moon!!