A Flash of Insights on Lightning Network
(Cross-post from my blog)
I invented the Lightning Network.
Well, not really. But to the best of my knowledge, I was the first person to write a post describing something similar to how LN is conceived today.
I’m not saying this to take credit for its advent; it’s not like I really did anything. Others invented the concept of micropayment channels on which my suggestion relied, and I only threw out some rough ideas; I didn’t present a full-fledged design, let alone write any code.
I’m saying this to emphasize the point that for some of us, that has always been the vision. One of the first burning questions I had when I was first introduced to Bitcoin was how the whole thing was supposed to scale. Concepts I’ve learned of since then — like SPV and pruning — helped, but I wasn’t completely satisfied. Ever since I heard about channels and thought about how they could be used in a network, that would become one of the first things I would reference whenever discussions of scalability came up.
The ability to use an on-chain transaction to anchor a channel, so that real bitcoins can be sent over it without having to bother the entire network or wait for confirmations for every payment, and in such a way that the channel can always be closed unilaterally to recover the funds as normal bitcoins sitting in an address you control, is an idea so powerful that I can’t imagine how anyone can resist falling in love with it.
But the reason I am so excited about the development of LN is not that the vision I had is finally being brought to life.
It is because LN is so much better than what I envisioned back then. Based on the protocol at the time and my understanding of it, I had in mind one-way channels with an expiration date; such channels would have to be constantly closed and reopened, as either the expiration date approached or the finite capacity was used up. If you wanted longer-term channels to reduce the number of on-chain transactions, you would have to lock up lots of funds in advance, and recovery of the funds would be greatly delayed in case of a problem with the counterparty.
But the channels used in LN have two key features:
1. They have no set expiration date. Instead they use the CheckSequenceVerify primitive to create a relative grace period, for security purposes. But as long as the two parties cooperate, the channel can exist indefinitely.
2. They are bi-directional. Payments can be sent both ways, so that every payment going one way replenishes the ability to send payments the other way. This means that even with a finite amount of funds locked in the channel, an unbounded volume of payments can go through.
Together, these mean that in theory, a user can create a fixed number of channels that will last her a lifetime. She can receive her salary or payments through these channels, and send out payments back again through them, ad infinitum, without ever needing to close channels or create new ones.
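To see why a finite channel can carry unbounded volume, here is a toy sketch that treats a two-party channel as nothing more than a split of a fixed capacity. The `Channel` class, the `simulate` helper and the satoshi amounts are illustrative assumptions, not part of any LN implementation; real channels also involve commitment transactions and timelocks, which are ignored here.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Toy two-party channel: only the split of a fixed capacity changes."""
    alice: int  # satoshis currently on Alice's side
    bob: int    # satoshis currently on Bob's side

    def pay(self, sender: str, amount: int) -> None:
        """Move `amount` from the sender's side to the other side."""
        if sender == "alice":
            if amount > self.alice:
                raise ValueError("insufficient outgoing capacity")
            self.alice -= amount
            self.bob += amount
        elif sender == "bob":
            if amount > self.bob:
                raise ValueError("insufficient outgoing capacity")
            self.bob -= amount
            self.alice += amount
        else:
            raise ValueError("unknown party")


def simulate(rounds: int, amount: int = 80_000):
    """Alternate spending and receiving; return (total volume, final channel)."""
    ch = Channel(alice=100_000, bob=0)  # hypothetical 100,000-satoshi channel
    volume = 0
    for _ in range(rounds):
        ch.pay("alice", amount)  # Alice spends
        ch.pay("bob", amount)    # Alice gets paid, replenishing her side
        volume += 2 * amount
    return volume, ch
```

Calling `simulate(1000)` moves 160 million satoshis through a channel that never holds more than 100,000, and the channel ends in the same state it started in.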
If you read my 2012 post, you surely have noticed that what I described was basically a hub-based network, rather than a p2p mesh network. This is another ability that the new features bring to the table.
In the old model, a lot of resources would go into maintaining channels, so you had to make sure each channel counts. Opening a channel to some random user would just not do. You needed channels with large, well-connected hubs you can rely on to be able to route your payments.
But with perpetual bi-directional payments, a p2p network becomes a realistic possibility — every link improves the connectivity of the network and opens up new route possibilities in either direction.
Will the Lightning Network Save Us?
The Lightning Network is extremely powerful, and holds the promise of cheaper and faster payments than any other trustless method, scalable without bound, and potentially encompassing the entire range of payments: $10K for a car, $5 for a coffee, or $0.01 for an online news article alike.
I believe many detractors of LN just don’t understand it well enough, and present criticism based on misconceptions. The goal of this post is to clear up some of the confusion, and to establish a conceptual framework to discuss the LN and argue about its merits.
This doesn’t mean there aren’t real challenges and issues relating to it.
For starters, it is much more complicated than the raw Bitcoin protocol. Complicated things are harder to use and harder to implement. There’s more room for error in implementations and more that can go wrong.
A defining characteristic of LN is that for receiving payments, a node has to be online. This is a marked usability difference, and poses new security challenges.
These security challenges can be mitigated with additional mechanisms… which again add complexity.
Finding payment routes can be challenging even when they exist, and the privacy model is different from normal Bitcoin transactions.
Though I will argue in this post for how LN can work, nobody can guarantee that it actually will work as hoped.
Perhaps by the time LN gains traction, it will no longer be needed. Hardware advances might one day make it feasible to have all payments on-chain; and progress in cryptographic primitives, especially in the area of Zero Knowledge Proofs, might allow new designs that solve the issue of scalability in a more elegant way.
But, in case it wasn’t clear — despite all of the above, I do believe LN is a powerful tool we should have in our arsenal — on its own, or to complement other solutions.
What do we want?
I think everyone will agree that the defining characteristic of Bitcoin that we want to preserve in our discussions of its future direction is decentralization.
The problem is that there are many ways in which Bitcoin can become centralized. There can be centralization in network nodes, centralization in mining hardware manufacturers, centralization in mining operators, centralization in development & protocol change decisions, centralization in exchanges, centralization in payment processors, and much more… And with a Lightning Network in place, there can be centralization in LN hubs.
When people argue, it is usually because they have different ideas on which of these should be emphasized. Fully analyzing all the ways Bitcoin can go wrong is beyond the scope of this post.
I do however wish to stress that there are many tradeoffs, and often improving one aspect of decentralization harms another. We must strive for an optimal middle ground where all features are adequately satisfied.
By way of example, with plain on-chain scaling, if we set the limit on block size too low, transactions will be expensive and users will have to rely on custodial services and payment processors — a form of centralization. If the limit is too high, running a fully-validating network node can become prohibitively expensive, leading to few nodes and again, centralization. In between there is a sweet spot where neither txs nor nodes are too expensive.
It has been said that LN can create centralization, and in a strict sense that’s true. However, an often-missed point is that, when done right, this centralization can be weaker than what is caused by existing tradeoffs, while at the same time improving these tradeoffs. With LN, you can simultaneously have cheap payments, many network nodes and many LN nodes.
Specifically, let’s consider what it would take for an entire planet of 10 billion people to use Bitcoin as a global currency, with traditional on-chain scaling. Assuming each person does 1000 transactions per year, this translates to 10T tx/year. If each tx is 200 bytes, that would mean 2 petabytes just to store one year’s worth of txs (and even with pruning enabled, ideally more than one year would be stored). To this we must add the costs of bandwidth (about 200TB per month) and CPU.
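The arithmetic behind these figures can be checked with a few lines. This is only a back-of-the-envelope sketch using the numbers from the paragraph above; the function name `onchain_load` is made up for illustration, and the bandwidth figure counts each transaction being downloaded exactly once, so relaying to peers would push it higher.

```python
PEOPLE = 10 * 10**9         # 10 billion users
TX_PER_PERSON_YEAR = 1000   # on-chain transactions per person per year
TX_BYTES = 200              # assumed size of one transaction


def onchain_load(people=PEOPLE, tx_per_person=TX_PER_PERSON_YEAR, tx_bytes=TX_BYTES):
    """Yearly transaction count, storage and monthly download for pure on-chain scaling."""
    tx_per_year = people * tx_per_person
    bytes_per_year = tx_per_year * tx_bytes
    return {
        "tx_per_year": tx_per_year,
        "petabytes_per_year": bytes_per_year / 10**15,
        "terabytes_per_month": bytes_per_year / 12 / 10**12,
    }
```

This gives 10T transactions and 2 PB per year, and roughly 167 TB of raw transaction data per month before any protocol overhead.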
This suggests that there will not be many nodes; and with few easily detectable nodes, the network is easier to shut down by hackers, governments and so on.
Enter LN: in ideal conditions, the LN might work well with as few as 1 tx per person per year. But to be on the safe side we’ll assume 10 txs per person per year (100B tx/year). This cuts the cost of running a node to a hundredth, ensuring anyone who wants to run a node can do so.
It does mean of course there will be 400MB of tx data per 10 minutes. To this figure we may want to add large transactions which will be done on-chain rather than by utilizing LN; if such large txs average $10K, 10B of those per year will be sufficient for $100T GDP, and increase the “block size” to 440 MB; still two orders of magnitude better than pure on-chain.
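As a rough check of the block-size figures, the sketch below spreads a yearly transaction count over one block per 10 minutes. The 200-byte transaction size is carried over from the on-chain estimate; the names are illustrative. The exact results come out slightly below the rounded 400 MB and 440 MB used in the text.

```python
BLOCKS_PER_YEAR = 6 * 24 * 365  # one block every 10 minutes


def block_megabytes(tx_per_year, tx_bytes=200):
    """Average data per block, in MB, for a given yearly transaction count."""
    return tx_per_year * tx_bytes / BLOCKS_PER_YEAR / 10**6


channel_txs = 10 * 10**9 * 10   # 10 txs per person per year for 10B people
large_txs = 10 * 10**9          # large on-chain payments per year
large_tx_value = 10_000         # average value of a large payment, in USD

lightning_only = block_megabytes(channel_txs)           # about 380 MB
with_large = block_megabytes(channel_txs + large_txs)   # about 419 MB
yearly_value = large_txs * large_tx_value               # $100T
```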
The current Bitcoin protocol only supports 4MWU per block, which translates to about 3MB. So it is clear that the block size will have to be increased; the only questions are when, how and how much.
My ballpark, possibly-off-by-an-order-of-magnitude estimate for the eventual cost of an on-chain transaction is $1. Too expensive to want to buy coffee on-chain; but cheap enough that opening channels isn’t too much of a big deal, and that people can still choose to pay on-chain for larger purchases.
The above estimates for transaction volume assume that payments are comparable to contemporary payment profiles. But once LN is enabled, it shines in the world of true, sub-cent micropayments; these can be made with virtually no cost or scalability limits, enabling entirely new business models no on-chain arrangement can compete with. The total global payment volume can then grow well above the 10T/year estimate, at no additional cost.
The Hub Model
There are two primary modes in which LN can work: hub-based or p2p. Both are viable. I will discuss the hub model first, for various reasons:
· It is easier to conceptualize.
· It is closer to the original ideas I had all those years ago.
· A common claim is that a hub model would just be a repeat of the traditional banking system; I want to argue this is far from the truth.
The basic idea is as follows: Throughout the world there will be some number M (say, M=300,000) of hubs that offer LN payment routing as a service. Each of them will have an open channel with all other hubs (which shouldn’t be prohibitively costly for someone running them as a business).
Meanwhile, every end user will have open channels with K such hubs, say K=10. Each such channel will strive to have balanced capacity — where there is as much capacity to send from the user to the hub as the other way around.
When Alice wants to pay Bob, the payment will be routed through 3 hops: from A to a hub X which A is connected to; from X to a hub Y which B is connected to; and from Y to B. Such a 3-hop payment can be done very quickly and cheaply. Since A is connected to 10 hubs and so is B, it will be easy to find a path even if some hubs are temporarily down.
Each of A’s channels has finite capacity, and it is a common misconception that once a channel is depleted, she will have to create a new one by paying for an on-chain tx.
However, the channels are bi-directional. A can’t create money out of thin air (this is Bitcoin we’re talking about, not government money), so she doesn’t only pay people, she also receives payment — and she does this through her channels as well. Every payment she receives turns back the clock on these channels, and allows them to be used again.
In a typical scenario of an employee who receives a salary every month and spends it throughout the month, it will look as follows: At the start of the month, Alice will have almost no outgoing capacity in her channels; all funds locked in the channels will belong to the hubs she is connected to (which is equivalent to saying she starts the month with barely any money to spend). Then she receives a salary; this is routed through her different channels, replenishing them all. And now she has outgoing capacity to pay for stuff throughout the month — the same amount she received as a salary.
Of course, if she doesn’t want to live “hand to mouth” so to speak, and have some leeway in monthly spending that does not exactly match the monthly revenue, she can simply have channels with total capacity higher than her salary, to act as a buffer.
Realistically, Alice will want to save up some of her salary (say, 10%) on a timescale of years. If her savings are in other assets, such as gold or stocks, which she pays for through the channels, this is no different from paying for anything else. But if she saves up bitcoins, then even if she has extra capacity in the channels, over time they will fill up. If she goes through every month with 10 mBTC going in but only 9 mBTC going out, then the balance of the channels will be off, with more funds credited to her side of the channel.
Eventually, she will start a month without any incoming capacity in her channels, and she cannot receive her salary through them. At that point, she must create a tx to collect the funds in a non-LN address and open up a new channel with incoming capacity.
When she does close a channel for long-term storage, she can make sure to make the most out of it; if there is a non-zero, but insufficient, incoming capacity on it, she can route a payment to herself in a triangle, so that another channel is replenished and the entire capacity of the closed channel is collected.
With K=10 channels at any time, 10% savings, and total channel capacity equal to her monthly salary, about 1 channel will have to be reopened per month. This can be reduced in direct proportion to the amount of funds locked.
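The reopening rate follows from dividing monthly savings by the capacity of a single channel. A minimal sketch, assuming capacity is split evenly across channels and that a channel is closed only once it is completely full on Alice’s side (the function name and the mBTC figures are illustrative):

```python
def reopens_per_month(salary, savings_rate, total_capacity, channels):
    """Average number of channels that fill up (and must be reopened) per month."""
    per_channel = total_capacity / channels  # capacity of one channel
    saved = salary * savings_rate            # funds that pile up on Alice's side
    return saved / per_channel


# Salary of 10 mBTC, 10% savings, 10 channels with total capacity of one salary.
baseline = reopens_per_month(salary=10, savings_rate=0.1, total_capacity=10, channels=10)

# Locking five times as much in the same 10 channels cuts the rate fivefold.
bigger = reopens_per_month(salary=10, savings_rate=0.1, total_capacity=50, channels=10)
```

The baseline works out to about one reopening per month, and the larger setup to about one every five months, matching the claim that the rate falls in direct proportion to the funds locked.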
Going back to the overall network structure, the balance of channels between hubs also needs to be considered. If, for example, A continually sends payments to B routed through hubs X and Y; and many other users also route through the link X->Y, it can happen that the channel will be saturated, with no more capacity from X to Y.
This is solved with fee adjustment and market forces in route selection. As long as the channel between X and Y is balanced, X and Y will charge normal service fees for routing. As the direction X->Y is saturated, they will charge higher fees for routing this way, encouraging users to choose other paths over it. Additionally, they will charge lower fees for routing in the other direction, encouraging users to route this way and replenish the channel. At extreme saturation levels, they can go as far as charging negative fees for the reverse direction, effectively rewarding users for replenishing the channel. When such a strategy is combined with users who wish to minimize fees for their payments, and who have many routing options due to good network connectivity, the invisible hand plays its part and ensures the network is balanced and functional.
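One way to picture such a fee policy is a rate that depends linearly on how depleted a direction is. The linear rule, the function name `routing_fee_ppm` and the 100 and 400 ppm constants below are all assumptions made up for illustration; actual hubs are free to pick any pricing rule they like.

```python
def routing_fee_ppm(outgoing, capacity, base_ppm=100, slope_ppm=400):
    """Fee rate, in parts per million, for forwarding in one direction.

    outgoing: funds currently available to send in that direction
    capacity: total capacity of the channel
    """
    if capacity <= 0 or not 0 <= outgoing <= capacity:
        raise ValueError("outgoing must lie between 0 and capacity")
    # -1 when this direction has all the funds, 0 when balanced, +1 when depleted.
    imbalance = 1 - 2 * outgoing / capacity
    return base_ppm + slope_ppm * imbalance


balanced = routing_fee_ppm(outgoing=50, capacity=100)   # normal service fee
depleted = routing_fee_ppm(outgoing=0, capacity=100)    # X->Y is saturated
reverse = routing_fee_ppm(outgoing=100, capacity=100)   # Y->X replenishes it
```

With these made-up numbers a balanced channel charges 100 ppm, the saturated direction charges 500 ppm, and the reverse direction charges -300 ppm, that is, it pays users to route through it.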
I have chosen the number M=300K so that, together with the parameters N=10B users and K=10 channels per user, the total number of user-hub channels (NK = 100B) will be roughly equal to the number of hub-hub channel ends (about M² = 90B, or 45B distinct channels), which balances the overall costs.
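The channel counts for these parameters are easy to tabulate. In this sketch (variable names are illustrative) the hub side is counted two ways, since every hub-hub channel has a hub at both ends; the match with the user side is only approximate either way.

```python
N = 10 * 10**9  # users
K = 10          # channels per user
M = 300_000     # hubs, each connected to every other hub

user_hub_channels = N * K                 # one channel per user per hub she uses
hub_channel_ends = M * (M - 1)            # channels as seen from each hub's side
hub_hub_channels = hub_channel_ends // 2  # distinct channels between hubs
```

This gives 100B user-hub channels against about 90B hub-side channel ends, or about 45B distinct hub-hub channels.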
However, if we consider the number of hubs to be too low, or the cost to set one up too high, we can switch over to discussing a 4-hop model: we can have 4 million hubs, each connected to 4000 others. Properly arranged, an intermediary can always be found between any two hubs.
Thus, whenever Alice wants to pay Bob, A chooses a hub X she is connected to, B chooses a hub Z he is connected to, and an intermediary Y between X and Z is found. Then the payment is routed A -> X -> Y -> Z -> B.
Such a 4-hop path can be slightly more time-consuming and costly than a 3-hop one; but this network topology allows a much larger number of hubs, and a smaller cost of starting one.
Note that even though we assume only one intermediary between any two given nodes, there is still plenty of flexibility in choosing a path in case of any issue, due to the choice in X and Z.
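The path selection just described can be sketched as a search over the sender’s hubs, the receiver’s hubs, and their common neighbours. This is a deliberately simplified illustration: `find_route`, the `hub_links` adjacency map and the `down` set are hypothetical names, and the sketch ignores capacity, fees and the case where both users share a hub.

```python
def find_route(sender_hubs, receiver_hubs, hub_links, down=frozenset()):
    """Return (X, Y, Z) for a path sender -> X -> Y -> Z -> receiver, or None.

    hub_links maps each hub to the set of hubs it has a channel with;
    down is the set of hubs that are currently unreachable.
    """
    for x in sender_hubs:
        if x in down:
            continue
        for z in receiver_hubs:
            if z in down:
                continue
            # An intermediary is any live hub with a channel to both X and Z.
            candidates = (hub_links[x] & hub_links[z]) - down
            for y in sorted(candidates):
                return x, y, z
    return None


# Tiny made-up network: Alice uses hubs X1 and X2, Bob uses hub Z1.
hub_links = {
    "X1": {"Y1"},
    "X2": {"Y2"},
    "Y1": {"X1", "Z1"},
    "Y2": {"X2", "Z1"},
    "Z1": {"Y1", "Y2"},
}
```

Here `find_route(["X1", "X2"], ["Z1"], hub_links)` picks the path through X1 and Y1; if Y1 is down, the same call with `down={"Y1"}` falls back to X2 and Y2, which is the flexibility that comes from the choice of X and Z.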
It should go without saying that the assumption that all nodes are connected (directly in the 3-hop model, through an intermediary in the 4-hop model) isn’t strictly necessary. Even if some links are missing, you can find alternative paths. That is the strength of a robust, well-connected network.
Which leads us to the infamous question…
Are LN hubs the same as banks?
I hope by now it’s clear the answer is — no, not at all.
The most important difference in my view is that, unlike banks, LN hubs don’t hold your money. Your money is stored in a channel, anchored to the Bitcoin blockchain, and only you can authorize its movement. If the channel counterparty refuses to cooperate, whether due to malice or incompetence, you can unilaterally close the channel and receive the money back as normal bitcoins.
This ability needn’t be exercised to be useful. Simply knowing that you can easily quit encourages the counterparty to behave and offer a good service. And it alone means that even in the extreme case of a completely centralized lightning “network” with a single hub connected to everyone, this is a marked improvement over traditional banking.
Furthermore, running an LN hub is much easier and cheaper than running a bank. As we have seen, we can have millions of hubs that cost a few $K each to set up.
This means that the LN doesn’t suffer from the two problems that are the bane of competitiveness: barriers to entry and vendor lock-in. Together, this means that fees for using the LN can be minimal, approaching the very modest cost of the resources consumed.
There is also the issue of privacy. Lightning hubs you are connected to will know more about your financial activity than a generic node in the traditional setting. However, they still know very little; they can learn the public key of the receiving client, but not their real-world identity or reason for payment. Also, other nodes will know much less than in the traditional model, as you do not publicly broadcast every payment. This is a different privacy model than standard Bitcoin, but since it limits the number of parties having any information about your payments, it is arguably a better one.
As for censorship, this is more censorship-resistant than on-chain scaling. With a large number of LN hubs and network nodes (as opposed to the small number of nodes with on-chain scaling), it is difficult to coax them into censoring transactions.
So how much will it cost?
The end-user cost for paying with LN will reflect the costs of the hubs that run it. It is of course difficult to predict the exact costs of the system, but I will try to give some rough figures as a basis for discussion. The total cost of all hubs, in the 4-hop model, can be broken down as follows:
1. Running nodes: A fully-validating Bitcoin network node isn’t strictly needed for running an LN hub, and running a node shouldn’t be very expensive anyway (the whole point of LN was to reduce this cost), so we will ignore it.
2. Processing payments and routing: Since each LN payment only involves the LN nodes on its path (as opposed to raw Bitcoin transactions, which involve all nodes on the network), processing each payment requires only a small number of machines sending data and signing, hence the cost is negligible.
3. Setup and operation: Negligible, you basically just download, install and run the software.
4. Creating channels: Each hub creates 4000 channels to other hubs, and there’s no reason they shouldn’t last for, say, 10 years. There are 4M hubs, and we have assumed each channel open costs $1, so this is a total of ($1*4M*4K/10yr) = $1.6B per year.
5. Time value of funds locked in channels: This is hard to calculate. With proper fee adjustment, I believe the LN can work well with each inter-hub channel having capacity equal to a tenth of an average person’s monthly salary. This would mean that Alice can pay Bob a total of 5 months’ salary before saturating all routes between them (if she wants more, she can always pay on-chain). Note that this would not saturate the intermediate channels between Alice and anyone else, though it might saturate Alice’s own channels. If the average monthly salary is $1000, this means a total of ($1000*4M*4K/10) = $1.6T locked. Since Bitcoin is non-inflationary, the time-value of money is low; we will use a generous estimate of 5%/yr for the time value and cost of security. This gives a cost of $80B per year, much more significant than setting up channels.
Together, this suggests a cost of less than $100B per year for all hubs. At 10T payments per year, this means that the average cost per payment will be about 1 cent.
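The two non-negligible items in the breakdown can be recomputed as follows. This is only a restatement of the figures above in code, with illustrative names; like the products in the text, it counts 4M × 4K channel slots, that is, one per hub per link.

```python
HUBS = 4_000_000
CHANNELS_PER_HUB = 4_000
CHANNEL_OPEN_COST = 1.0          # USD per on-chain channel open
CHANNEL_LIFETIME_YEARS = 10
MONTHLY_SALARY = 1_000.0         # USD; each channel holds a tenth of this
TIME_VALUE_RATE = 0.05           # yearly cost of locked funds
PAYMENTS_PER_YEAR = 10 * 10**12  # 10T LN payments per year


def hub_costs():
    """Return (opening cost/yr, funds locked, time-value cost/yr, cost per payment) in USD."""
    channel_slots = HUBS * CHANNELS_PER_HUB
    opening = CHANNEL_OPEN_COST * channel_slots / CHANNEL_LIFETIME_YEARS
    locked = MONTHLY_SALARY / 10 * channel_slots
    time_value = locked * TIME_VALUE_RATE
    per_payment = (opening + time_value) / PAYMENTS_PER_YEAR
    return opening, locked, time_value, per_payment
```

The result is $1.6B per year for opening channels, $1.6T locked, $80B per year in time value, and a little over 0.8 cents per payment, which the text rounds up to 1 cent.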
Note that this is average, not minimal. The main costs for running hubs are opening channels (which are depleted in proportion to the amount of funds paid) and locking funds. Accordingly, the fees they will charge are proportional to the amount sent. This means that the average fee of all payments, small and large, will be 1 cent, but for small payments the fee will be much lower, with a minimum of perhaps a hundredth or a thousandth of a cent. This allows true micropayments, as well as payments in poor countries for which a 1 cent tx fee might be prohibitive, but the typical payment is accordingly small and cheap.
I did not include in the above calculation the costs borne by users to set up their own channels, but these are comparable.
This is all just about how cheap LN payments are. It’s worth pointing out again how fast they are. Unlike on-chain txs which require confirmations in a block before they are considered secure, LN relies on already-confirmed channel-opening txs on the blockchain; and thus each payment is completely confirmed and secured nearly instantly. This, perhaps, is a bigger deal (and a harder feature to provide with other means) than simply scalability and tx cost.
The p2p model
I have written at length about how the LN can work in a hub model while still remaining decentralized. But perhaps a more interesting vision is a true p2p mesh network.
I have discussed 3-hop and 4-hop hub models. If we take it to the extreme, we can examine what it would take to have any two users connected by up to T hops. If we engineer the connections correctly, we can minimize the number of channels required for each user. Finding exact bounds is an open research problem in graph theory, but for T=10 hops, each of the N=10B users needs to be connected to K=15 others.
Of course, such a carefully engineered network might not be considered truly peer to peer. So we may ask about a completely random network, with a total of NK channels which need not even be evenly distributed between users. It turns out that if K is greater than half the natural logarithm of N (in our case, K should be at least 12), then almost surely the graph will be connected.
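The threshold quoted above is a one-liner to evaluate. A small sketch, where `min_channels_per_user` is a made-up helper that returns the smallest whole K strictly greater than half the natural logarithm of N:

```python
import math


def min_channels_per_user(users):
    """Smallest integer K with K > ln(users) / 2."""
    return math.floor(math.log(users) / 2) + 1


N = 10 * 10**9
half_log = math.log(N) / 2          # about 11.5
k_min = min_channels_per_user(N)    # 12
```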
This result says nothing about the number of hops required. At the threshold K=12 this can be quite large. However, by increasing K we can decrease T. Most research on random graphs gives asymptotic limits, and numerical simulations for a 10B-node network are difficult; but by extrapolating from scaled down versions, I estimate that with K=20, any pair of users will be connected with 8 hops on average — and with many possible paths. Increasing T leads to exponentially many more paths. By allowing, for example, 12 hops, there can be millions of alternative paths between every pair of users.
And this is all assuming a completely random network. In practice, it will not be really random — each user has more economic activity with some peers than others (e.g. peers in the same geography, industry etc.), and will focus on activity with such favorites. The economy in such sub-networks can be done with fewer hops. Also, there is no need to fear the occasional distant peer — if a user needs to send payment to someone on the other side of the network, he can simply spend an on-chain tx to open up a new channel, and have easier access to this remote part of the network for future payments.
Since the network will not be random, but rather built organically based on actual needs, it will have much better performance numbers than predicted by the random graph model.
So users are connected by paths. But is there capacity along these paths in the required direction?
The analysis of the hub model applies here as well. Users will strive to have their channels balanced, and will adjust fees to modulate this. This means that whenever a payment is routed, then among the many possible paths, one will be chosen that replenishes as many of the channels as possible.
On the individual level, a user is expected to receive as much payment value as he sends, so his channels are expected to be balanced. Any large movement of funds, or correcting any imbalance that can result from savings, can be done with an occasional on-chain tx.
So which one is it? Will we have hubs, or a p2p network?
Well, why don’t we have both?
There will be hubs, and users will connect to them. And users will also connect to their peers — friends and business partners. It’s probable that most payment volume will use short routes through hubs. But the interconnectedness of users and their peers will make sure the network stays robust even if something should happen to these hubs.
Finding the best routes in this jungle is up to the client software, and this is not necessarily easy — but if routing on the internet works, I see no reason the LN shouldn’t work as well.
I have focused on analyzing a future scenario where everyone uses LN. Of course, this can be decades away. It’s not clear how exactly we will get to this state, but it will surely involve gradual, steady, organic growth. Starting with early adopters connecting to their friends and service providers, and creating new channels whenever the need arises, the network will be built up and offer more and more value.
Like a flash of lightning, the future is bright.