Fully Decentralized Lending & Blockchain as Data Strategy

Paul Golding
18 min read · Aug 15, 2018

Blockchain Going Mainstream, Almost, Not quite…

The story of blockchain continues to evolve rapidly. The recent Deloitte Global Blockchain Survey (2018), whilst confirming that 44% of US execs remain skeptical, highlights a significant swing towards investments beyond mere experimentation or proof-of-concept forays. Indeed, the survey confirms my own experiences that proof-of-concept projects are often a bad idea, or a bad habit (more later). The better approach is sustainable investment into what is clearly a foundational technology, and we are beginning to see such investments by mainstream incumbents.

Source: Deloitte Global Blockchain Survey (2018)

The pace of the unfolding story continues to stagger. As I keep telling anyone who will listen, the pace of innovation with blockchain is seemingly 10x what we saw with the last foundational wave, say mobile. I don’t have concrete data to back up that figure, but my own research into GitHub commits and patent filings suggests a high velocity. The point is that anyone waiting to see where it’s headed is probably going to leave it too late to reap competitive advantages. Indeed, the days of wait-and-see are over if we are to believe the new mantra of perpetual digital transformation that has emerged from the respectable IBM C-Suite annual study.

In my own efforts leading blockchain and AI R&D initiatives for an online lender, I have done work to establish a suite of protocols that could, in theory (and in practice, although there is work to be done) move online lending to a radically (fully) decentralized and automated model with near instantaneous loan approvals. And I don’t mean mere recording of loan transactions on-chain and subsequent securitization — I mean the whole shebang. Indeed, we are seeing such claims from the recently formed Figure, notwithstanding an absence of any technical details. Going further, I believe that the future is smart money whereby funds could move automatically from regions of low efficiency (of capital) to regions of high efficiency and the concept of “closing loans” becomes a distant memory.

Here I will elaborate upon some of the current challenges and opportunities, along with some comments about the wider environment for innovating with blockchain, mostly by thinking strategically about data rather than contemplating blockchain adoption in a vacuum. This is not going to be a technical description of lending protocols, but I will return to them later on.

There’s “Data” and then there’s “Data”…

Before I get into the specifics of online lending in order to elucidate some specifics about blockchain innovation, let’s explore the broader innovation context for blockchain. I am a technologist, with many patents, who still develops algorithms and code, but most of this article will be within the rubric of business and strategy. As someone who typically gets asked to head up “innovation labs” within existing orgs, I have found that a good chunk of my job is dealing with the cognitive and organizational challenges of applied innovation. Blockchain is no exception.

When I was interviewed recently by the innovation guru Stephen Shapiro for an Inc.com podcast, he asked me about how I might convince execs (those 44%) about the opportunities in blockchain and get beyond the hype or skepticism. To quote Sean Connery (via Andy Garcia) when commenting upon his laid-back attitude on the set of The Untouchables: “This ain’t my first barbecue, kid!”

My interview with innovation guru Stephen Shapiro

In other words, convincing execs about any transformational or foundational technology is nothing new and is no different for blockchain. The same rules apply, namely that you cannot assess new technologies using the old rules. This simple rule continues to fool all of us, and oddly for the same reasons many of us inadequately manage personal finances: namely an inconsistent view of opportunities based upon their current versus future timing (“hyperbolic time discounting”). I mention this because I will return to it as a key motivation for why smart money will eventually dominate financial systems.

Returning to conversations about blockchain with execs, the best way I have found to explore the subject of blockchain is to shift the locus of enquiry to data. Although we now live in a world where there are (disputed) job titles like data science, it seems that understanding data as a strategic asset (esp. within the rubric of the actual scientific method) remains elusive.

What do I mean by data as a strategic asset?

Well, let’s begin with the remark that it has been well established that even within companies that claim to be “data driven” the reality is that data is often treated as a by-product of core processes, as if the processes come first and the data second. This is back to front.

Indeed, anyone paying attention to AI would have noticed that the quasi-magical possibilities of AI are contingent on having large amounts of good quality data. Peter Norvig’s exposition about the unreasonable effectiveness of data should have really been about the unreasonable effectiveness of data at Google, where they have lots of it. There’s Google, then there’s the rest of us. (And blockchain could have the answer here — read on.)

The rest of us have what often amounts to insubstantial datasets that are brittle in the face of unexpected queries and poorly maintained, never mind labelled for subsequent AI purposes. This even happens (I would almost say always) in incumbents that think of themselves as fairly data-savvy. Thanks to the data challenges of AI, we’re beginning to understand that there is a huge gulf between companies that are data rich and those that are actually data-driven. The data challenges are so widespread that they are even causing data scientists to leave their jobs and seek solace elsewhere.

So much for Data Science

It reminds me of Martin Fowler’s lament about continuous integration wherein he would purportedly visit a company and challenge their claim to have already implemented CI by saying: “Okay, let’s push a release right now.” Needless to say, many could not.

The same is true of data. It’s easy to test by saying: “Okay, let’s query dataset X joined with (previously unjoined) dataset Y for the last 12 months” and watch the DB-admin’s face turn pale (or red) as he or she thinks about how to build the pipeline, or simply declares it impossible, perhaps even if they had the foresight to use Hadoop.

Jim: “Hmm. We can do the first one, I think, right Bob?” Bob: “Huh?”

Or try asking what certain column variables mean and watch as a DB-admin or an analyst tries to remember, but never refers to a data catalog (that doesn’t exist anyway). Don’t even attempt to ask if certain datasets are still current or have been rendered useless by failing to store historically contingent records — e.g. think of a product pricing catalog that lacks the historical record of applied discounts and so no longer portrays an accurate historical account of user price-driven behaviors.

Upon analysis, many of these data-rich companies have lots of what IBM calls “dark data” that, once stored, never finds a use again. Worse still, it cannot be used again due to ill-planned storage strategies to begin with.

Bob: “Yes, we have lots of data, you just can’t see it.”

The heart of the data-readiness problem is typically governance, or lack of it, meaning the absence of a systematic way of treating the data as a structured valuable asset that has both current value and future (often unknown) value. This is seldom the treatment that data receives.

My analogy is to think of opening an old notebook that you might have lying upon a shelf. Open any page and although you can somewhat make sense of the data, the context is missing to extract value in the (now) future moment: What was I doing? Why did I write this? Is this still relevant? Without this contextual data, the core data has lost its value over time. That’s hardly the definition of a valuable asset.

Put simply, if you’re treating your data in such a way, then you’re not treating it as an asset. It’s like storing flour as bread rather than as flour (that can be repurposed in the future to make things besides bread).

And, many corps who claim to have a data lake (which actually isn’t a technology, but rather a set of practices, or a “reference architecture”) have a data swamp.

Blockchain as Data

Returning to the blockchain-as-data strategic framing, it is the case that at the core of most businesses the key activity is the orchestration and marshaling of data and making sense of it. The efficacy of this process is what eventually drives value. Even with businesses whose core activity is ostensibly handling physical goods (not digital ones), we see that the handling of these goods is almost ancillary to the handling of their digital meta-data. Think of Uber as a digital logistics engine that happens to orchestrate vehicles, drivers and passengers beholden to the dictates (or availability) of data.

Indeed, this organization-as-data world view is coming to the fore with new applications of AI, namely the simulation of entire businesses as if they are a game wherein company assets (like people) are modeled as agents in the game. Platforms like Telos.ai are moving in this direction.

In many businesses, including lending, there are really only going to be two key levers to increase data efficacy: AI and Blockchain.

Business Efficiency Driven by AI and Blockchain

Here I am showing these two forces as drivers of “digital transformation” over time, meaning the movement of more and more processes to natively digital methods that can extract value from available enterprise data. (Extracting value from data should be a key goal of any digital transformation program.)

Given the state of maturity of AI relative to blockchain, I have shown AI first, but the order doesn’t matter. The reason for the asymptotic nature of both technologies is that they are ultimately fueled by data, or information. Once the theoretical information limit has been reached (at some sensible economic scale) there is no other means to improve performance because there is no more information to act upon (e.g. train an AI or validate a proof on a blockchain).

[Side note: I seldom meet data folks who understand the difference between information and data, even in a non-theoretical framing applied to analytics say. Data strategies should really be driven by information strategies.]

Seen this way, we can see how the fuel of future value in a digital enterprise is data. Once its information content runs out, there is nowhere left to go. The name of the game, for those that haven’t noticed, is to think and act data at all levels of the organization. Whereas we are trained to think of how to capture markets, it is perhaps better, or at least wise, to ask how to capture data. You can even think of business models like Uber as essentially doing just this. The current data is who has a car and where is the passenger and where does she want to go. If you can ask and answer these questions at scale, you can extract value from the information. It is easy to imagine how augmentations to this data could significantly boost its future value.

Indeed, if we took an information theoretic point of view of markets, then we can almost say that there is no other asset besides data (or information) given how far we can now orchestrate the physical world using IoT and related technologies.

But why and how does blockchain play a role here?

The reason might become clearer if we consider the following equation that I have formulated for data-driven efficiency during some of my work:

Data Efficiency Equation

Note that the dot here denotes a dot product rather than ordinary multiplication: the result is greater the more closely the two sides of the product are aligned in direction. Decentralized technologies, when properly designed and orchestrated, can improve this equation in loosely the following ways:

  1. Blockchain offers increased data velocity due to an “in-place architecture” which means we more efficiently “bring process to the data” rather than the data to the process. If the chain is utilized properly, pre-verified data is already waiting (“lying in wait”) to have its value extracted, often in nearly zero time.
  2. Blockchain (along with AI, but we shall discuss such combinations later) brings about process velocity due to smart contract “process pre-emption” — i.e. things can happen on-chain the second that the prerequisite data inputs (via events) become available. Orchestration across multiple actors is automatic.
  3. Blockchain affords increased orchestration efficiency (the dot product part) due to crypto-economic “data alignment” incentives — i.e. if properly configured as an ecosystem, the actors in the chain are incentivized to produce results (for everyone) in a timely fashion (to reap rewards). This is a more nuanced point about crypto-incentives that needs more discussion, but let’s put it aside for now.
  4. Decreased loss due to distributed ledger benefits — e.g. the data is “pre-verified” (perhaps elsewhere). What I mean by “the loss function” is some key data-associated loss that we are trying to minimize at the fundamental business level. In the case of lending, the loss function is risk (of bad loans). We can see how risk is probabilistically determined by timely and accurate information (e.g. credit-worthiness, earnings, debt-to-income ratio etc.) Risk is actually a function of information. Losses in other use cases might include patient malpractice in medicine or misplaced inventory in supply chain logistics. These are all functions of information. As such, if we can ensure better veracity and velocity of the contingent data, we can minimize the loss function.
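The remark that the dot product rewards alignment can be made concrete with a minimal sketch. It does not reproduce the efficiency equation itself; the two-component vectors and the names `data_direction`, `aligned_process` and `orthogonal_process` are hypothetical, chosen only to show that the same magnitudes yield a large product when the two sides point the same way and zero when they are at right angles.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def alignment(a, b):
    """Cosine of the angle between a and b: 1 = aligned, 0 = orthogonal, -1 = opposed."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0:
        raise ValueError("alignment is undefined for a zero vector")
    return dot(a, b) / norm


# Hypothetical "directions" for the two sides of the product.
data_direction = [3.0, 4.0]
aligned_process = [6.0, 8.0]       # points the same way as the data
orthogonal_process = [-8.0, 6.0]   # same length as aligned_process, at right angles

print(dot(data_direction, aligned_process), alignment(data_direction, aligned_process))
print(dot(data_direction, orthogonal_process), alignment(data_direction, orthogonal_process))
```

In this reading, crypto-economic incentives would act on the angle between the actors rather than on the size of either side.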

Of course, it might be easy to read the above list and insert the word “database” instead of “blockchain.” But this would overlook the key aspect of decentralization wherein all the necessary actors to make the above equation work (in today’s hyperconnected economies) can reliably transact via a single shared data fabric without the otherwise significant overheads of ensuring trust, compliance and data veracity, to name a few.

If we return to lending as our example for illustration purposes, then a key part of the information universe is a user’s credit worthiness. Credit worthiness is highly centralized via credit agencies. However, the data they use is aggregated from elsewhere and could be immediately available to lenders had the original inputs been stored on-chain.

Credit Score: courtesy of your highly centralized agency.

Let’s take wages as an example (although not strictly part of the credit rating process). If my wage slips are already stored on-chain via a trusted data provider (like a payroll processor) then a lender could consume this data immediately without any further verification steps: my pay would be cryptographically “attested” via the payroll processor. (Trusting the payroll processor is another matter, but one that is easily solvable with a variety of decentralized or quasi-decentralized mechanisms.)
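As a rough illustration of what “attested” pay data could look like, here is a minimal sketch using only the Python standard library. It is an assumption-laden stand-in: a real attestation would use a public-key signature so that any lender can verify without holding a secret, whereas this sketch uses an HMAC with a shared key purely to show the sign-then-verify shape. The record fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(record):
    """Serialize a record deterministically so the same data always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def attest(record, key):
    """Payroll-processor side: produce a tag bound to the exact record contents.

    Assumption: HMAC-SHA256 stands in for a real digital signature.
    """
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify(record, tag, key):
    """Lender side: check that the record matches the tag (constant-time comparison)."""
    return hmac.compare_digest(attest(record, key), tag)


# Hypothetical wage slip and demo key.
wage_slip = {"employee": "E-1001", "period": "2018-07", "gross_pay_cents": 500000}
demo_key = b"demo-only-key"
slip_tag = attest(wage_slip, demo_key)

print(verify(wage_slip, slip_tag, demo_key))                                   # untouched record
print(verify({**wage_slip, "gross_pay_cents": 900000}, slip_tag, demo_key))    # altered record
```

The point of the sketch is only that verification becomes a computation over data that is already there, rather than a request to a third party.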

Moreover, if these kinds of datasets are available in a shared data fabric, then we should expect that innovators will be incentivized to augment the data, such as providing supplementary services to help calculate a more nuanced or novel credit rating using ancillary data sources. It is difficult to build such models when the data is locked behind the paywalls of centralized agencies. We shall skip any discussion of the modes of incentive for innovators that crypto-token economies make possible.

If we use the two levers of AI and blockchain to map the future of digitally-extracted value, then we can use a standard quadrant metaphor, here shown for lending:

Future of Lending in an AI and Blockchain World

Banks typically sit in the lower left quadrant where they are yet to implement any meaningful digital transformation, even though the bigger banks are increasingly investing in technologies like AI.

Online lenders are, by default, more efficient out of the gate because they are natively digital and rely heavily on agile digital orchestration to achieve velocity using modern technology stacks (and APIs). Currently, little of this advantage is due to AI (directly) and practically none of it due to blockchain. Much of it is due to more efficient use of existing digital tools to orchestrate services and data.

Clearly, such an advantage is only as sustainable as the inability of banks to catch up via digital transformation. But online lenders, or other innovators, should not be content with such an advantage. The upper-right quadrant is far more enticing and valuable in the long run, so let us review what it means to get there.

Blockchain, AI and Smart Money

The real opportunities in digital finance lie in the top right quadrant where the methods offered by AI and blockchain enable entirely new modalities of competition. In this upper quadrant it is possible to imagine the future of money as something radically different from today’s world.

A sad fact of today’s financial world is that the management of financial assets and financial wellbeing is woefully inefficient. Many of us fail to spend, save and invest properly, our decisions (or, mostly, our lack of them) still guided by an incredibly ill-equipped cognitive apparatus.

Let’s face it, most of us procrastinate about investment in pensions (those who can afford to save) and a disproportionate number of people of “above average intelligence” do seemingly dumb things like subscribe to annual memberships (e.g. gyms) that prove to be uneconomical (fooled by what the economists call hyperbolic time discounting).
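For readers unfamiliar with the term, hyperbolic discounting is commonly modeled as V = A / (1 + kD), where A is the amount, D the delay and k a discount parameter. The minimal sketch below contrasts it with exponential discounting to show the hallmark “preference reversal”. The values of `k` and `r` and the dollar amounts are illustrative assumptions, not figures from any study.

```python
import math


def hyperbolic_value(amount, delay_days, k=0.2):
    """Present value under hyperbolic discounting: A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def exponential_value(amount, delay_days, r=0.2):
    """Present value under exponential discounting: A * exp(-r * D)."""
    return amount * math.exp(-r * delay_days)


def prefers_later(value_fn, sooner, later):
    """True if the later, larger reward is valued above the sooner, smaller one.

    Each option is an (amount, delay_days) pair.
    """
    return value_fn(*later) > value_fn(*sooner)


# The same one-day wait for an extra $10, viewed up close and from a month away.
near_choice = ((100, 0), (110, 1))
far_choice = ((100, 30), (110, 31))

for name, fn in (("hyperbolic", hyperbolic_value), ("exponential", exponential_value)):
    print(name, prefers_later(fn, *near_choice), prefers_later(fn, *far_choice))
```

Under the hyperbolic curve the chooser is patient about the distant trade-off but impatient about the immediate one, while the exponential chooser gives the same answer both times. That inconsistency is what an automated allocation of funds would not suffer from.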

There are solutions to all of these problems. Let’s explore one of them.

Let me return to where I started, namely my R&D interest in digital lending. What I discovered in my work with crypto-protocols is that it is possible to radically and fully decentralize lending. What do I mean by this?

I do not mean merely putting loans on-chain, like taking the promissory note (the “end-user agreement”) and storing it (and its associated data) on a blockchain via a cryptographic hash. This might be useful for limited use cases like avoiding double-spending of loan funds and ensuring a kind of audit trail. It might also be useful for pooling of loans into securities. But lending is a much more complicated business that involves myriad processes, like verification (inc. credit worthiness), servicing, investor return management and so on.
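To make the limited “hash on-chain” approach concrete, here is a minimal sketch in which only a fingerprint of the promissory note is recorded. `ToyLedger` is a hypothetical append-only list standing in for a blockchain, and the note fields are invented for illustration; nothing here reflects a specific chain or lending platform.

```python
import hashlib
import json


def note_fingerprint(note):
    """SHA-256 over a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(note, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToyLedger:
    """Append-only list of fingerprints; a stand-in for on-chain storage."""

    def __init__(self):
        self._entries = []

    def anchor(self, fingerprint):
        """Record a fingerprint and return its position in the ledger."""
        self._entries.append(fingerprint)
        return len(self._entries) - 1

    def is_anchored(self, note):
        """Audit check: does this exact note match a recorded fingerprint?"""
        return note_fingerprint(note) in self._entries


# Hypothetical promissory note.
promissory_note = {"borrower": "B-42", "principal_cents": 1000000, "apr_bps": 799}
ledger = ToyLedger()
position = ledger.anchor(note_fingerprint(promissory_note))
print(position, ledger.is_anchored(promissory_note))
```

The sketch also shows the limitation: the ledger can prove that a document existed unaltered, but it knows nothing about verification, servicing or investor returns.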

I approach the problem by treating the loan as an instrument that can stand alone on the blockchain, meaning that there are supporting protocols capable of attesting to the origination of the loan, the servicing of the loan, the payment history, and so on, all in a completely crypto-secured (Byzantine Fault Tolerant) environment.

This means that, in theory, anyone could originate a loan, fractionalize a loan (resell all or part of it), pay it down, and so on. Their ability to do so and to make claims (like being an authentic lender) is subject to cryptographic proof mechanisms that lie atop the underlying chain-proof mechanisms. Think of how Filecoin has developed protocols like Proof of Spacetime to prove that files have actually been stored.
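Fractionalization is easy to picture as bookkeeping, leaving the cryptographic proofs aside. The following minimal sketch tracks who holds what share of a loan and splits a repayment pro rata in whole cents. The class name, the largest-remainder rounding rule and the holder names are all assumptions made for illustration, not part of any protocol described here.

```python
from fractions import Fraction


class FractionalLoan:
    """Tracks ownership shares of a single loan; shares always sum to 1."""

    def __init__(self, originator):
        self.shares = {originator: Fraction(1)}

    def transfer(self, seller, buyer, fraction):
        """Resell part (or all) of the seller's share to the buyer."""
        fraction = Fraction(fraction)
        if fraction <= 0 or self.shares.get(seller, Fraction(0)) < fraction:
            raise ValueError("seller does not hold that fraction")
        self.shares[seller] -= fraction
        if self.shares[seller] == 0:
            del self.shares[seller]
        self.shares[buyer] = self.shares.get(buyer, Fraction(0)) + fraction

    def distribute(self, payment_cents):
        """Split a repayment pro rata; leftover cents go to the largest remainders."""
        exact = {h: s * payment_cents for h, s in self.shares.items()}
        payout = {h: v.numerator // v.denominator for h, v in exact.items()}
        leftover = payment_cents - sum(payout.values())
        by_remainder = sorted(exact, key=lambda h: (exact[h] - payout[h], h), reverse=True)
        for holder in by_remainder[:leftover]:
            payout[holder] += 1
        return payout


loan = FractionalLoan("alice")
loan.transfer("alice", "bob", "1/3")
print(loan.distribute(1000))
```

Exact fractions are used so that repeated resales never lose or create ownership through rounding; only the final cash split is rounded.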

Broadly speaking, this would require a suite of protocols something like the following (for illustration purposes only):

  1. Origination protocols — i.e. putting the loan on-chain and making it a transactable object (forkable, poolable) and so on.
  2. Verification protocols — i.e. KYC compliance, credit checks, distributed credit scores, payroll history, tax filings etc.
  3. Servicing protocols — i.e. payment channels, payment histories, reputation etc.
  4. Trust protocols — i.e. explicit trust mechanisms for all loan objects and actors in the network, such as how do we know a loan is real (as opposed to a loan transaction being real). One mechanism is proof of stake — i.e. put your money where your mouth is: if you post a loan on the network, then stake its returns (within certain probabilistic constraints).
  5. Payment protocols — i.e. to enable the payment channels to be coupled with loans, track remittances, return investments to investors etc.
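The staking idea in the trust protocols can be sketched as a toy registry. This is a deliberate simplification: the text speaks of staking a loan’s returns within probabilistic constraints, whereas the sketch locks a fixed fraction of principal (an assumed 10%, expressed in basis points) and either releases or slashes it. All names are hypothetical and the in-memory dictionaries stand in for on-chain state.

```python
class StakedLoanRegistry:
    """Toy registry: posting a loan locks a stake that is lost if the loan proves fake."""

    def __init__(self, stake_bps=1000):
        self.stake_bps = stake_bps   # assumed stake: basis points of principal
        self.balances = {}           # actor -> free funds in cents
        self.loans = {}              # loan_id -> lender, stake, open flag
        self.slashed_cents = 0       # forfeited stakes

    def deposit(self, actor, cents):
        self.balances[actor] = self.balances.get(actor, 0) + cents

    def post_loan(self, loan_id, lender, principal_cents):
        """Lock the required stake and register the loan; returns the stake locked."""
        if loan_id in self.loans:
            raise ValueError("loan id already posted")
        stake = principal_cents * self.stake_bps // 10_000
        if self.balances.get(lender, 0) < stake:
            raise ValueError("insufficient funds to stake")
        self.balances[lender] -= stake
        self.loans[loan_id] = {"lender": lender, "stake": stake, "open": True}
        return stake

    def resolve(self, loan_id, genuine):
        """Release the stake if the loan proved genuine, otherwise slash it."""
        loan = self.loans[loan_id]
        if not loan["open"]:
            raise ValueError("loan already resolved")
        loan["open"] = False
        if genuine:
            self.balances[loan["lender"]] += loan["stake"]
        else:
            self.slashed_cents += loan["stake"]


registry = StakedLoanRegistry()
registry.deposit("lender-1", 200_000)
registry.post_loan("loan-A", "lender-1", 1_000_000)
registry.resolve("loan-A", genuine=True)
print(registry.balances["lender-1"], registry.slashed_cents)
```

How a loan is judged genuine is exactly what the verification and trust protocols would have to supply; the sketch takes that verdict as an input.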

Of course, this is all subject to regulatory constraints, but just as with any transformational technology we expect the regulations to be stressed, tested, revised, and so on — i.e. to evolve.

The actual details of the above protocols are left out and are deliberately vague in my list because the details do not matter for this discussion. If we had time, we could survey them, but the claim here is that a digital financial landscape can be constructed, thanks to blockchain, in which the movement of money (in this case for lending purposes) is 100% automated. The technological basis for this claim is the shared availability of cryptographically proven information.

Put simply, if a borrower needs money, it just happens in the right way and at the right time because information about both their need for money and their credit health is already available. Indeed, this process would happen preemptively and automatically, which is to say that there is no longer a meaningful concept of time to close a loan: there is no closing.

The blocker to this vision is lack of information.

Let’s put it another way. Right now there are people whose financial wellbeing is non-optimal due to debt. And then there is money sitting somewhere in other people’s accounts whose financial wellbeing is also non-optimal due to lack of proper investment. In one version of a decentralized world, the one that I have researched, money could just flow between these two groups in a manner that is highly optimized. Think of the money in your wallet (or savings account) as having “awareness” of where it could otherwise be stored in order to increase its value or utility in some probabilistic sense over some time period(s).
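A deliberately naive sketch of such a flow follows. It assumes the hard parts are already solved: each funding request arrives with a trustworthy expected return, used here as a crude proxy for “efficiency of capital”. The greedy rule and every name in it are illustrative assumptions, not the protocols I researched.

```python
def allocate(idle_cents, requests):
    """Greedily route idle funds to the requests with the highest expected return.

    requests: list of (borrower, needed_cents, expected_return) tuples.
    Returns (allocations, remaining_cents).
    """
    allocations = {}
    remaining = idle_cents
    for borrower, needed, _expected in sorted(requests, key=lambda r: r[2], reverse=True):
        if remaining == 0:
            break
        granted = min(needed, remaining)
        allocations[borrower] = granted
        remaining -= granted
    return allocations, remaining


# Hypothetical requests: (borrower, amount needed in cents, expected return).
requests = [("a", 600, 0.05), ("b", 700, 0.08), ("c", 300, 0.02)]
print(allocate(1000, requests))
```

Everything interesting lies in where the `expected_return` numbers come from, which is the information problem described next.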

My point is that one (major) reason this doesn’t (cannot) occur today is lack of reliable information exchange due to the highly centralized history of financial systems that belong to special interest groups whose initial (and continuing) authority is based upon their claim to trustworthiness. Money is really data and that data is impossible to innovate with whilst it is locked in centralized mechanisms. The key to unlocking is to make trust a computable, or technological, object rather than an institutional one. This is exactly what blockchain offers.

If this is sounding too fanciful, then let us remind ourselves of some features of Bitcoin. Two key claims stand out:

  1. It is a monetary system that has no central agency or controller. The system kind of works “by itself”. This is because the information needed to transact is equally available to all parties (which can only really happen because there is no need to have a conferred source of trust). No one applies (or is verified) in order to engage with the system.
  2. It is the world’s biggest computer with no central agency or controller (unlike, say, Google’s data center goliath). (Similarly, Filecoin, even by conservative estimates, could ultimately become the world’s biggest decentralized store, bigger than Amazon S3.)

There is no technical reason why any monetary system couldn’t be configured in the same fashion. It is not necessary to have agencies who “decide” who should get a loan and who shouldn’t. In reality, their agents don’t decide anything. They follow rules that are orchestrated via IT systems. The logical conclusion then is that “money” in a decentralized data fabric simply “decides for itself” given that its flow is really governed by information, not by people (i.e. agents/bankers).

As for the world’s biggest computer, then in the case of loans it seems obvious that the world’s biggest source of loan capital could easily be a decentralized pool of money. There is no technical reason for this not to be the case.

Of course, the governance of these “smart money” decentralized flows is another issue, namely that the actual actors (e.g. borrowers and lenders) presumably ought to have some agency. For example, the money in my wallet shouldn’t just march off one day into the hands of a borrower if that isn’t something I want to happen. However, perhaps some of my money should “march off” by itself if it comes back one day with some additional returns. How do we decide when and how it should “march off” and “return”?

This is where AI comes into the configuration and is perhaps the most exciting and largely untapped potential of blockchain-powered ecosystems.

None of us can tell the future. This is particularly problematic for humans because we instead use heuristics to make future-looking decisions and these heuristics, as is now well documented (e.g. by behavioral economists) are woefully haphazard.

As humans, we make moves, like in a game of chess, loosely within some set of rules (e.g. like the rules of chess) but often without any predetermined game plan. We hope that we are optimizing for an eventual “win” of some kind, although we are usually distracted from winning by short term gains (which would amount to taking a few pieces to get the satisfaction of “winning”).

My use of the game metaphor is not accidental. This is precisely where emergent AI methods like reinforcement learning are starting to show signs of promise.

But let me return to the main pillar of this discussion: innovation with blockchain as approached via the framing or rubric of data-as-strategy.

It is easy to predict the future of smart money in the simplistic form outlined above. The details may well be wrong, but that is not the point. The point is that in thinking about the future of money in this way, or any other business value system, the real value is in capturing data and then re-imagining its value in an open, cryptographically proven environment. In other words, companies who want to dominate markets by occupying that upper-right quadrant should be asking themselves where and what is the data that will enable them to do so. That is how to skate to where the puck is going.