Obituary of a browser-based supercomputer

Pedro Fonseca
9 min read · Sep 22, 2017


In 2012, we set out to answer the question: what would happen if we could connect all the web browsers in the world into a distributed supercomputer? The answer involved genetically modified mice, flowshop optimization, forest fires, parallelizing Excel, Alzheimer's research, rendering pokeballs, searching for the Higgs boson, and Indian porn websites.

Two years ago today, we turned off our HTML5 supercomputer, named “CrowdProcess”. That was also the name of the company, which has since completed a long pivot towards building the Credit Risk AI that made it the success it is today.

So why did the distributed computer on browsers not work? After all, technically, the idea seemed brilliant: every computer has a lot of computing power, and supercomputers are expensive, so let's use web workers and create a virtual supercomputer. This was 2012: AWS wasn't as ubiquitous, and GPUs weren't yet the silver bullet for processing that they have since turned out to be. We could see millions of websites monetizing through ads, and tons of CPU being “wasted” by machines that were only being used to watch kittens. On the other hand, there were clearly companies willing to pay good money for that CPU.
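The core mechanic can be sketched in a few lines of JavaScript: split an embarrassingly parallel job into independent chunks, ship each chunk to a visitor's browser, and merge the partial results. A minimal sketch, with illustrative function names (not CrowdProcess's actual API) and the chunks mapped sequentially where the real platform would post each one to a Web Worker on a publisher's page:

```javascript
// Sketch of an embarrassingly parallel job, as browser-grid platforms
// framed it: a pure kernel, chunked inputs, merged outputs.

// The "kernel" every browser runs: it must be pure and self-contained,
// since it gets serialized and shipped to untrusted, anonymous nodes.
function kernel(chunk) {
  return chunk.map((x) => x * x);
}

// Split a job into fixed-size chunks, one per compute node.
function split(data, chunkSize) {
  const chunks = [];
  for (let i = 0; i < data.length; i += chunkSize) {
    chunks.push(data.slice(i, i + chunkSize));
  }
  return chunks;
}

// Merge partial results back in order. A real platform also needs
// redundancy and result voting, since any node can vanish or lie.
function merge(partials) {
  return partials.flat();
}

const results = merge(split([1, 2, 3, 4, 5, 6], 2).map(kernel));
console.log(results); // → [1, 4, 9, 16, 25, 36]
```

In the browser, `split` and `merge` run on the coordinating server, while only `kernel` is shipped to each visitor's Web Worker via `postMessage`.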

We launched in February 2013 in full Silicon Valley flair. Websites bought our dream in large numbers, and signed up to offer their visitors' computing power, deferring the rewards for when we found product-market fit. Curious geeks signed up to use the platform when it became available, and we were able to hire some of the best JavaScript engineers out there. We were going to be the Google Adwords of CPU, as I told a room full of Google engineers in their HQ in Mountain View, at the height of our young hubris. They were not particularly scared.

Problems started when our hypothesis of the world first ran into reality. Firstly, most companies who need CPU and have money to pay for it also need privacy, which we couldn't offer. Still, we found users who were willing to write embarrassingly parallel tasks to run on the platform, and we went out to look for those who would be willing to pay for the privilege. This is a screenshot from our deck in 2012, predicting we'd run oil and gas data, medical research, rendering, and general research tasks on the computers of people looking at kitten websites:

Kittens and their applications

The funny thing is, we ended up 3 for 4 on this slide. We rendered pokeballs with a ray-tracing app, making them photo-realistic with a gigantic speedup.

Pokeball rendered with the HTML5 ray tracer

Researchers from the Champalimaud Foundation, a cancer research institute, used our platform to track images of mice in their experiments, running frame-by-frame computer vision analysis (for the geeks out there: we transcompiled OpenCV with emscripten to do this).

A genetically modified mouse runs around, as thousands of nodes compute the position of the mouse at each frame

Our engineers built apps that used Monte Carlo methods to predict the spread of forest fires, and we struck partnerships with companies with fire-surveillance equipment.

Tens of thousands of compute nodes try to compute the most likely line of advance of a forest fire

Researchers at CERN used our platform for experiments related to finding the Higgs Boson (which, if I’m honest, I didn’t entirely understand, and can only pray they weren’t mining bitcoins instead).

Among the websites that joined were many small independent websites (wave.cat!), but also large news outlets. At one point, over 50% of our traffic came from Indian “male entertainment” websites, which were actively processing computations from the CERN labs. It was technical madness, and the heyday of HTML5 distributed computing. The mission of making computing power available to everyone seemed to be going brilliantly.

There was only one thing all of our “clients” had in common: none of them were willing and able to pay us. They weren't, therefore, clients. The reason was obvious, in hindsight: if you have enough money to pay, you will use it to pay for privacy. Realizing this, we attempted to correct the problem by building “internal grids”, which would run on all browsers inside an organization. That opened the door to advanced healthcare applications, and we ran sequence alignment tasks for a US hospital, whose researchers wanted to sequence everyone who walked through the door, to ensure that no doctor ever tried to guess diseases from the color of phlegm again. A researcher in Lisbon started using the platform for research in Alzheimer's, publicly claiming that he was getting a 150× speedup by using the platform. We were again bursting with technical pride. And running low on cash.

Having privacy and computing power, we turned to where the money was: finance. We developed a tool for running Monte Carlo simulations at scale. Faced with the fact that our clients wanted their interface to be Excel, we made a plugin for Excel that could run computations on tens of thousands of nodes. The tool was used by private equity firms to evaluate large corporates, and by banks to stress-test their budgets.
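Monte Carlo simulation is itself embarrassingly parallel: every trial is independent, so trials can be sharded across nodes and the statistics merged at the end. A toy version of the pattern, with made-up default probabilities and exposures purely for illustration (not any client's model):

```javascript
// Toy Monte Carlo: estimate the expected loss of a loan portfolio by
// running many independent trials. Because each trial is independent,
// this shards trivially across thousands of browser nodes, each
// running a slice of the trials and reporting back a partial sum.
// All numbers below are illustrative, not real data.

const portfolio = [
  { exposure: 1000, defaultProb: 0.02 },
  { exposure: 5000, defaultProb: 0.05 },
  { exposure: 2500, defaultProb: 0.1 },
];

// One trial: each loan defaults independently with its probability,
// contributing its full exposure to the loss when it does.
function simulateLoss(loans, rand = Math.random) {
  return loans.reduce(
    (loss, loan) => loss + (rand() < loan.defaultProb ? loan.exposure : 0),
    0
  );
}

// Average over n trials; the estimate converges on the analytic
// expected loss (1000*0.02 + 5000*0.05 + 2500*0.10 = 520).
function expectedLoss(loans, trials) {
  let total = 0;
  for (let i = 0; i < trials; i++) total += simulateLoss(loans);
  return total / trials;
}

console.log(expectedLoss(portfolio, 100000).toFixed(0));
```

The Excel plugin's job was essentially the fan-out around `expectedLoss`: take the model from the spreadsheet, distribute the trial slices, and write the merged statistics back into cells.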

During this phase we were so open to all projects (and, to be honest, so all over the place) that the emergence of ConvNetJS and conversations with AWS researchers convinced me that we should attempt to be the computing backend for training machine-learning classifiers. After attempting (laughably) to hire the author of ConvNetJS (who instead took a job as head of AI at a boring startup called Tesla), we launched head first into trying to build this machine-learning library ourselves.

Meanwhile, 9000 km away, in Lisbon, my co-founder found a bank that was willing to try advanced machine learning models to predict who would default on loans. We got a stunning 30% reduction in default rates. James, the narrow AI who would 2 years later become the name of the company, was trying to be born. He would later go on to replace the dream of an HTML5 supercomputer.

That success was the precursor to our re-focusing into finance. The team churned at an alarming rate as the technological focus and market segment changed, and money ran low. It seemed like a hallucinated obsession on the part of the founders that we were going to build a machine-learning application for credit risk, when tens of other attempted applications had failed. By the time João Jerónimo, the initial mind behind the HTML5 supercomputer, announced he was leaving, the team that had once had 12 employees was reduced to 4 people, a fanatical obsession with finding product-market fit, and the first draft of James.

James was the precise inverse of the HTML5 supercomputer. It responded to a very clear market need: reducing defaults in credit risk, and making better predictions of who would turn out to be good credit. It was based on solid principles, a strong academic community with a culture of open source, and working directly with clients. At every chance we got, we worked within the clients' offices, and understood their needs, their culture, their concerns. We tested every assumption ahead of time with paper prototypes. Days after he arrived, our new CTO asked for permission to unplug the supercomputer for good. I gasped in horror at the idea. Then agreed.

2 years after that, James is not only the product but the name of the company. It was considered the best European fintech by Money2020, and boasts among its team the ex-CEO of Credit Suisse (board member), the ex-COO of Deutsche Bank, and another 27 highly motivated, dedicated professionals. We are on the way to building the first narrow AI for credit risk, bringing the best of data science into every risk department.

So… what did we learn?

  1. The easy thing to say at this point would be a cliché such as “don't build startups that are based on a technology looking for a problem”. However, I don't quite believe that. In fact, I believe the best startups come from founders who have an unusual knowledge of a domain or technology, and who are comfortable crossing over to a new domain. I have learned the hard way that this needs to be coupled with a strategy for finding that problem. Our approach (mainly driven by my own inexperience as CEO) was to try to do everything we got a chance to do, and then see if it brought us product-market fit. Unmatched with any strategy for validation, this led us to making huge efforts on hypotheses that had very little chance of success. By being too much in love with the technology, we drastically over-extrapolated the commercial opportunity of each hypothesis, and mistook curiosity for commercial interest.
  2. There is a danger in pre-product-market-fit startups getting involved with late-stage VCs. Part of what we saw as validation was the fact that the major Silicon Valley VC firms (NEA, Greylock, Balderton, etc.) seemed to have their doors open to us, and not to other people. This is natural: our insane breadth, and the ferocious technical talent of the team, meant that we were worth keeping an eye on. I spent many sunny California afternoons discussing the HTML5 supercomputer with the ex-CEO of Mozilla and the ex-CTO of Sun Microsystems (both VCs), mistaking their curiosity for genuine interest, and indirectly misleading my own team into believing they validated our hypotheses. In my inexperience, I took that as a clue that we were on the right track, and mistook VC praise for market validation. The reality was quite different: the market was telling us we were interesting, not that we were commercially viable. None of the Silicon Valley VC firms ended up investing, and I wasted a considerable amount of time on them, instead of chasing the market.
  3. The opportunity that ends up giving you product market fit can come from the most unlikely source. The chance that ended up taking us into credit risk was one that I had at one point given for lost, ready as always to go chase the next shiny new use case. My co-founder João Menano pushed for us to take the banking chance, got a contract signed, and in doing so set us on a path that would lead to today.
  4. Sales should precede product. Our approach was to build prototypes, then go look for clients for them. With an excessively engineering-centric approach, many of us considered presenting a deck with something we didn't yet actually have to be intellectually dishonest, so we built everything in order to show it. That was expensive, slow, and led to a huge number of wasted prototypes. The reality is that the only way to actually approach the market is to pre-sell the idea.
  5. To end on a bit of a cliché: culture and mission are crucial. Finding product-market fit requires having people with thick skin, who are dedicated to the mission. It requires the founders to be able to argue in the best of times, and support each other in the worst of times. It actually requires founders to spend a lot of time (and get drunk and personally vulnerable) together. It requires telling potential employees up front that they are going into something that has a very small chance of ever succeeding. It requires telling investors up front what kind of a crazy ride they are in for. We're deeply lucky that our investors put up with us for more than 2 years before we figured out our true calling. We hope they now find their trust in our abilities well placed.

But I digress. This is, as mentioned at the start, an obituary. Not for a failed startup (because James is now a considerable success), but for a failed idea. There were merits in the idea of an HTML5 supercomputer. There was a potentially world-changing vision. There were founders, employees and investors crazy and brilliant enough to chase it to every corner, to run around with genetically modified mice, to help Alzheimer's researchers, to fight forest fires, to sequence genomes, to render pokeballs, to parallelize Excel, and, in the end, to pivot away from all that into the fantastic world of James, the Credit Risk AI.


Pedro Fonseca

Ex-googler, ex-AI critter, figuring out the arts world