Powering the wisdom of the crowd

How a handful of people made large-scale citizen science easy for everyone

Every morning in a Madrid suburb, after getting his son to playschool, Daniel Lombraña González opens his laptop and logs into crowdcrafting.com, the website he runs. For five years he’s watched its users grow from tens to thousands to tens of thousands. And these aren’t users sharing cat photos. Every user is a volunteer in one of humankind’s greatest scientific advances: massive, collaborative research, powered by humans solving problems that computers can’t.

Volunteers are becoming a bigger and bigger part of scientific research, turning their cognitive surplus into world-changing contributions. They’re tackling enormous challenges: Cancer Research UK uses volunteers to distinguish cancer cells from healthy ones, and volunteers in Japan and elsewhere, carrying personal Geiger counters, have helped assemble the world’s biggest database of radiation levels.

Finding the right problems

The technology and people behind volunteer thinking, and behind Crowdcrafting in particular, are among the triumphs of open-source computing. Behind Lombraña’s slick website are layers upon layers of technical wizardry, knitted together like a complex, endlessly shifting game of Mahjong. To most of us, that’s an invisible world, the black-box Internet we’ve come to take for granted. To a geek, though, every layer tells a story of puzzle-solving, late nights, fierce debates with distant teammates, and a constant search for cash to keep the servers running.

For three years, most of the cash that keeps the site up and pays for its team has come from a small organisation based in the hills outside Cape Town, South Africa. Founded in 2001 by technology entrepreneur Mark Shuttleworth, the Shuttleworth Foundation covers the expenses of about a dozen change-makers while they fix dents in the world. It’s one of a small but growing number of funders that are reimagining social-impact investing: the aim is to invest not in a clever project, but in a very special kind of person.

When Lombraña was 10 years old, he was already learning to code. By 18, he was compiling Linux on his own machines. And in his twenties, for his PhD, he was researching a technique that evolves computer programs using evolutionary algorithms.

There is a particular magic in building machines that do our work for us, but the problem, says Lombraña, is that if we love the machines more than what we can do with them, we spend precious energy solving unimportant problems. “In computer engineering, most problems are fake ones. We’re wasting our time,” he says. Real problems kill people. Those are the ones to solve, and computers can’t do it alone.

In 2006, as a student in search of bigger, more urgent problems, Lombraña visited CERN, where he met Francois Grey and Ben Segal, two scientists working on ways to get volunteers to donate some of their home computers’ processing power to big scientific projects. To do this, they were using software called BOINC. For scientists, volunteer computing is extremely useful: if you need to process huge amounts of data, instead of investing in a supercomputer you spread the work among thousands of idling home computers around the world, connected over the Internet.

Lombraña was intrigued, and began poking around in the code to see what it could do, and how he could improve it. It was hugely exciting. But a grid of volunteer computers was still, really, just a bigger computer. He knew there was something missing: the element of creative lateral thinking that humans bring to problem-solving. The next step was obvious, but it would be a little harder to achieve: not volunteer computing, but volunteer thinking.

By 2010, with his PhD behind him, Lombraña was keen to focus on that. “I didn’t feel the [PhD] research I was doing was useful to anyone,” he explains. “So I called up Francois and said, hey, I’m looking for a job.” As it happened, Grey had an idea.


Perfect timing

Meanwhile, in an English train station a few hundred miles away, the seed of their future funding was being planted.

“I remember being in my car at the station, on the phone with David Jones, waiting for Mark Surman to get off the train,” says Helen Turvey, CEO of the Shuttleworth Foundation. Surman was the Executive Director of the Mozilla Foundation, one of the most influential open-source-software organisations in the world, and a leader in organising large groups of volunteer thinkers. David Jones was working on ways to make climate science more trustworthy by making the code and data behind it more transparent and accessible. They had a lot to talk about.

“It was at the time when I’d just heard the word Zooniverse,” she says, referring to an early crowd-sourced-research website. Zooniverse was just six months old, and had grown from a smaller project that let home users help astronomers catalogue galaxies. “I had no idea what it was,” Turvey recalls, but this new idea of people-powered research was clearly growing fast. “David thought open educational resources and open access was super, but if science wasn’t reproducible, and if data wasn’t reproducible, then we’re screwed. And I remember sitting in my car at Winchester Station thinking: Wow! This is our moment in time! We need to do something. Academe and research is this ivory tower that people can’t interact with.”

And just about then, a video from Francois Grey, pitching for funding, arrived in her inbox: a bold plan to get private individuals creating new scientific projects.

“I remember watching this video,” Turvey recalls, “and turning round to my mom and saying ‘Hey, come here and watch this.’ It was so powerful. This vision for what [citizen science] should be and how it should be and why it should be. It was incredible.”

Grey knew that it was finally possible, as he puts it, “for people to contribute in really meaningful ways to science — not just learn about science, actually do science.” NASA, for instance, had already shown, in a project called Clickworkers, that a crowd of laypeople spotting craters in pictures of Mars is — when combined — far more accurate than an expert working alone.
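One simple way to picture “combining” a crowd’s answers is a majority vote: several volunteers look at the same image, and the most common answer wins. The sketch below is only an illustration of that idea, not a description of how NASA’s Clickworkers actually processed its data; the function name, task ids and labels are all made up for the example.

```python
from collections import Counter


def aggregate_by_majority(answers_by_task):
    """Return the most common answer for each task.

    answers_by_task maps a (hypothetical) task id to the list of answers
    that different volunteers gave for that task. If two answers are tied,
    the one that was submitted first wins.
    """
    consensus = {}
    for task_id, answers in answers_by_task.items():
        if not answers:
            continue  # no volunteer has looked at this task yet
        consensus[task_id] = Counter(answers).most_common(1)[0][0]
    return consensus


# Illustrative data only: three images, each shown to up to three volunteers.
answers = {
    "image-1": ["crater", "crater", "no crater"],
    "image-2": ["no crater", "crater", "no crater"],
    "image-3": [],
}
consensus = aggregate_by_majority(answers)
```

Here a single volunteer’s slip on "image-1" is outvoted by the other two, which is the intuition behind showing the same task to many people.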

What excited Grey, however, was not the idea of getting more US and European academics (“fairly nerdy people”, as he puts it) poring over photos together, but ways to involve a billion computer users spread across China, India, South America and Africa in solving real problems for researchers working in those places. And not just the grown-ups. Even schoolchildren could play a part, learning as they contribute.

Laypeople and researchers were going to need tools to do this. And scientists would need to know how to set up these projects. The place Grey would do that would be the Citizen Cyberscience Centre (now the Citizen Cyberlab), a joint venture of CERN, the University of Geneva and UNOSAT.

Karien Bezuidenhout, the Shuttleworth Foundation’s COO, remembers that time well. “If we were going to reimagine the world using free and open-source software principles,” she says, “that didn’t mean trying to get a little bit more transparency. It meant getting it out into the world for people to get their hands on, and being able to contribute and participate and engage and drive.”

“There’s an element to this of creating a sense of ownership and awareness with citizens about what science is, why it matters. And that’s not a thing that universities do in their ivory towers. It’s something that’s useful, immediate, relevant, and that you can participate in. And that act changes individuals.”

Not only was Grey’s vision compelling; his pitch had come at just the right time: the Foundation was ready to invest. A few months later, his funding was in place, and now he was going to need a team. Lombraña’s phone call had come at just the right time, too.


PYBOSSA

Grey’s team were technologists, but they knew that people would be the heart of their challenge and their success. Fancy computing alone wouldn’t entice volunteer thinkers. The team would have to travel to the people they wanted to reach, and train them. By 2011, they had run workshops on five continents, in Beijing, Rio de Janeiro, Berlin, New York and Cape Town, mostly using software called BOSSA, which let developers create online tasks for laypeople to complete. It was technical software, which meant they needed developers in the room. But it worked, and they could get stuck into some real challenges.

It was quickly clear how citizen-science projects could be far more diverse than spotting craters in photos. In New York, participants figured out ways to track the private-jet carbon footprint of the mega-rich. In Rio, they worked on monitoring deforestation. In Cape Town, teams developed ways to digitise Bushman history and identify breeding grounds for malaria. But the participants were running with ideas faster than the tech could follow: the BOSSA software was powerful, but too difficult to set up. It was one thing to get volunteers to complete tasks, but a pity that only technical folk could create those tasks in the first place.

In November 2011, the team gathered in Cape Town at the African Institute for Mathematical Sciences (AIMS). Grey and Lombraña were there, and so was another Foundation fellow, Rufus Pollock. An impressive programmer and organiser himself, Pollock had been working on making government data accessible to citizens.

For some months he had been working with Lombraña on a successor to BOSSA, and they were excited to show it to the assembled crew. They’d completely rebuilt much of it, and made it easier to use. They named their new creation PYBOSSA, referring to the Python language it was now written in.

When they presented it, the non-techies in the room were a little nonplussed. They hadn’t been able to participate in setting up the projects technically anyway, and this still looked pretty daunting. But the geeks in the room knew Pollock and Lombraña were onto something special. PYBOSSA was a big step towards the holy grail for crowd-sourced research: online projects that anyone could set up without typing a line of code.

If you’d told them then that it would take four more years to get there, they might have been disheartened. Though, really, they wouldn’t have believed you anyway. Self-starters run on big ideals and healthy self-delusion.

They kept working on PYBOSSA, and, eventually, set up crowdcrafting.com to show what they could do with it. When Grey’s fellowship ended, he and Lombraña wondered how they’d keep it going. It was still small; it was still hard for a non-technical person to set up a project; but it was promising. “Francois and I discussed the idea of pitching to the Shuttleworth Foundation to keep it running,” Lombraña says, “and three years later I’m here talking with you.”

Today it really is possible to set up a science project on Crowdcrafting without typing a line of code.


Solving for real life

Early software is like theory: it solves a very particular problem according to a particular model. In theory — and therefore in much of academia — software is perfect, because in theory you don’t have to deal with people, who are infinitely complex and unpredictable. To Lombraña, that made academia unbearable. Solving for real life was much more rewarding.

“I don’t see the point of simplifying the problem just to solve it,” he says, “because in the end [the real problem] will never be solved. What you are facing is not the real problem, so when you try to apply your solution it fails all the time. So that’s why we’ve kept developing PYBOSSA and moving it forward.”

“PYBOSSA is the engine,” he explains. “You need to put it in something that runs. Crowdcrafting is our [PYBOSSA] showcase.” Lombraña’s company, Scifabric, now specialises in setting up large scientific projects for institutions like the British Museum. They start with the PYBOSSA engine and then build the institution’s own project around it.

He also works with Foundation fellows on their crowd-sourcing projects: detecting fog for data-transmitting lasers, transcribing interviews, or gathering intel on the oil and gas industries.

For Lombraña, everything is a work in progress. And he knows crowd-sourced research has its critics. There have been cases in other systems, for instance, of unknown users sabotaging the data. Does that worry him?

“I think it’s a fair question, in the sense that you’re actually trusting the public, and you don’t know what’s going to happen.” He’d just come from a meeting with academic researchers where the question of vandalism had come up. “One of our strengths is that we don’t hide anything. We are fully transparent. Not only the software. The projects, the data.” And then there are ways to detect bad behaviour among users. “You can send periodic tasks that you know the answer to, just to check that they are not doing stupid things, and then you can validate based on that.” On Crowdcrafting, for instance, they had to hide the points leader board because users were gaming the system — clocking up fake tasks to score points — just to climb the board.
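The check Lombraña describes, mixing in tasks whose answer is already known and validating contributions against them, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PYBOSSA’s API or Crowdcrafting’s actual validation logic: the function names, the tuple layout and the 0.8 accuracy threshold are all hypothetical choices made for the example.

```python
def gold_task_accuracy(submissions, known_answers):
    """Score each volunteer on the check tasks whose answer is already known.

    submissions: list of (user_id, task_id, answer) tuples.
    known_answers: dict mapping a check-task id to its correct answer.
    Returns a dict mapping user_id to the fraction of check tasks that
    user answered correctly. Users who never saw a check task are left out.
    """
    seen = {}
    correct = {}
    for user_id, task_id, answer in submissions:
        if task_id not in known_answers:
            continue  # an ordinary task: nothing to check against
        seen[user_id] = seen.get(user_id, 0) + 1
        if answer == known_answers[task_id]:
            correct[user_id] = correct.get(user_id, 0) + 1
    return {user: correct.get(user, 0) / seen[user] for user in seen}


def trusted_submissions(submissions, known_answers, min_accuracy=0.8):
    """Keep ordinary-task answers only from users who pass the check tasks.

    Assumption for this sketch: users with no check-task history are
    treated as untrusted until they have answered at least one.
    """
    accuracy = gold_task_accuracy(submissions, known_answers)
    return [
        (user_id, task_id, answer)
        for user_id, task_id, answer in submissions
        if task_id not in known_answers
        and accuracy.get(user_id, 0.0) >= min_accuracy
    ]


# Illustrative data only: "gold-1" is a check task with a known answer.
known = {"gold-1": "yes"}
subs = [
    ("ana", "gold-1", "yes"),
    ("ana", "t1", "crater"),
    ("bob", "gold-1", "no"),
    ("bob", "t1", "none"),
    ("cy", "t2", "crater"),
]
accuracy = gold_task_accuracy(subs, known)
kept = trusted_submissions(subs, known)
```

A real project would probably weight answers rather than discard them outright, and would decide separately how to treat newcomers; the hard cut-off here just keeps the idea easy to follow.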

Lombraña sees a silver lining. “People tend to see [vandals] as a problem, and I see them as part of the solution. They are helping our software become bulletproof. Without them, it would be too perfect. So I always say that these people really are welcome. They force us to think of ways to work with them. Otherwise it’s just unicorns and rainbows. And that’s not the real world.”


Solving real problems

Since Helen Turvey’s fortuitous conversation in an English train station, the Shuttleworth Foundation has found itself funding several citizen-research projects. One of those is Safecast, a remarkable organisation tackling a terrifying problem: detecting radiation. Its co-founder, Sean Bonner, had been visiting Japan for years, and he was in Los Angeles when a massive earthquake hit the country in March 2011, triggering a major radiation leak at a nuclear plant in Fukushima.

“The natural thing is that you reach out to your friends and find out if everyone’s okay,” he explains. “So I’m pinging my friends in Tokyo to find out what’s going on.” Within 24 hours they knew about the radiation leak, so they opened up a Skype chat, pulling in about twenty people to figure out what they could do. “And what became obvious and shocking was that there was no source of public radiation data available. So that was the big question that everybody had: what are the levels? And there was no way for anybody to find out.” There was no sensor network of any kind in place.

“So let’s just get a whole bunch of Geiger counters out to people that we know in these areas, get them to go collect a bunch of data, and we’ll publish it.”

Within a few months, their little initiative had grown into an organisation publishing thousands of data readings from across Japan. “And that’s when people started talking about our data and saying, ‘Wait a second, this is telling a much more detailed story.’ And it became clear that there was no way to put up with vague data any more.” Government data in particular was vague, and impossible to audit. “We started painting a much more vivid picture of what the situation was. We published data that the evacuation areas were wrong, prior to the Japanese government publishing that data.”

Today, the Safecast dataset includes over 50 million data points. According to Bonner, the Japanese government dataset is about a hundredth of that size.

The scale of Safecast’s data doesn’t happen without deliberately making everything open, easy to find, and credible. “Every data point is traceable all the way down to the individual device. So you can look at any piece of data on any data set and see how it was collected and when it was collected,” Bonner explains. “But with all of the official government datasets, all of that information is considered a national secret. So you can’t get access. You can’t find out from the government what device they used to take the reading. You just have to trust them that it’s good.”

Like Lombraña, Bonner finds himself collaborating often with other Shuttleworth Foundation fellows. With every fellow’s work founded on open-source principles, sharing comes naturally. Johnny West’s OpenOil just took Safecast radiation detectors to install in uranium mines. Luka Mustafa, a Foundation fellow working on open hardware in Slovenia, is developing a new design for the detector that’s easier to assemble. And Gavin Weale, a Foundation alum who runs a media agency, is bringing detectors to Gauteng, South Africa, to work with young people in measuring radiation from mine-water drainage.

That kind of collaboration, the Foundation hopes, can multiply the effect that smaller funders can have in the world. Their team spends much of their time keeping fellows talking to and supporting one another.

And would they fund other people working in citizen science?

“They’d need to solve a problem that nobody’s solving,” Bezuidenhout reckons. “I try not to imagine the kinds of projects we’d fund. But imagine someone comes and says ‘Now the key problem is the rise of water levels in the polar regions and penguins can report that on smartphones!’ I’m probably going to go: ‘Yeah, that’s interesting. It’s something that few people are actually measuring in real time. And, plus, penguins have never had cellphones before!’”

Turvey agrees. “We will absolutely wait until we find ‘the right person’. Lots of people can do something in citizen science. I’m looking for the person to do the thing that I think is most engaging and most potentially mind-shifting in citizen science. We funded [Daniel’s work] because it was Daniel, not because we were looking for someone to do citizen science.”

“You could have knocked me down with a feather when Daniel came up on our screens and was interesting,” she says. “We’d done open educational resources, we’d done open access, we’d done citizen science, and then all of a sudden there was this manifestation that was extraordinary.”


Lombraña’s Shuttleworth funding comes to an end this month: the runway is up. Now it’s up to him and his company to keep Crowdcrafting up and running, funded by the people and institutions that use it. That is a whole new journey, and one he’s been preparing for, developing projects for paying clients like the British Museum. At a conference recently, that team showed off replica Bronze Age artifacts 3D printed from designs co-created by volunteers. Volunteers had also transcribed over 30,000 index cards about the artifacts, creating “one of the world’s largest electronic archeological databases.” This work brings museum collections out of the basement and into the public forum that is the Internet, and changes the game completely.

More and more institutions are discovering the power of volunteers, and unlocking it with tech like PYBOSSA. But Lombraña would be the first to admit: it’s all and only because the greatest tool we have at our disposal is, still, the human mind.


Arthur Attwell is a social entrepreneur and Shuttleworth Foundation alum. His publishing company, Fire and Lion, consults to the Foundation.