The Pretence of Data

Nicolas Terpolilli
Aug 9, 2017


What does smart stand for in Smart City?

“You can plan a pretty picnic
But you can’t predict the weather”

Ms. Jackson, OutKast

What centralized entity could decide to bring a gas station in the middle of the desert? 🤔

In 1974, F.A. Hayek received the Nobel Prize and gave a famous acceptance speech, The Pretence of Knowledge. It was a brilliant demonstration of our natural inclination to overestimate our ability to understand the world and society. Imitating the physical sciences often means hiding that lack of understanding behind math formulas and complex science, an attitude Hayek called “scientism”. Social interactions put many more variables at stake than most physical phenomena. His thesis was that, when dealing with social matters, the true forecast is done organically by the system itself.

The world of data is not short of pretence. That is why it is crucial to understand more precisely how to build ecosystems and to decentralize data power. That is why every data producer should ask him or herself how smart the organization they have designed is.

How do cities become smarter?

The Smart City is still one of the most poorly described concepts. Wikipedia’s entry is particularly striking: I do not believe that bringing information and communication technologies into an organization will make it smarter. We now have around fifty years of experience, which is enough to be sure that things rarely go that smoothly. A city is not smarter because beacons were installed at bus stops or because any cab can be tracked from the airport to the city center thanks to hundreds of CCTV cameras.

The article goes on:

“ICT is used to enhance quality, performance and interactivity of urban services, to reduce costs and resource consumption and to improve contact between citizens and government”

I don’t fully disagree with that definition, but it still feels like the Efficient City, not a Smart one. Let’s agree that this is a much more difficult topic than expected. A large part of that complexity comes from the fact that a myriad of different actors are involved in the Smart City market, and each actor often pulls in its own direction. The people in charge at the city level end up lacking both the time and the resources to fully grasp the ecosystem and make the most of it.

Cities are already remarkably smart organisms. Cities are already one of the most anti-fragile ecosystems.

“Cities do not slow down as they get bigger. They speed up with size! The bigger the city, the faster people walk and the faster they innovate. All the productivity-related numbers increase with size--wages, patents, colleges, crimes, AIDS cases--and their ratio is superlinear. It’s 1.15/1. With each increase in size, cities get a value-added of 15 percent. Agglomerating people, evidently, increases their efficiency and productivity.” Why Cities Keep on Growing, Corporations Always Die, and Life Gets Faster, Geoffrey B. West.
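To make the arithmetic in that quote concrete, here is a minimal sketch. It assumes the “1.15” is the exponent of a power law, that is, output = baseline × population^1.15. That reading, the function names and the example city sizes are my assumptions, not something the quote spells out.

```python
# Sketch only: reads the quoted "1.15" as a power-law exponent (an assumption).

def scaled_output(population: float, exponent: float = 1.15, baseline: float = 1.0) -> float:
    """Total output (wages, patents, ...) of a city under power-law scaling."""
    return baseline * population ** exponent


def per_capita_gain_on_doubling(exponent: float = 1.15) -> float:
    """Factor by which output per person changes when the population doubles."""
    return 2 ** (exponent - 1)


if __name__ == "__main__":
    small, large = 500_000, 1_000_000  # hypothetical city sizes
    ratio = scaled_output(large) / scaled_output(small)
    print(f"Doubling the population multiplies total output by {ratio:.2f}")
    print(f"Output per person is multiplied by {per_capita_gain_on_doubling():.2f}")
```

With an exponent of exactly 1, the per-person factor is 1: the city would merely be bigger, not more productive. Anything above 1 is what “superlinear” means here.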

Since a definition is hard to pin down, I will first explain what the Smart City is not, then offer a definition and some kind of roadmap. This is an empirical approach and should be considered as such. Plus, if what I describe as not being a Smart City looks appealing to you, there is no shame in not wanting to build a Smart City.

The watchtower temptation

Since cities are complex and smart, the first reflex when beginning a smart city initiative is to try to gather all the data that is not yet available, for monitoring purposes.

When there is money to do it, it ends up like IBM’s Rio de Janeiro “smart city”: basically, a high-tech control center filled with tons of screens. It is impressive, but it is definitely not smart. There are four reasons why that kind of project takes shape:

  1. They are easy — yet expensive — to design and develop
  2. They bring a really pleasant sensation of control
  3. They are sold well by established vendors
  4. As Pasquale Cirillo says, they allow the people at their head to play politics.

Smart city design

For any data science to be powerful it needs to answer a question. And the alliance between mathematics and technologies is actually extremely good at answering big and complex questions.

But if technology is neither sufficient nor necessary, where does the smartness come from?

There is a massive bottleneck in the design of the centralized control center smart city: who asks the questions.

You may have thousands of sensors, gather terabytes of data and have brilliant data scientists on your team, but your only results will come from well-asked questions. And there is no way a team of about thirty people in a tower can ask the questions that will improve the lives of millions of people. That is precisely why projects of that kind generally end up optimizing traffic lights or producing reports on air quality. That is still interesting, but it is frustrating given the potential of the data. It is disappointing because both the scope of research and the means of action are too limited.

A little aside about the Smart Phone: the key feature of the Smart Phone is to allow a variety of third-party apps to run on its system. Neither Apple nor Google ever thought they could imagine every app wanted by the almost 3 billion smartphone owners in the world. They have developed an ecosystem around a technology.

Let me then suggest a definition of a Smart City:

A Smart City is a city that leverages information and communication technologies to maximize the number of problems addressed among those faced by its users (the citizens), and that provides infrastructure to distribute the means of action.

Though I wasn’t satisfied with its definition, I do like the idea of smartness because you can be smart in a lot of different ways: I would be just as proud whether my daughter or my son followed the path of Einstein or Dostoyevsky, of James Brown or Grothendieck. Every city can bring its own sauce to the way it grows smarter. Every city can make its own choices, compare what works best and focus on its own development strategy.

In data production, small is beautiful

That tension between efficiency and smartness is absolutely key, not only in city management but in the whole economy. The two are not mutually exclusive, but the tension is strong.

Carlos Ghosn vs Elon Musk

In a mature market, the companies that thrive are the ones that manage to be more efficient than their competitors. It is not an easy task. There are a lot of theories on how to do that, Japanese lean management being the most famous.

Clayton Christensen, author of The Innovator’s Dilemma

Someone like Carlos Ghosn, Chairman and CEO of French car manufacturer Renault-Nissan, is quite impressive at doing so. But no matter how efficient and effective his company has become, he couldn’t have built a Tesla.

The Innovator’s Dilemma details why efficiency, and the huge inertia that comes with it, becomes a burden when trying to develop new things.

Data management in cities is much more mature than we might imagine. Almost every city in western countries has a Geographic Information System (GIS) and people working on data. Work has been done over the last ten years to make that management more efficient.

Developing a Smart City implies creating new services and new ideas. It implies finding new Tesla Motors. The easiest path in doing so is to rethink the data distribution.

Production Never Scales

Whether you think of content, cars or ideas, production never actually scales because production implies huge operational expenditure (OPEX). A consulting firm has to spend consultants’ time developing content that will not be reusable in the next project. It does not scale. It is possible to optimize, build some tools and reach a size sufficient to become huge in the market, as McKinsey or BCG did, but there will never be a winner-takes-all kind of scale.

Compare that to Google: there were huge capital expenditures (CAPEX) at first, because building a search engine is not easy. The first search on Google cost millions, whereas the marginal cost of today’s searches is really close to zero.
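The contrast between the two cost structures can be sketched with a few lines of arithmetic. This is an illustration only: the `average_cost` helper and every figure below are made up for the example, not real numbers for any firm.

```python
# Sketch only: all figures below are made up for illustration.

def average_cost(units: int, fixed_cost: float, marginal_cost: float) -> float:
    """Average cost per unit: the fixed cost is spread out, the marginal cost is not."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + marginal_cost


# Hypothetical consulting firm: little upfront investment, costly consultant time per project.
CONSULTING = {"fixed_cost": 10_000.0, "marginal_cost": 50_000.0}
# Hypothetical search engine: huge upfront investment, almost free extra queries.
SEARCH = {"fixed_cost": 5_000_000.0, "marginal_cost": 0.001}

if __name__ == "__main__":
    for units in (1, 1_000, 1_000_000):
        print(
            units,
            round(average_cost(units, **CONSULTING), 3),
            round(average_cost(units, **SEARCH), 3),
        )
```

In the OPEX-heavy case the average cost barely moves however many projects are delivered, while in the CAPEX-heavy case it collapses toward the marginal cost as volume grows. That asymmetry is what “scaling” refers to here.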

What makes the internet powerful at producing content is the fact that everyone plays a part. What makes Google thrive and scale is that it is good at distribution, not production.

If you want to learn about how a company can transform itself to scale, read Nicolas Colin’s 11 Notes on Goldman Sachs.

Just to be precise, I do not think there is a hierarchy between scalable and non-scalable organizations. Consultants are most of the time unbelievably smart people, and non-scalable firms have given jobs to our whole societies. I don’t want to rank them but rather highlight the differences between the systems.

Why is there no Facebook for Data Science?

Data Science, the art of producing an answer to a question thanks to mathematics and computer technologies, does not scale either. The technology can, but there is huge OPEX in the time spent by the data scientist. This has two consequences:

  1. there is no organization able to provide a Data Science service that scales,
  2. at a city’s level, however good a team is, it will reach a limit in the number of issues it can tackle.

People pretending they have a solution to fix every problem with data should be listened to with caution. Complex systems will not become smarter thanks to a centralized team of smart people.

So to conclude let’s try to suggest a method to avoid Data Scientism and build up a Smart City from, well, a city.

Wisdom of crowds is a design issue

So, to paraphrase Hayek, how do we build the system that will organically provide the smartness?

The Wisdom of Crowds is a book by James Surowiecki describing examples in economics and psychology where the aggregation of independent opinions outperforms each individual opinion. It is obviously a tremendously exciting idea, but it feels so rare. Indeed, for the crowd to become wise, it needs:

  • Diversity of opinion: Each person should have private information even if it’s just an eccentric interpretation of the known facts.
  • Independence: People’s opinions aren’t determined by the opinions of those around them.
  • Decentralization: People are able to specialize and draw on local knowledge.
  • Aggregation: Some mechanism exists for turning private judgments into a collective decision.

Designing a system where crowds have the opportunity to become wise is really hard. Human mimicry very quickly turns crowds into mobs. It is crucial to focus on each of the four points: diversity, independence, decentralization and aggregation.

A team of Data Scientists in an organization can’t keep its long-term independence, and most of the time there will not be any decentralization.
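Why independence matters so much can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model from Surowiecki’s book: the quantity being guessed, the noise levels and the function names are all hypothetical. Each person’s guess is the true value plus an error that everyone shares (herding) plus a private error of their own, and the aggregation mechanism is a plain average.

```python
import random
import statistics

TRUE_VALUE = 100.0  # hypothetical quantity the crowd tries to estimate


def crowd_error(n_people, shared_sd, private_sd, rng):
    """Absolute error of the averaged guess for one simulated crowd.

    shared_sd: size of the error everyone copies from everyone else (herding).
    private_sd: size of each person's own, independent error.
    """
    shared_error = rng.gauss(0.0, shared_sd)
    guesses = [
        TRUE_VALUE + shared_error + rng.gauss(0.0, private_sd)
        for _ in range(n_people)
    ]
    return abs(statistics.fmean(guesses) - TRUE_VALUE)


def mean_crowd_error(trials, n_people, shared_sd, private_sd, seed=0):
    """Average crowd error over many simulated crowds."""
    rng = random.Random(seed)
    return statistics.fmean(
        crowd_error(n_people, shared_sd, private_sd, rng) for _ in range(trials)
    )


if __name__ == "__main__":
    # Each individual is equally noisy in both crowds (16**2 + 12**2 == 20**2).
    independent = mean_crowd_error(200, 100, shared_sd=0.0, private_sd=20.0)
    herded = mean_crowd_error(200, 100, shared_sd=16.0, private_sd=12.0)
    print(f"independent crowd: {independent:.1f}, herded crowd: {herded:.1f}")
```

Averaging cancels private errors but leaves the shared one untouched, so in this toy setup a crowd that copies itself stays about as wrong as a single person, however large it gets.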

So here is my shot at providing a roadmap to develop a Smart City. It is a roadmap, not the roadmap.

Smart City Design Specs

  • Diversity of opinion: Allow any citizen to seize an issue.
  • Independence: Make it easy enough that people trying to do something don’t have to know what others are doing.
  • Decentralization: Any decision should be taken at the lowest level possible.
  • Aggregation: Build a community, develop an ecosystem, bring together people.

The 5-step plan

Step 1. Develop an internal data referential. From the successful projects we see on the ground, this is the moment when someone is designated or hired to be responsible for the project. Call the function Chief Data Officer or whatever; the only thing that matters is that he or she has an extensive view of what data is available and where, what the internal bottlenecks are, and the politics around them. This moment is also crucial to onboard the rest of the team on the philosophy of the project.

Step 2. Open Data. Once you have an internal referential, it becomes quite straightforward to define which datasets are meant to be opened. Actually opening the data takes a little bit of time, but thanks to today’s solutions, it is now a no-brainer. This will be the first contact with the external world. It is really important to reassure the early adopter community about your intentions and the quality of the data you will make available. Opening a small number of datasets, mostly around one or two themes, is not an issue as long as potential reusers are convinced that this is a long-term approach and that you will focus on helping them.
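As a rough sketch of how steps 1 and 2 connect, the referential can be thought of as a simple inventory that is filtered to pick the first datasets to open. Everything here is hypothetical: the `DatasetEntry` fields, the sample inventory and the two selection criteria (no personal data, actively maintained) are illustrative assumptions, not rules taken from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of a hypothetical internal data referential."""
    name: str
    owner: str                    # department that produces the data
    theme: str                    # e.g. "transport", "energy"
    contains_personal_data: bool
    maintained: bool              # someone commits to keeping it up to date


def candidates_to_open(catalog, themes):
    """Datasets on the chosen themes that look safe and sustainable to publish.

    The two filters (no personal data, actively maintained) are illustrative
    assumptions, not criteria taken from the article.
    """
    return [
        entry for entry in catalog
        if entry.theme in themes
        and not entry.contains_personal_data
        and entry.maintained
    ]


# Made-up inventory for illustration.
CATALOG = [
    DatasetEntry("bus_stops", "Mobility", "transport", False, True),
    DatasetEntry("parking_fines", "Police", "transport", True, True),
    DatasetEntry("street_lights", "Public works", "energy", False, True),
    DatasetEntry("old_tree_survey", "Parks", "environment", False, False),
]

if __name__ == "__main__":
    for entry in candidates_to_open(CATALOG, themes={"transport"}):
        print(entry.name)
```

Starting with one or two themes keeps the first release small, which matches the point above: a short list is fine as long as reusers can see it will keep growing.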

Step 3. Build an ecosystem. From our experience this is the most difficult step. Almost ten years later, relying on hackathons no longer works: developers are fed up with them, and they are neither viable nor fun anymore. Building an ecosystem is much harder than building a large audience. The success of that step will be measured more in terms of interactions between people in the ecosystem than in absolute number of people. The most impressive ecosystem I have seen is the one Transport for London built around its Open Data.

Step 4. Build partnerships. Once the ecosystem becomes substantial, it is time to start switching from production to distribution. You have a community that will weigh in when discussing partnerships. It’s time to become a hub: the hub for all data concerning the territory, no matter who produces it. The city will become smart when organizations that work on the same issues in the same place start working together and collaborating with data. That is the whole point of OpenDataSoft’s partnership with Waze, or the Open Data platform shared between two energy operators in France (RTE and GRTGaz).

Step 5. Decentralize the production. It is time to go one step further by distributing the data production. Any app, any citizen, any reuser should be able to work on the data, clean it, enrich it and redistribute it easily. You now operate a distribution network, a data hub, and it is your role to provide them with the infrastructure. It is your role to think the incentives through. It is your role to provide means of action, like TfL partnering with Citymapper to let the latter operate a real bus line in London.

Step 6. Anything else. I cannot stress enough that no centralized entity can provide a true forecast, so I will not tell you here that I know the only road to success. These 5 steps are mostly an empirical analysis from our experience as a data solution vendor at OpenDataSoft, but there must be many amazing ideas to try that are not detailed here.

Focus, replicate, scale

“Dominate a small niche and scale up from there, toward your ambitious long-term vision.” Peter Thiel, Zero to One

If there is something we can learn from Silicon Valley’s method it is probably that quote from Peter Thiel.

As you design and build your Smart City, you should start with what matters most to you (transportation, energy, real estate, local businesses, budget and so on). Focus on that topic, and follow the five steps. Once the system is actually smarter, replicate it on other priorities.

That is the best way to:

  • get to know the power users among your community
  • build strong and agile processes into your organization
  • allocate your resources efficiently

Existence precedes essence, so do not forget that your substance has to precede your brand. Do not focus on communication. Focus on designing a viable, high-performing system. Focus on building strong ties with people and fight for it internally. The brand will follow naturally.

As the designer of a smart system, your role is to leverage tech to build a large ecosystem of people solving their own issues.

“Buy the ticket, take the ride”

Hunter S. Thompson
