Figure 1: Combined wealth of the poorest 50% of the population, of the middle class (40th–60th percentile), and of the top 0.001%. All data was collected from the Wealth Inequality Database (

It’s 1984 in the United States and Reagan has just won his second term. Mark Zuckerberg, who will one day become the third richest person in the world, is about to see the sun for the first time. The economy is booming and unemployment is finally falling. In that year, the poorest 50% of the population had a combined wealth of $600 billion, the middle class owned wealth valued at $1.5 trillion, and the top 0.001% held $358 billion (Figure 1).

Or why Apple pays 8% taxes on foreign profits and accumulates due tax payments offshore indefinitely.

The release of the Paradise Papers has again put Apple in the spotlight for tax avoidance, showing that Apple shifted its profits from Ireland to Jersey. Apple has since replied with the tax mantra: “Apple pays every dollar it owes in every country around the world.” Contrary to the norm, this time they have provided a detailed explanation:

Their response can be summarized in four points:

  • Apple is the largest taxpayer in the world (for instance 7% of all the corporate tax…

Or why political parties don’t make sense anymore.

I have to admit that the first time I heard the question “Why do we have political parties?” I started babbling “bec… bec… because it has always been like that” (this is almost invariably a stupid answer). After reading and asking people, I now know that it is because “citizens cannot know all the politicians, but they know the ideas of the political parties, thus it is a way to simplify the voting process by giving them a brand to identify the candidates”.

However, the first representative democracies were not intended to be partisan. Political parties emerged from the differences…

Bayes’ theorem describes the probability of an event, taking into account the conditions that can affect it. For example: 1 in 250 people is HIV+, and the probability of testing positive if you are healthy is around 1%. Then the probability of actually being HIV+ if you test positive is only about 30%. This seems counter-intuitive. The intuition is that there are so many more healthy people than sick people that most positive results are false positives. But I’ll explain Bayes’ theorem with drawings.
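The HIV numbers above can be checked with a few lines of Python. This is a minimal sketch, not the post's own code: the function name `posterior` and its parameters are made up for illustration, and it assumes the test always detects a real infection (sensitivity of 100%), which the text does not state.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Probability of being sick given a positive test (Bayes' theorem)."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# 1 in 250 people is HIV+; 1% of healthy people test positive.
# Assumption: sensitivity = 1.0 (every infected person tests positive).
p = posterior(prior=1 / 250, sensitivity=1.0, false_positive_rate=0.01)
print(f"P(HIV+ | positive test) = {p:.1%}")
```

Under these assumptions the result is just under 30%, in line with the figure quoted above. Lowering the sensitivity would lower it slightly further.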

Let’s use the following example, where 1 in 10 people are sick…

Summary for Spain: More job insecurity (15% of Spaniards earn less than the minimum wage). Higher productivity than the European average. The booming sectors are manufacturing, construction and hospitality.

Evolution of the number of hours worked per year

Indeed, the number of hours worked is increasing.

Employment over the last year by sector

The top 3 sectors: manufacturing, construction and hospitality.

For transport connections, we calculated the travel time by public transport (bus, metro, train) using Google Maps [1] from every point in Madrid to the 10 destinations that most people travel to [2]. To filter by population density, we extracted housing supply data from idealista [3]. The code will be available here in the coming days.

For an analysis of the best areas to live in, click here.


Areas to improve: We can improve transport connections by investing in poorly connected areas with high population density.
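The idea of flagging poorly connected, densely populated areas can be sketched as a simple filter. Everything in this example is hypothetical: the zone names, the travel times, the listing counts and both thresholds are invented for illustration, and the number of housing listings is used as a rough proxy for population density. It is not the code behind the maps.

```python
from statistics import mean

# Hypothetical sample data: minutes by public transport from each zone to a
# few frequent destinations, and the number of housing listings in the zone
# (a rough stand-in for population density).
ZONES = {
    "zone_a": {"minutes": [20, 25, 30], "listings": 400},
    "zone_b": {"minutes": [50, 60, 70], "listings": 350},
    "zone_c": {"minutes": [55, 65, 75], "listings": 20},
}


def zones_to_improve(zones, max_minutes=45, min_listings=100):
    """Return zones that are slow to reach and have many listings."""
    return [
        name
        for name, data in zones.items()
        if mean(data["minutes"]) > max_minutes
        and data["listings"] >= min_listings
    ]


print(zones_to_improve(ZONES))
```

With this toy data only `zone_b` is flagged: `zone_a` is well connected, and `zone_c` is badly connected but has few listings.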

Map 1 shows how, in general, the areas…

A visualization of transport connections, prices and housing supply in Madrid. For transport connections, we calculated the travel time by public transport (bus, metro, train) using Google Maps [1] from every point in Madrid to the 10 destinations that most people travel to [2]. We extracted the price and housing supply data from idealista [3]. The code will be available here in the coming days.

Rental price

Prices are per dwelling; if we counted price per square meter, the center would be even more expensive and the outskirts even cheaper.

Housing supply

My mental scheme.

In Spanish, the letters “b” and “v” are pronounced identically, like the English “b”. The English “v”, however, is different. To learn to pronounce it, I mixed a “b” and an “f”, and I was happy.

When I was in the Netherlands, I realized that Dutch is even more complicated. Dutch “b” and “f” are pronounced as in English, but Dutch “w” is pronounced like a mix of English “b” and “v”, and Dutch “v” is pronounced like a mix of English “v” and “f”.

The visualization is below. I see the sounds between “b” and “f” as a continuum, and you can place sounds wherever you want. Linguistics is interesting.

Machine Learning to compare and join heterogeneous data from heterogeneous sources.

You can get the code by clicking here.

Merging databases is a complicated business: names sometimes differ even when they refer to the same thing (see examples below). This is caused by spelling mistakes, different conventions, etc.

Here I merge two databases of movie/show records to compare different string matching methods. I find that machine learning performs better than any single algorithm by itself, although some of them are pretty close. The data itself is very noisy and therefore difficult to characterize. Some examples:

B-Side vs. B-Sides: Match
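To make the idea of a string matching method concrete, here is a minimal sketch of a single similarity-based matcher using `difflib.SequenceMatcher` from the Python standard library. It is not necessarily one of the methods compared in the post, and the helper names and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two titles, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two titles as the same record if they are similar enough.

    The threshold is an arbitrary value chosen for this example.
    """
    return similarity(a, b) >= threshold


print(is_match("B-Side", "B-Sides"))     # near-identical titles
print(is_match("B-Side", "Casablanca"))  # unrelated titles
```

A machine learning approach like the one described above could use several such similarity scores as input features rather than relying on one fixed threshold.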


Sentiment analysis of the news in ELPAIS.

Figure 1. Positivity of languages across different media. The positivity value of each word is the average of what hundreds of native speakers think of that word. Dodds et al., PNAS 2015.

We are going to study the shift in tone of the news published by ELPAIS about four political parties (PP, PSOE, Cs and Podemos) over the last four years.

To do so, I will use “sentiment analysis”. Basically, we assign a positivity value to each word (Dodds et al., 2013) and measure the positivity of a news article from the words that appear in it. Although 5 is neutral, all languages tend to favor positive words, especially Spanish (Dodds et al., 2015 and Fig. 1).
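The word-scoring approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the lexicon below is a made-up toy dictionary (not the real word scores from Dodds et al.), and averaging the scores of the matched words is one plausible way to combine them, not necessarily the exact measure used in the analysis.

```python
import re

# Hypothetical toy lexicon with invented scores; 5 is neutral.
TOY_LEXICON = {
    "good": 7.5,
    "agreement": 6.5,
    "crisis": 2.5,
    "scandal": 2.0,
}


def article_positivity(text, lexicon):
    """Average positivity of the words in `text` that appear in `lexicon`.

    Returns None when no word of the text has a score.
    """
    words = re.findall(r"\w+", text.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


print(article_positivity("Good agreement after the crisis", TOY_LEXICON))
```

Words missing from the lexicon are simply ignored, so the score depends only on the words that have a known positivity value.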

Javier GB

Data Juggler. Computational Scientist.
