Is E-Commerce a catalyst for economic development in less developed countries?

Yada Pruksachatkun
Jun 26, 2016


As an export from Thailand in Silicon Valley, I came here for the immense concentration of startups and startup people that the Valley boasts. However, as I explored the ecosystem, I noticed a trend in the types of conversations that tended to happen. They revolved around the hype of the season (VR and machine learning) and the types of founders YC and Thiel choose. Of course, as any rational international student would (sarcasm here), I naturally started to apply those questions to the international startup ecosystem.

What were the most popular types of startups in other countries? What is the “ideal” founder in each country? Is it possible to find such a founder using data?

After about two weeks of that question nagging in my mind, I decided to do something about it and sat down this afternoon to try to figure it out. Here is a low-down of how that exploration went:

Part 1: A goose-hunt in disguise
The hardest part of this was actually getting the data. After about two hours of trying to find startup data, I was just about ready to write a script to scrape TechCrunch or Crunchbase. Fortunately, at the last second, I found an article by Mode Analytics, a data analysis platform that lets you create reports in SQL and Python. The article contained a dataset scraped from Crunchbase, with fields such as funding raised, startup category, and more for every startup on Crunchbase in each country. Score!

The second part was parsing the data. I used SQL and Python to narrow the dataset down to a few fields: the startup category and the number of startups in each category for each country (summation lambda function). Then, for each country, I picked the category with the highest number of startups (this last step was done in Python).

For all you curious folks, here is the Python code I used, written with Pandas, where df is a dataframe containing the number of startups of each type for each country.

df[df['count'] == df.groupby(['company_country_code'])['count'].transform('max')]
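To make the one-liner above easier to follow, here is a minimal sketch on made-up rows. Only the company_country_code and count column names come from the code above; the category column name and the sample rows are assumptions for illustration, not the actual Crunchbase data.

```python
import pandas as pd

# Hypothetical rows standing in for the Crunchbase export.
# "category" is an assumed column name; the values are invented.
raw = pd.DataFrame({
    "company_country_code": ["THA", "THA", "THA", "USA", "USA", "USA"],
    "category": ["ecommerce", "ecommerce", "software",
                 "biotech", "biotech", "software"],
})

# Step 1: count startups per (country, category) pair.
df = (
    raw.groupby(["company_country_code", "category"])
    .size()
    .reset_index(name="count")
)

# Step 2: transform("max") broadcasts each country's highest count back
# onto every row of that country, so the comparison keeps only the rows
# that hold the top category.
top = df[df["count"] == df.groupby("company_country_code")["count"].transform("max")]
top = top.reset_index(drop=True)

print(top)
```

One thing to keep in mind: if two categories tie for first place in a country, this filter keeps both rows, so a tie-breaking rule would be needed before coloring a map.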

Here’s a snapshot of the CSV that was generated showing the number of startups of each category for each country, ranked by the highest-numbered categories.

So sexy right

Part 2: Figuring out what the heck to do

So after a little bit of exploring with the CSV, a map seemed to be the natural way to display the data, especially when you’re trying to find trends on a global scale. However, that in itself was a challenge. I really didn’t feel like coding this afternoon, so I tried to find a service to create data-backed map visualizations. After another thirty minutes of trial and error, I discovered that, in fact, there wasn’t one (hackathon idea!).
Defeated, I resorted to using Datamaps.js (legit a module made in heaven). That ended up being a good choice, and with a lot more parsing (writing Python scripts to actually do the parsing for me, of course), I ended up with a map in about 40 minutes. Nice. In the process, I also discovered that there was no Python map for converting between ISO codes and country names, so I created one and posted it on GitHub here!
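For the curious, here is a rough sketch of what that kind of parsing script can look like. It is not the script from this post: the three-entry lookup table, the to_datamaps_data helper, and the sample categories are all hypothetical, and it assumes three-letter ISO 3166-1 alpha-3 codes, which is what Datamaps keys its world map on. The fillKey field follows the data format shown in the Datamaps README.

```python
import json

# Tiny illustrative subset; a full table would cover every
# ISO 3166-1 alpha-3 code.
ISO3_TO_NAME = {
    "THA": "Thailand",
    "USA": "United States",
    "COL": "Colombia",
}

# Reverse lookup, for going from a country name back to its code.
NAME_TO_ISO3 = {name: code for code, name in ISO3_TO_NAME.items()}


def to_datamaps_data(top_categories):
    """Turn {iso3_code: category} into the {iso3_code: {"fillKey": ...}}
    shape that Datamaps reads from its `data` option (hypothetical helper)."""
    return {
        code: {
            "fillKey": category,
            # Extra field for hover popups; falls back to the raw code
            # when the country is missing from the lookup table.
            "country": ISO3_TO_NAME.get(code, code),
        }
        for code, category in top_categories.items()
    }


# Made-up input: the top startup category per country.
payload = to_datamaps_data({"THA": "ecommerce", "USA": "biotech"})
print(json.dumps(payload, indent=2))
```

The printed JSON can be pasted into the data option of a Datamaps map, with one color per category defined under fills.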

Here was the end graph:

And the legend:


Some interesting observations: E-Commerce seemed to be prevalent in South America and Asia, that is, in developing countries and emerging economies. The main Western economic players (Australia, Europe, and the US), on the other hand, were not: Europe leaned toward software, biotech, and curated web, while both the US and Australia leaned toward biotech. Also, there aren’t a lot of African startups on Crunchbase, at least not enough to escape the default black color, but the ones that are there are in small business and education (no surprise there).

Of course, the obvious next question is, why is that so? Why is e-commerce so prevalent in emerging economies?

For many developing countries, dropshipping is a very attractive source of revenue. In dropshipping, wholesale manufacturers sell through online shops and handle delivery themselves (as opposed to distribution channels where online shops have to keep and maintain their own inventory). Countries such as China and parts of Latin America are known for their warehouses, which explains the boom in E-Commerce startups. In more developed countries, however, software and biotech are the trend because of the university and facility resources and the talent pool available to companies. Colombia was a surprise at first: why social media? However, plane tickets in Colombia and America are very cheap, meaning that there is a huge influx of tourism. This means high demand for social media outlets where tourists can share their adventures.

My friends Gustavo, J-Lo, and I have a few more theories, but let’s leave that for next time ;) Also for next time, scraping more recent data from Crunchbase for 2016.

If you want the parsed dataset used for the graph, feel free to reach out to me at yadacmis@gmail.com

Until then, salut!


Yada Pruksachatkun

Graduate Student @ NYU CDS, previously @Facebook @MIT Media Lab @FinExploration. I write about machine learning and life. คนเชียงใหม่