Nowadays, it’s common for developers and data analysts to be multi-cloud users, switching between two or more cloud providers in their day-to-day tasks. Amazon Web Services (AWS) has long been the industry leader, controlling a significant portion of the market, but recently there’s been a shift towards other providers, with many users relying on several different solutions at once.

Being a user of both the Google Cloud Platform (GCP) and AWS, I decided to put together an app that translates between the two providers’ CLI commands. I often found myself researching a GCP command equivalent for AWS (and vice versa), and decided that it…

The Adventures of Tintin remains a classic series that brings joy to readers, whether they are picking up one of the books for the first or the umpteenth time. To celebrate this series, we decided to create a map that provides both a time- and location-based illustration of Tintin’s global escapades.

Cloud Data Loss Prevention (DLP) is one of the many cloud security products that Google has to offer, allowing users to mask personally identifiable information (PII) in their data.

Google’s DLP can be used via the GCP console or the API; for the purpose of this article, I will focus on the latter.

If your data includes PII such as email addresses, DLP offers several ways to mask this information. For example, we can replace the PII with hash symbols (#):
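As a purely local illustration of what this kind of masking produces, here is a minimal sketch. It is a regular-expression stand-in and does not call the DLP API; the function name `mask_emails` and the simplified email pattern are assumptions made for this example.

```python
import re

# Simplified email pattern, for illustration only (an assumption of this
# sketch); real PII detection is considerably more thorough.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str, mask_char: str = "#") -> str:
    """Replace every character of each matched email address with mask_char."""
    return EMAIL_PATTERN.sub(lambda m: mask_char * len(m.group(0)), text)


if __name__ == "__main__":
    print(mask_emails("Contact jane.doe@example.com for details."))
```

Each matched address is replaced by a run of `#` characters of the same length, so the surrounding text keeps its shape while the address itself is hidden.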

Empire State Building, New York

Mapbox offers users a simple way to build high-quality, interactive maps, and their integration with React makes this process even easier. With a combination of Mapbox Studio (for designing the map style) and the react-map-gl wrapper (for building the map itself), creating online maps has become a lot of fun.

I came across Mapbox’s flyTo function and decided to leverage it in a map that visualizes Art Deco buildings around the world. …

With the TD Ameritrade API, analyzing stock market data has never been so easy. With just a little setup and a few lines of code, users have access to a whole host of stocks and options data.

Generate a Consumer Key

In order to use the TD Ameritrade API, we need a consumer key. This can be found by accessing TD’s developer website, creating an account, and then requesting a token. Once API access has been granted, you will be provided a consumer key:

While Mapbox movement data is showing a strong decrease in vehicle movement across the globe, one mode of transportation has been rapidly picking up speed — cycling. Given the compelling combination of physical wellbeing and the ability to explore one’s surroundings, potentially with others while still practicing social distancing, it’s no surprise that bike shops across the United States are working to keep up with the newfound demand.

Cloud Functions is a serverless hosting option offered by the Google Cloud Platform (GCP), allowing users to deploy event-driven code without having to manage the underlying hardware.

With so many GCP products available, the question becomes: when should we use Cloud Functions, as opposed to other offerings such as App Engine, Compute Engine, Cloud Run, Kubernetes, etc.?

There are numerous scenarios that would merit using Cloud Functions — one big indicator that it would be helpful is if you want to set up a trigger between different GCP products.
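To make the idea of a trigger concrete, here is a minimal sketch of a Python function written in the background-function style, where the platform passes an `event` dictionary and a `context` object. The function name `describe_upload` is hypothetical, and the sketch assumes a Cloud Storage trigger whose event payload carries `bucket` and `name` fields.

```python
def describe_upload(event, context=None):
    """Report which object a Cloud Storage event refers to.

    `event` is assumed to be the dict payload of a storage trigger;
    `context` (event metadata) is unused in this sketch.
    """
    bucket = event.get("bucket", "<unknown bucket>")
    name = event.get("name", "<unknown object>")
    message = f"New object gs://{bucket}/{name}"
    print(message)  # printed output ends up in the function's logs
    return message  # returned only so the sketch can be checked locally


if __name__ == "__main__":
    describe_upload({"bucket": "example-bucket", "name": "data/file.csv"})
```

Because the handler is an ordinary function, it can be exercised locally with a hand-built dictionary before it is ever deployed.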

Let’s take this example scenario: I want to run an object…

Running large, computationally intensive R scripts is often a very slow process; how can we upgrade to a more powerful server, while keeping costs low?

The Google Cloud Platform (GCP) allows us to use virtual machines (VMs) of various configurations and only pay for what we use. This pricing structure makes it feasible to run a large and intensive script relatively cheaply. And if your workflow is fault-tolerant, preemptible machines can reduce your total expenditure even further.

For the purpose of this article, I will use an n1-standard-16 machine (16 vCPUs and 60 GB of memory), with a Debian…

I’ve been working with Cloud Storage Fuse for over a year now and have found it useful for automating and improving data engineering workflows. As an adapter to the open-source FUSE implementation, Cloud Storage Fuse allows us to mount a GCS bucket as a file system.

This keeps our syntax clean, as instead of multiple lines of code devoted to reading/writing via the GCS client library, we can refer to our files with local paths.
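As a minimal sketch of what that looks like in practice, assume the bucket is already mounted at some local directory. The mount path `/mnt/my-bucket`, the file name, and the two helper functions are all hypothetical and exist only for this example. Nothing GCS-specific is imported: the code uses ordinary file I/O.

```python
import csv
from pathlib import Path
from typing import Dict, List


def write_rows(mount_dir: str, filename: str, rows: List[Dict[str, str]]) -> None:
    """Write rows to a CSV under the mounted directory using plain file I/O."""
    fieldnames = list(rows[0].keys())
    with (Path(mount_dir) / filename).open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_rows(mount_dir: str, filename: str) -> List[Dict[str, str]]:
    """Read a CSV from the mounted directory as a list of dicts."""
    with (Path(mount_dir) / filename).open(newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    mount = "/mnt/my-bucket"  # hypothetical mount point
    if Path(mount).is_dir():
        write_rows(mount, "example.csv", [{"player": "A", "fee": "10"}])
        print(read_rows(mount, "example.csv"))
```

The same functions work unchanged against any local directory, which makes code written this way easy to test without touching a bucket.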

Here’s how to get it set up:

Note: I am using an Ubuntu 18.04 LTS image in my Compute Engine VM; GCS Fuse does…

Transfer fees are a never-ending discussion in soccer, with every summer bringing higher prices and more extravagant deals.

I decided to visualize the rise of transfer fees over time, using an animated bar chart, grouped by position. For this, I relied on matplotlib and two very useful Medium articles by Gabriel Berardi and Pratap Vardhan.

Data was gathered from ewenme’s Git repo, which contains data originally scraped from Transfermarkt. To begin, I merged the individual datasets from the repository:

import os

mytemplist = []
# loop over folders in directory
for folder in os.listdir(os.getcwd()):
    if '.' in folder…

Stefan Gouyet

Data Engineering Freelancer | Cloud enthusiast | New York + SF Bay

