
Applied Data Science
Cutting edge data science, machine learning and AI projects

This is part of the Full Stack Data Scientist blog series. I’ve written an introductory blog here, and I’d also recommend reading the Practical Introduction to Docker before working with this post’s tutorial.

  • Web scraping
  • ETL
  • Database management
  • Feature building and data validation

And much more!

What’s Airflow, and why’s it so good?

Airflow is…


…using a Raspberry Pi, Docker, Python and Apache Superset

I’ve recently been having issues with my internet either being really slow, randomly dropping out, or not working at all — sound familiar?

To figure out if it was an issue with the internet service provider, I set up a monitor to routinely run a speed test and log the results.
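As a rough illustration of the "run a test and log the results" idea, here is a minimal sketch that appends one timestamped measurement to a CSV log. It is not the code from the post: `log_speed_test`, `fake_measure`, and the column names are hypothetical, and the real measurement step is deliberately left as a pluggable function.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Callable, Dict, TextIO


def log_speed_test(measure: Callable[[], Dict[str, float]], out: TextIO) -> Dict[str, object]:
    """Run one measurement and append it to `out` as a single CSV row."""
    result = measure()
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "download_mbps": result["download_mbps"],
        "upload_mbps": result["upload_mbps"],
    }
    writer = csv.DictWriter(out, fieldnames=list(row))
    writer.writerow(row)
    return row


def fake_measure() -> Dict[str, float]:
    # Hypothetical stand-in for a real speed test; the numbers are made up.
    return {"download_mbps": 52.3, "upload_mbps": 9.8}


# An in-memory buffer keeps the sketch self-contained; a real monitor
# would open a file in append mode and call this on a schedule.
buffer = io.StringIO()
logged = log_speed_test(fake_measure, buffer)
```

Passing the measurement in as a function keeps the logging logic independent of whichever speed test tool is actually used.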

It was also a perfect opportunity to try out Apache Superset - an open source dashboarding tool with a drag-and-drop interface!

📦 What I needed

✏️ Code — follow the instructions in the README file to run locally

🍓*🥧 A Raspberry Pi 4 Model B with 8GB of RAM connected via ethernet.

…*…


How AI and Nudge Theory can help us nudge our way out of difficult decisions

There are times when technology precedes scientific understanding.

We would have never gotten the steam engine if we didn’t understand how heat transforms into work. We could not have found a vaccine that eradicates smallpox if we did not know how the variola virus works.

But our society, with its political and cultural structure, evolves irrespective of our understanding of it. Scientists, faced with a messy world where financial markets and Artificial Intelligence (AI) emerge seemingly without warning, need to answer all sorts of funny questions:

How rational are humans when making decisions?

Do AI-enabled machines understand emotions?

Where…


An AI agent playing the game ‘Butterfly’, trained using SIMPLE

NEW reinforcement learning Python package SIMPLE — Self-play In MultiPlayer Environments

✏️ The Plan

In November, I set out to write a Python package that can train AI agents to play any board game…🤖 🎲

To be successful, the package had to meet the following objectives:

  1. It will work with any custom board game logic
  2. It will handle multiplayer games
  3. It will start tabula-rasa and learn through self-play

🎯 The Outcome

The output from this project that meets these objectives is called SIMPLE — Self-play In MultiPlayer Environments.

You simply plug in a game file that handles the game logic, hit ‘train’ and wait for it to become superhuman! 🚀


A guide to navigating through a sea of data

When it comes to data relating to COVID-19, we are all at sea.

Head over to Twitter and you will find lifeboats for every path back to shore — conspiracy rafts, government advice rafts, zero-COVID rafts and everything in between. Which one will you choose? Who can you trust?

You may come to the conclusion that none look particularly convincing. In which case, you’re left with no choice but to roll up your sleeves, find some sturdy looking logs and build your own raft. That’s where this guide can help.


Automatically generating reports is useful in a wide range of scenarios, from regularly sharing data within a company or with the public to personal use, such as comparing the performance of different models side by side without having to manually run a Jupyter Notebook ’n’ times.

In my most recent project, I wanted to be able to train several models and then calculate a set of metrics and draw result exploration plots for each. I started by building a notebook with a menu at the top which would allow me to select one of the models I had…


Photo by Marius Ciocirlan on Unsplash

At the height of the lockdown restrictions, close to 9,000,000 workers were put on the Coronavirus Job Retention Scheme (see plot below). This has meant that these people have had a lot of paid ‘free’ time, but not without drawbacks, as this also meant a lower income and uncertainty about the future. At the same time, around 750,000 jobs have been lost throughout.

Could it be that this drove people to start new businesses? Maybe this allowed a lot of people time and opportunity to give their dream business a shot? …


Photo by Aron Visuals on Unsplash

Nowadays, a single problem can be solved in any number of ways, so how do we decide which solution is better? Different algorithms can solve the same problem, so we need to compare them and choose the best one.

What is an algorithm?

An algorithm is a set of instructions designed to produce a required output. Many different algorithms can give the same output.

To perform these instructions, a computer needs memory, and it also needs time to carry them out.
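To make the point about different algorithms giving the same output concrete, here is a small sketch of our own (the function names `sum_loop` and `sum_formula` are illustrative, not from the post). Both compute the sum of the integers from 1 to n, but the first does work proportional to n while the second does a fixed amount of arithmetic.

```python
def sum_loop(n: int) -> int:
    """Add the numbers 1..n one at a time: n additions."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n: int) -> int:
    """Use the closed form n(n + 1) / 2: a constant number of operations."""
    return n * (n + 1) // 2


print(sum_loop(1000), sum_formula(1000))  # prints "500500 500500"
```

The outputs match for every n, so the choice between the two comes down to how much time and memory each one needs as n grows.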

What is Time Complexity?

The amount of time it takes to run the…


Photo by Roman Koester on Unsplash

Since lockdown started, it seems everyone in the UK (and probably the world) has tried to get their hands on a bicycle. Bicycle sales have increased by around 677% compared to last year, according to this Forbes article. This has led to retailers facing stocking issues. I was able to confirm this when I decided to buy a new bike two weeks ago, and basically every model in every shop was sold out.

Having found the only bike available in my budget I decided to get it before someone else could buy it. The retailer promised to have it…
