This is part of the Full Stack Data Scientist blog series. I’ve written an introductory blog here, and I’d also recommend reading the Practical Introduction to Docker before working with this post’s tutorial.
Building end-to-end data science solutions means developing data collection, feature engineering, model building and model serving processes. It’s a lot of stuff to stay on top of, right? To keep myself sane, I use Airflow to automate tasks with simple, reusable pieces of code for frequently repeated elements of projects, for example:
And much more!
I’ve recently been having issues with my internet either being really slow, randomly dropping out, or not working at all — sound familiar?
To figure out if it was an issue with the internet service provider, I set up a monitor to routinely run a speed test and log the results.
It was also a perfect opportunity to try out Apache Superset - an open source dashboarding tool with a drag-and-drop interface!
✏️ Code — follow the instructions in the README file to run locally
🍓*🥧 A Raspberry Pi 4 Model B with 8GB of RAM connected via ethernet.
There are times when technology precedes scientific understanding.
We would have never gotten the steam engine if we didn’t understand how heat transforms into work. We could not have found a vaccine that eradicates smallpox if we did not know how the variola virus works.
But our society, with its political and cultural structure, evolves irrespectively of our understanding of it. Scientists, faced with a messy world where financial markets and Artificial Intelligence (AI) emerge seemingly without a warning, need to answer all sorts of funny questions:
How rational are humans when making decisions?
Do AI-enabled machines understand emotions?
In November, I set out to write a Python package that can train AI agents to play any board game…🤖 🎲
To be successful, the package had to meet the following objectives:
The output from this project that meets these objectives is called SIMPLE — Self-play In MultiPlayer Environments.
You simply plug in a game file that handles the game logic, hit ‘train’ and wait for it to become superhuman! 🚀
When it comes to data relating to COVID-19, we are all at sea.
Head over to Twitter and you will find lifeboats for every path back to shore — conspiracy rafts, government advice rafts, zero-COVID rafts and everything in between. Which one will you choose? Who can you trust?
You may come to the conclusion that none look particularly convincing. In which case, you’re left with no choice but to roll up your sleeves, find some sturdy looking logs and build your own raft. That’s where this guide can help.
Automatically generating reports is useful in a wide range of scenarios, from regularly sharing data within a company or the public, or for personal use, such as comparing the performance of different models side by side without having to manually run a Jupyter Notebook ’n’ number of times.
In my most recent project, I wanted to be able to train several models and then calculate a set of metrics and draw result exploration plots for each. I started by building a notebook with a menu at the top which would allow me to select one of the models I had…
At the height of the lockdown restrictions, close to 9,000,000 workers were put on the Coronavirus Job Retention Scheme (see plot below). This has meant that these people have had a lot of paid ‘free’ time, but not without drawbacks, as this also meant a lower income and uncertainty about the future. At the same time, around 750,000 jobs have been lost throughout.
Could it be that this drove people to start new businesses? Maybe this allowed a lot of people time and opportunity to give their dream business a shot? …
Now-a-days, for one problem we can write the solution in n number of ways, but, how can we decide which type is better. We can use different types of algorithms to solve one problem. We need to compare these algorithms and have to choose the best one to solve the problem.
An algorithm is a set of instructions, which are created to get the required output. Many different algorithms can give same output.
To perform these instructions a computer should have memory, and it also requires time to perform those actions.
The amount of time it takes to run the…
Since lockdown started, it seems everyone in the UK (and probably the world) have tried to get their hands on a bicycle. Bicycle sales have increased by around 677% compared to last year, according to this Forbes article. This has thus led to retailers facing stocking issues. I was able to confirm this when I decided to buy a new bike two weeks ago, and basically every model in every shop is sold out.
Having found the only bike available in my budget I decided to get it before someone else could buy it. The retailer promised to have it…