String processing made easy with these simple pandas functions

Image by ciggy1 from Pixabay

Introduction

If you have been using the pandas library in Python, you may have noticed that a lot of data comes in textual form rather than as pure numbers, as some people might imagine.

This means there is a need to clean and preprocess strings so they can be analyzed, consumed by algorithms, or shown to the public. Luckily, the pandas library has a dedicated part that deals with string processing.
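As a rough illustration of what this kind of string cleaning can look like, here is a minimal sketch that uses the .str accessor of a pandas Series. The sample values and variable names are made up for this example and are not taken from the article.

```python
import pandas as pd

# Hypothetical sample data with stray whitespace, mixed case and a missing value
names = pd.Series(["  Alice ", "BOB", None, "carol"])

# Vectorized string methods live under the .str accessor;
# missing values are passed through as missing
cleaned = names.str.strip().str.lower()

# Boolean mask of entries containing the letter "o";
# na=False treats missing values as "no match"
has_o = cleaned.str.contains("o", na=False)

print(cleaned)
print(has_o)
```

Chaining the calls keeps each cleaning step readable, and the missing entry does not need any special handling along the way.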

In this article, we will walk you through this part of the pandas library and show you the most useful pandas string processing functions. You will learn how to use:


Cloud Computing

Learn about the benefits of data lakes and how to set them up quickly with AWS Lake Formation

Image by Walkerssk from Pixabay

Introduction

Every day, big and small companies collect more and more data. Enterprises typically gather data about their operations, clients, competition, products, etc. They need to store, process, and analyze all this information efficiently.

The traditional solution of setting up warehouses and databases is simply not up to the task of satisfying companies’ needs when they deal with very large amounts of data. These solutions also do not facilitate the use of analytics or machine learning techniques, which have become very popular in recent years.

The problems with traditional warehouses initially led to the development of cloud storage…


Machine Learning

Image by Gerd Altmann from Pixabay

Introduction

This article will teach you how to set up a Machine Learning project with DagsHub, a new tool designed to ease collaboration on Data Science projects that require data versioning and model building.

Today, we will talk about:

The Problem with Traditional Git and DVC Setup

Most of the collaboration…


… with Jupyter Notebooks Themes

Image by Pexels from Pixabay

Introduction

Jupyter Notebook is a great programming environment and often the most popular choice for data scientists and data analysts who code in Python. Unfortunately, its default settings do not allow the level of customization that you have with standard programming environments such as PyCharm or similar tools.

Jupyter Notebook themes try to close this gap and let you make the notebook a bit prettier and more functional. In this article, I will walk you through the installation process of Jupyter Notebook themes and show you some of their most important features.

Installation


Install these Jupyter Notebook extensions to boost your productivity.

Image by Bessi from Pixabay

Introduction

This article is a follow-up to my previous article, which introduced you to Jupyter Notebook extensions and presented my favorite ones. If you have not read it yet, you should probably start there, as it covers the basics:

1) what Jupyter Notebook extensions are,

2) how to install them,

3) and some basic but very useful extensions.

The list of extensions demonstrated in my previous article is not exhaustive, so the goal of the current article is to add a few more extensions that are more complex but still very beneficial.


And use them in Google Colab…

Image by Speedy McVroom from Pixabay

Introduction

In this post, I will explain how to add large data sets to Google Drive so they can be accessed from Google Colab for processing and modeling.

Whereas uploading a single file can be done with the drag-and-drop interface of Google Drive, it becomes more difficult with a large number of files. Dragging a whole folder containing 1 GB of files simply fails and freezes Google Drive. The alternative is to drag a zipped folder. …
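One possible way to produce such a zipped folder is with Python's standard library. The sketch below is only an illustration under assumptions: the folder name my_dataset and the sample files are hypothetical, and a temporary directory stands in for a real local data folder.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Hypothetical stand-in for a local data folder with many files
workdir = Path(tempfile.mkdtemp())
data_dir = workdir / "my_dataset"
data_dir.mkdir()
for i in range(3):
    (data_dir / f"file_{i}.txt").write_text(f"sample content {i}")

# Pack the whole folder into a single archive next to it: my_dataset.zip
archive_path = shutil.make_archive(
    base_name=str(workdir / "my_dataset"),
    format="zip",
    root_dir=str(data_dir),
)

# List the archive contents to confirm every file was included
with zipfile.ZipFile(archive_path) as archive:
    archived_names = sorted(archive.namelist())

print(archive_path)
print(archived_names)
```

The resulting single .zip file can then be uploaded in one piece instead of dragging many individual files.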


And why you should start using them

Image by Alexandr Ivanov from Pixabay

What are f-strings?

In this article, we will talk about f-strings and their advantages over traditional string formatting in Python.

F-strings were introduced in Python 3.6 and allow for easier and more convenient formatting. The syntax for defining an f-string is almost identical to defining a regular string. You use quotes but precede the opening quote with a lowercase ‘f’ or an uppercase ‘F’. Below are a few examples.

As with normal string definition, you can use single quotes:

my_first_f_string = f'I am going to use f-strings from now on.'

Or double quotes:

f_string_with_double_qoutes =…
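To give a sense of why f-strings can be more convenient than older formatting, here is a minimal sketch that builds the same sentence with an f-string and with str.format. The variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical values for illustration
name = "Ana"
items = 3
price = 4.5

# Expressions inside curly braces are evaluated in place;
# :.2f formats the result with two decimal places
message = f"{name} bought {items} items for {items * price:.2f} in total."

# The uppercase prefix behaves the same way
message_upper_prefix = F"{name} bought {items} items for {items * price:.2f} in total."

# The same sentence with str.format, for comparison
message_format = "{} bought {} items for {:.2f} in total.".format(
    name, items, items * price
)

print(message)
print(message == message_format)
```

With the f-string, each value sits exactly where it appears in the sentence, so there is no need to match placeholders to arguments by position.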


Boost your productivity by learning the most useful commands

What are the magic commands?

Magic commands are special commands that can help you with running and analyzing data in your notebook. They add special functionality that is not straightforward to achieve with Python code or the Jupyter Notebook interface.

Magic commands are easy to spot within the code. They are preceded either by % if they apply to a single line of code or by %% if they apply to several lines.

In this article, I am going to list the magic commands that are used most often and show practical examples of how to take advantage of…


Level up your pandas knowledge with these four functions

Image by AD_Images from Pixabay

Introduction

The pandas library is probably the most popular package for performing data analysis with Python. There are a lot of tutorials that go through the basic pandas functions, but in this article I would like to share some pandas functions that are a bit less known but can be very handy for everyday data analysis tasks.

Let’s get started.

Load data

Before we get to know the functions chosen for this article, we need to load some data. We will work on a data frame that we will create with the following code.

import pandas as pd
client_dictionary = {'name': ['Michael', 'Ana'…


Learn this easy and simple technique to tune your Machine Learning models

Image by PollyDot from Pixabay

Introduction

Once you have built a machine learning model, you will want to tune its parameters for optimal performance. The best parameters differ for each data set, so they need adjusting for the algorithm to reach its full potential.

I have seen many beginner data scientists doing parameter tuning by hand. This means running the model, then changing one or multiple parameters within the notebook, waiting for the model to run, gathering results, and then repeating the process again and again. …

Magdalena Konkiewicz

Data Scientist, NLP and ML enthusiast and educator. Blogging from: aboutdatablog.com
