Price, Date and IBAN Data Extraction with Python

In this post, I show handy Python libraries for extracting and processing information such as prices, dates, and IBANs. This kind of data is hard to process by hand, but with the proper libraries it becomes simple.

Andrej Baranovskij
Katana ML


It may look like a simple task to parse dates, currencies, and IBANs. But think for a moment about all the different combinations, locales, and formats: parsing dates in US or German format, or extracting decimal values from prices in EUR, USD, or Rupees. A task that looks simple at first can get really messy.
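To make the ambiguity concrete, here is a minimal sketch that uses only the standard library. The input string is made up for illustration. It shows how the same text yields two different dates depending on whether we assume month-first (US) or day-first (German-style) order:

```python
from datetime import datetime

# Hypothetical input: the same string is valid under both conventions
raw = "02/05/2020"

# US convention: month/day/year
us = datetime.strptime(raw, "%m/%d/%Y")
# Day-first order, as used in Germany (written with slashes here for comparison)
de = datetime.strptime(raw, "%d/%m/%Y")

print(us.date())  # 2020-02-05
print(de.date())  # 2020-05-02
```

With hand-written format strings, we would need one rule per locale and per separator, which is exactly the work a dedicated library takes over.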

Luckily, there are Python libraries that we can use instead of coding all of the rules ourselves.

This is a part of data preparation, essential for any Machine Learning application.

Date parsing

Recommended library — dateparser

In this example, we parse a date in German format. We can give the library a hint about the language of the date through the languages parameter:

import dateparser

d = dateparser.parse('2.Mai 2020', languages=['de'])

The result looks great:

2020-05-02 00:00:00

We can also try passing an invalid date to the library:
