TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Text as Time Series: Arabica 1.0 Brings New Features for Exploratory Text Data Analysis

6 min read · Oct 20, 2022

Photo by Sincerely Media on Unsplash

Introduction

In the real world, text data is frequently collected as a time series. Companies collect product reviews over periods in which the quality of their products may change. Politicians’ public statements vary over the political cycle. Central bankers’ announcements are one of the ways central banks now influence financial markets. In each of these cases, the text has a time dimension and is recorded with a date/time column.
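To make the idea of text recorded with a date column concrete, here is a minimal sketch in plain Python that groups a few headlines by month and counts the most frequent words in each period. The records, the `top_words_by_month` helper, and the `YYYY-MM-DD` date format are all assumptions made for illustration; this is not Arabica's API, and it skips cleaning steps such as stop-word removal.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy records: (date string, headline text).
records = [
    ("2020-03-01", "Virus cases rise in Europe"),
    ("2020-03-15", "Lockdown begins as virus cases rise"),
    ("2020-04-02", "Vaccine trials begin"),
    ("2020-04-20", "Vaccine hopes lift markets"),
]


def top_words_by_month(rows, n=2):
    """Map each "YYYY-MM" period to its n most frequent lowercase words."""
    counts = defaultdict(Counter)
    for date_str, text in rows:
        # Assumes ISO-style dates; other formats need a different pattern.
        month = datetime.strptime(date_str, "%Y-%m-%d").strftime("%Y-%m")
        # Naive whitespace tokenization, with no punctuation or stop-word handling.
        counts[month].update(text.lower().split())
    # Sort by period so the result reads as a time series.
    return {month: counter.most_common(n) for month, counter in sorted(counts.items())}


monthly = top_words_by_month(records)
for month, words in monthly.items():
    print(month, words)
```

Even this small version has to parse dates, pick an aggregation period, and tokenize text, which hints at how quickly the bookkeeping grows on real datasets.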

Exploratory data analysis (EDA) of these datasets is not a trivial coding exercise. Arabica is designed to make it simpler. This article covers the following:

  • New aggregation and cleaning features in Arabica 1.0
  • Real-world applications of time-series text data analysis

We will show what’s new in Arabica 1.0 using a dataset of newspaper headlines from the COVID-19 pandemic.

1. What is Arabica?

Arabica is a Python library for exploratory data analysis specifically designed for time series text data…

Published in TDS Archive

Written by Petr Korab

Python engineer / NLP / data viz. Founder of Text Mining Stories (textminingstories.com)
