Text as Time Series: Arabica 1.0 Brings New Features for Exploratory Text Data Analysis
Arabica 1.0 improves time series text data analysis with an extended set of features
Introduction
In the real world, text data is frequently collected as a time series. Companies collect product reviews over periods in which the quality of their products may change. Politicians’ public statements vary over the political cycle. Central bankers’ announcements are one of the ways central banks influence financial markets. In all of these cases, the text has a time dimension and is recorded with a date/time column.
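To make the idea of text with a time dimension concrete, here is a minimal sketch in plain pandas. It does not use Arabica's API, and the toy headlines and column names (`date`, `headline`, `month`, `word`, `count`) are made up for illustration. It counts how often each word appears per month:

```python
import pandas as pd

# Hypothetical toy data: a date/time column next to a text column.
data = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-15", "2020-04-02", "2020-04-20"],
    "headline": [
        "virus spreads across europe",
        "lockdown begins as virus spreads",
        "lockdown extended in europe",
        "vaccine trials begin",
    ],
})
data["date"] = pd.to_datetime(data["date"])

# Tag each headline with its month, then split it into one row per word.
words = data.assign(
    month=data["date"].dt.to_period("M"),
    word=data["headline"].str.split(),
).explode("word")

# Word frequencies aggregated by month.
monthly_counts = (
    words.groupby(["month", "word"])
    .size()
    .rename("count")
    .reset_index()
)

print(monthly_counts)
```

Even this small example skips stop-word removal, punctuation handling, and n-grams, which is where the manual approach starts to grow.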
Exploratory data analysis (EDA) of these datasets is not a trivial coding exercise. This is where Arabica comes in to make things simpler. This article covers the following:
- New aggregation and cleaning features in Arabica 1.0
- Real-world applications of time-series text data analysis
We will show what’s new in Arabica 1.0 using a dataset of newspaper headlines from the COVID-19 pandemic.
1. What is Arabica?
Arabica is a Python library for exploratory data analysis specifically designed for time series text data…