Animate the timelapse of 2023

Konbraphat
2 min read · Jan 9, 2024

On 8 Jan 2024, we released a Python library called AnimatedWordCloud.

PyPI: https://pypi.org/project/AnimatedWordCloudTimelapse/
GitHub: https://github.com/konbraphat51/AnimatedWordCloud

As the name suggests, this library makes an animated word cloud from time-lapse data.

Here I will show you how to make an animation of 2023 using the tags of Guardian articles.

If you just want the code right away, run this notebook on Colab: https://gist.github.com/konbraphat51/2f22568479061d5dc5b3f3ef170dd352

1. Fetching Guardian Data

We are using the Guardian article dataset by kiet21042003 on Kaggle Datasets.

You can download it from its Kaggle page (kiet21042003/news-articles-of-the-guardian-112015-23112023), or fetch it with the Kaggle API:

import os
os.environ['KAGGLE_USERNAME'] = "" #@param {type:"string"}
os.environ['KAGGLE_KEY'] = "" #@param {type:"string"}
!kaggle datasets download -d kiet21042003/news-articles-of-the-guardian-112015-23112023

import shutil
shutil.unpack_archive("/content/news-articles-of-the-guardian-112015-23112023.zip", '/content')

2. Clean Data

We want to use the Time column of the data frame (assumed here to be the unpacked CSV loaded into a pandas data frame named df), but the trailing timezone part prevents pandas.to_datetime() from parsing it.

So we cut it off before the datetime conversion.

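import pandas as pd

def rid_timezone(x):
    # drop the last 5 characters, which hold the timezone part
    try:
        return x[:-5]
    except TypeError:
        # non-string values (e.g. missing data) get a placeholder date outside 2023
        return "Thu 1 Jan 2015 23.11"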

df["DateTime"] = df['Time'].apply(rid_timezone)

df["DateTime"] = pd.to_datetime(df['DateTime'], format='%a %d %b %Y %H.%M')
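The fixed five-character slice works only if every Time value ends with a suffix of the same length. As a rough alternative sketch, the timezone token could be removed with a regular expression instead. The sample strings and the helper names strip_timezone and parse_time are assumptions made for illustration, not part of the dataset or the library, and the standard library's strptime stands in for pandas.to_datetime() with the same format string.

```python
import re
from datetime import datetime

# Assumed shape of a Time value: "<weekday> <day> <month> <year> <HH.MM> <TZ>"
_TZ_SUFFIX = re.compile(r"\s+[A-Za-z]{2,5}\s*$")

def strip_timezone(value):
    """Drop a trailing alphabetic timezone token; return None for non-strings."""
    if not isinstance(value, str):
        return None
    return _TZ_SUFFIX.sub("", value.strip())

def parse_time(value):
    """Parse a cleaned Time string, or return None if it was not a string."""
    cleaned = strip_timezone(value)
    if cleaned is None:
        return None
    return datetime.strptime(cleaned, "%a %d %b %Y %H.%M")

# Hypothetical sample values
print(parse_time("Thu 1 Jan 2015 23.11 GMT"))
print(parse_time("Mon 3 Jul 2023 09.30 BST"))
```

Returning None for missing values, rather than a placeholder date, keeps the bad rows visible so they can be dropped explicitly.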

Now we extract the 2023 data:

from datetime import datetime

df_target = df[(df["DateTime"] >= datetime(2023, 1, 1)) & (df["DateTime"] < datetime(2024, 1, 1))]
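The filter above uses a half-open interval: the start is included and the end is excluded. The same idea applies when splitting the year into months, where December needs its end to roll over into the next year. A minimal sketch of that boundary logic follows. The helper name month_bounds is hypothetical and is not part of pandas or AnimatedWordCloud.

```python
from datetime import datetime

def month_bounds(year, month):
    """Return (start, end) so that start <= t < end covers the given month."""
    start = datetime(year, month, 1)
    if month == 12:
        # December ends at the first instant of the following year
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end

start, end = month_bounds(2023, 12)
print(start, end)
```

With such a helper, a monthly mask could be written as (df["DateTime"] >= start) & (df["DateTime"] < end) without a special case at each call site.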

3. Prepare Timelapse Data

The timelapse data needs to be a list of tuples, each pairing a time name with a dictionary of word weights:

[(time_name, {word: weight})]
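To make that shape concrete, here is a small hand-written example together with a rough shape check. The words, weights, and the helper check_timelapse are invented for illustration. The check also assumes that time names are strings and weights are numbers, which matches how this article builds the data but is not confirmed here as a requirement of the library.

```python
# Hypothetical sample data: the time names and words are invented
# purely to illustrate the expected shape.
sample_timelapse = [
    ("1", {"politics": 120, "football": 80}),
    ("2", {"politics": 95, "film": 40}),
]

def check_timelapse(data):
    """Return True if data looks like [(time_name, {word: weight}), ...]."""
    if not isinstance(data, list):
        return False
    for entry in data:
        if not (isinstance(entry, tuple) and len(entry) == 2):
            return False
        time_name, weights = entry
        # assumption: time names are strings and weights are numeric
        if not isinstance(time_name, str) or not isinstance(weights, dict):
            return False
        for word, weight in weights.items():
            if not isinstance(word, str) or not isinstance(weight, (int, float)):
                return False
    return True

print(check_timelapse(sample_timelapse))
```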

import ast

# tags attached to most articles that carry no topical meaning
stopwords = {"news", "features", "The Observer", "reviews"}
timelapse = []

# range(1, 12) covers January to November: the dataset name suggests it ends
# on 23 Nov 2023, so December has no articles. Use range(1, 13) to include it.
for month in range(1, 12):
    word_vector = {}  # word -> weight

    if month < 12:
        df_month = df_target[(df_target["DateTime"] >= datetime(2023, month, 1)) & (df_target["DateTime"] < datetime(2023, month + 1, 1))]
    else:
        # only reached if the range above is extended to December
        df_month = df_target[(df_target["DateTime"] >= datetime(2023, 12, 1)) & (df_target["DateTime"] < datetime(2024, 1, 1))]

    for tags_str in df_month["Tags"]:
        # the raw data is all strings, so convert each one to a Python list
        tags = ast.literal_eval(tags_str)

        for tag in tags:
            if tag in stopwords:
                continue

            # count each tag
            word_vector[tag] = word_vector.get(tag, 0) + 1

    timelapse.append(
        # to tuple
        (
            str(month),  # time name
            word_vector,  # word dictionary
        )
    )

Now the timelapse data is stored in the variable timelapse.
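Before animating, it can help to check that the counts look plausible. The sketch below lists the heaviest words in each frame using collections.Counter. The helper top_tags and the demo data are hypothetical and only mirror the structure described above.

```python
from collections import Counter

def top_tags(timelapse, n=3):
    """Return [(time_name, [(word, weight), ...])] with the n heaviest words per frame."""
    return [
        (time_name, Counter(weights).most_common(n))
        for time_name, weights in timelapse
    ]

# Hypothetical data in the same shape as the real timelapse list
demo = [
    ("1", {"politics": 5, "film": 2, "football": 9}),
    ("2", {"politics": 7, "film": 1}),
]

for time_name, best in top_tags(demo, n=2):
    print(time_name, best)
# 1 [('football', 9), ('politics', 5)]
# 2 [('politics', 7), ('film', 1)]
```

If a generic tag dominates every month, it is probably a candidate for the stopwords set.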

4. Animate

Install AnimatedWordCloudTimelapse from PyPI.

pip install AnimatedWordCloudTimelapse

Then call AnimatedWordCloud.animate():

from AnimatedWordCloud import Config, animate

config = Config(
    output_path="/content",  # for Colab
    min_font_size=15,
    image_width=1000,
    image_height=1000,
)

animate(timelapse, config)

This generates the GIF animation.

If you want to display the animation in your notebook, you can write:

from IPython.display import display, Image

with open('/content/output.gif', 'rb') as f:
    display(Image(data=f.read(), format='png'))

Please give a star to our library!

https://github.com/konbraphat51/AnimatedWordCloud
