Decoding Jacinda Ardern — (Analysis using Python)

Kaarthikandavar
5 min read · Jun 19, 2021

Before we dive in: this article analyzes the news surrounding New Zealand and PM Jacinda Ardern during the pandemic year (2020).

Well, who is Jacinda Ardern?

Yes, the whole world looked to this incredible leader for the effective plans and strong measures that made her country, New Zealand, a COVID-free nation.

Rewinding to March 2020: that is when the first COVID case appeared in New Zealand, and it took just under two months to bring the number of cases down to ZERO. If this isn't astonishing, what is?
Two things stood out in this success formula:
> Quick action
> No compromise on taking those actions

Like the guy in the previous meme, I too wanted to know the answers to what? and how? :)

Being a Data Analyst myself, I wanted to base the analysis on data rather than give a very generalized idea of what happened.

The data comes from GDELT (Global Database of Events, Language, and Tone). Supported by Google Jigsaw, the GDELT Project monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages. It identifies the people, locations, organizations, themes, sources, emotions, counts, quotes, images and events driving our global society every second of every day, creating a free open platform for computing on the entire world.

The required data was scraped from GDELT GKG 2.0, which contains more than 500 features (date, text, news website, person, sentiment scores, etc.). I filtered it down to the features required for the analysis.

Filtered columns

The location was then filtered to New Zealand, and the persons field to entries containing 'Jacinda Ardern'. The image below shows the first 5 rows of the data.

first 5 rows

We can see that 'source_name' contains the full domain. To extract the top-level part (everything after the first dot), I used:

df['domain']=df['source_name'].apply(lambda x: '.'.join(str(x).split('.')[1:]))
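
To make the one-liner concrete, here is a minimal plain-Python sketch of the same string logic. The function name domain_suffix and the sample hostnames are my own illustrations, not values taken from the dataset.

```python
def domain_suffix(source_name):
    """Drop the first dot-separated label, mirroring the lambda above."""
    return '.'.join(str(source_name).split('.')[1:])

# Hypothetical hostnames, for illustration only
for host in ['nzherald.co.nz', 'example.com', 'www.stuff.co.nz']:
    print(host, '->', domain_suffix(host))
```

Note that a name with a leading 'www.' keeps the site name ('www.stuff.co.nz' becomes 'stuff.co.nz'), so this approach works best when source_name holds bare domains.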

Now, let's see where most of the news about New Zealand is coming from:

import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots(figsize=(12, 10))
# Top 20 domains by article count (value_counts sorts in descending order)
top_domains = df['domain'].value_counts().index[:20]
sns.countplot(y='domain', data=df, order=top_domains, ax=ax)
ax.set_title(' Domain - (2020) All NZ NEWS ', fontsize=16, loc='center', fontdict=dict(weight='bold'))
# Write the article count at the end of each bar
for p in ax.patches:
    width = p.get_width()
    ax.text(width + 1,
            p.get_y() + p.get_height() / 2,
            '{:1.0f}'.format(width),
            ha='left',
            va='center')
plt.savefig('images/domain.png')

Well, it is no surprise that Australia and the UK talk about NZ more than others do.

We have a feature named 'tone', which ranges from negative to positive values and indicates the sentiment of the news article, so I created a feature 'Sentiment' from it.

df['Sentiment'] = df['tone'].apply(lambda x: "Positive" if x>0 else ("Neutral" if x==0 else "Negative"))
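
The nested conditional is easier to read when spelled out as a function. This is a sketch of the same rule in plain Python; the name label_sentiment and the sample tone values are made up for illustration.

```python
def label_sentiment(tone):
    """Map a numeric tone score to a coarse label, as in the lambda above."""
    if tone > 0:
        return "Positive"
    if tone == 0:
        return "Neutral"
    return "Negative"

# Made-up tone values, for illustration only
for tone in [2.5, 0, -3.1]:
    print(tone, '->', label_sentiment(tone))
```

One thing to watch: a missing tone (NaN) fails both comparisons and so falls through to "Negative", both here and in the lambda, so it may be worth dropping rows without a tone first.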

Now, let's see how these domains were divided based on sentiment.

More negativity. It could be because of other topics, or maybe that's just how news channels highlight negativity :(

This plot shows how many of those articles mention the PM. We can now drill down and see how different news channels mentioned the PM and what sentiments they were spreading.

ANALYSIS OF NEWS CHANNELS

nzherald is the leader here, publishing the largest number of articles about NZ, but let's dive in and see how they all fare on sentiment.

fig, ax = plt.subplots(figsize=(12, 10))
# Keep only the articles that mention the PM
pm = df.query('jacinda=="Jacinda Ardern"')
# Top 10 sources by article count
top_sources = pm['source_name'].value_counts().index[:10]
sns.countplot(y='source_name', data=pm, order=top_sources, hue='Sentiment', ax=ax)
ax.set_title('Top News Channels mentioning Jacinda and their sentiments in their article', fontsize=16, loc='center', fontdict=dict(weight='bold'))
ax.legend(title='Sentiment')

def barPerc(df, xVar, ax):
    # Label each bar with its share of that source's articles
    numX = 10  # number of sources plotted
    bars = ax.patches
    for ind in range(numX):
        # Every numX-th patch belongs to the same source (one bar per sentiment)
        hueBars = bars[ind:][::numX]
        total = sum([x.get_width() for x in hueBars])
        for bar in hueBars:
            width = bar.get_width()
            ax.text(width + 2,
                    bar.get_y() + bar.get_height() / 2,
                    f'{width/total:.0%}',
                    ha="left", va="center")

barPerc(pm, pm.source_name, ax)
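
The percentages on the bars can also be cross-checked without a plot. Below is a small plain-Python sketch that computes, for each source, the share of articles per sentiment. The function name sentiment_shares and the sample rows are hypothetical; they are not taken from the GDELT data.

```python
from collections import Counter, defaultdict

def sentiment_shares(rows):
    """Return {source: {sentiment: share}} from (source, sentiment) pairs."""
    counts = defaultdict(Counter)
    for source, sentiment in rows:
        counts[source][sentiment] += 1
    shares = {}
    for source, by_sentiment in counts.items():
        total = sum(by_sentiment.values())
        shares[source] = {s: n / total for s, n in by_sentiment.items()}
    return shares

# Made-up rows, for illustration only
rows = [
    ('site-a', 'Negative'), ('site-a', 'Negative'),
    ('site-a', 'Positive'), ('site-a', 'Neutral'),
    ('site-b', 'Positive'), ('site-b', 'Negative'),
]
shares = sentiment_shares(rows)
for source, by_sentiment in shares.items():
    print(source, {s: f'{p:.0%}' for s, p in by_sentiment.items()})
```

Computing the shares from counts like this does not depend on the order in which the plotting library draws its bars, which makes it a handy sanity check for the labels.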

Sadly, the question here is: how negative?! Apart from scoop, the share of negative articles for most of the others is simply 'HUUUUGE'. What's the reason? Is it that people are more eager to click on negative news, which pushes these channels toward negatively attractive headlines, or do these channels spread negative politics just to get attention?

Looks like NEGATIVE > POSITIVE you know what I mean 😐

One thing I like about GDELT is its 'themes', which summarize an article and match it to a topic. At present, GDELT has more than 50,000 themes in its dictionary.

ANALYSIS ON THEMES

# Mean tone, positive and negative scores per theme for NZ news, January to March
df2 = (
    df1[(df1.d_country == 'NZ') & (df1.month.between(1, 3))]
    .groupby('themes')
    .agg({'tone': 'mean', 'pos': 'mean', 'neg': 'mean'})
    .reset_index()
    .sort_values(by='tone')
    .reset_index(drop=True)
)

From those 50,000, I filtered the topics of interest: those related to 'Government' and 'Coronavirus'.
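
The filtering code itself is not shown in this article, so here is a hedged sketch of one way it could be done: keep the rows whose theme name contains a keyword. The function name filter_themes, the theme names and the scores are all made up for illustration, and treating the result as the c used in the plot below is my assumption.

```python
def filter_themes(rows, keywords):
    """Keep rows whose theme name contains any keyword (case-insensitive)."""
    keywords = [k.upper() for k in keywords]
    return [r for r in rows if any(k in r['themes'].upper() for k in keywords)]

# Made-up theme names and scores, for illustration only
rows = [
    {'themes': 'GENERAL_GOVERNMENT', 'tone': -2.0, 'pos': 1.5, 'neg': 3.5},
    {'themes': 'TAX_DISEASE_CORONAVIRUS', 'tone': -1.0, 'pos': 2.0, 'neg': 3.0},
    {'themes': 'TOURISM', 'tone': 1.0, 'pos': 3.0, 'neg': 2.0},
]
c = filter_themes(rows, ['Government', 'Coronavirus'])
print([r['themes'] for r in c])
```

With a pandas DataFrame, the same idea would be a boolean mask on the 'themes' column built from a case-insensitive substring match.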

plt.figure(figsize=(20,10))
sns.scatterplot(data= c, x='pos',y='neg', s=100)
plt.title('Themes related to Jacinda in the first 3 months of 2020 (only NZ news)',fontsize=20)
plt.xlabel('Positive score',fontsize=16)
plt.ylabel('Negative score',fontsize=16)
for i in range(len(c)):
    plt.text(s=c.iloc[i, 0], x=c.iloc[i, 2] + 0.01, y=c.iloc[i, 3] + 0.01, fontsize=10)
plt.show()
Pre-Covid statistics

The x-axis here is the positive tone and the y-axis is the negative tone. Now you know the answer to 'Why negative?': government-related news is the most negative, more so than virus-related news. But let us see how that changed once COVID arrived.

During-Covid statistics

Coronavirus-related news has moved up and to the right (toward Positive), which shows how things changed in NZ.

To finish things off: thanks for staying with me till this point. I look forward to your comments and suggestions.

Ending with the word cloud of all the news that mentioned Jacinda Ardern.

Thank you :)
