Global Warming Analytics using Matplotlib, Pandas and Scikit.

Sudhakar Chelliah
Published in Python’s Gurus
4 min read · Jun 22, 2024

Analytics of real Global Surface Temperature data from NASA.

Over the last few years, summers have felt noticeably hotter. In this article we examine the global surface temperature index recorded by NASA over the last several decades and project the future trend with a linear regression model in Python.

Photo by Danting Zhu on Unsplash

Global Warming Plot for the last 140 years:

We load the complete global warming data into the DataFrame gw and plot the full series for the last 140 years (green for negative index values in degrees Celsius, red for positive). From around 1970 the bars turn steadily red and do not look back. Ice at the South Pole and the North Pole has been slowly melting since the surface temperature index crossed 0 degrees Celsius.

import numpy as np # Needed for np.where and np.array below.
import pandas as pd
import matplotlib.pyplot as plt

gw = pd.read_csv("Global_Warming.csv") # Loading the Global Warming data to gw.

x = gw['Year'] # Assigning the year values to x.
y = gw['Index'] # Assigning the Index values to y.

gw2000 = gw[gw['Year'] > 1999] # Rows from 2000 onwards.
gw2010 = gw[gw['Year'] > 2009] # Rows from 2010 onwards.

x1 = gw2000['Year']
y1 = gw2000['Index']

x2 = gw2010['Year']
y2 = gw2010['Index']

# Red for a positive (or zero) Index and green for a negative Index.
colors = np.where(gw['Index'] >= 0, 'r', 'g')

plt.figure(figsize=(20, 8))
plt.subplot(2, 1, 1) # First row: the full series.
plt.xlabel('Year')
plt.ylabel('Index')
plt.bar(x, y, color=colors)

plt.subplot(2, 2, 3) # Second row, first column: data from 2000.
plt.xlabel('Year from 2000')
plt.ylabel('Index')
plt.scatter(x1, y1, s=100, c='y', marker="o", alpha=1, edgecolor='r')

plt.subplot(2, 2, 4) # Second row, second column: data from 2010.
plt.xlabel('Year from 2010')
plt.ylabel('Index')
plt.scatter(x2, y2, s=400, c='r', marker="^", alpha=1, edgecolor='b')

plt.show()
First graph: full plot. Second graph: plot from 2000. Third graph: plot from 2010.
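
The colour rule and the year filters used above are ordinary NumPy and pandas operations. A minimal sketch on a made-up five-row frame (the values and the names sample, sample_colors, from_2000 and from_2010 are illustrative assumptions, not NASA data) shows what each one returns:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows, for illustration only (not taken from the NASA dataset).
sample = pd.DataFrame({
    "Year": [1960, 1990, 2000, 2010, 2020],
    "Index": [-0.05, 0.30, 0.50, 0.65, 0.95],
})

# np.where picks 'r' where the condition holds and 'g' everywhere else.
sample_colors = np.where(sample["Index"] >= 0, "r", "g")

# A boolean mask keeps only the rows for which the condition is True.
from_2000 = sample[sample["Year"] > 1999]
from_2010 = sample[sample["Year"] > 2009]

print(list(sample_colors))
print(from_2000["Year"].tolist())
print(from_2010["Year"].tolist())
```

Only the single negative row gets the green colour code, and each filter keeps the rows from its cut-off year onwards.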

Identifying the values for the straight-line equation:

Equation of a straight line: y = mx + c.
m is the slope of the trend and c is the y-intercept, the value of y where the line crosses x = 0.
We are going to compute the slope (m) and y-intercept (c) using the scikit-learn library.

import numpy as np
from sklearn.linear_model import LinearRegression

# Create a DataFrame with the rows after 1974 from the complete dataset.
gw1974 = gw[gw['Year'] > 1974]

X = np.array(gw1974['Year'])
y = np.array(gw1974['Index'])
regression_model = LinearRegression() # scikit-learn linear regression model
regression_model.fit(X.reshape(-1, 1), y) # fit expects a 2-D feature array
slope = regression_model.coef_ # Array holding the slope
intercept = regression_model.intercept_ # y-intercept
print("Slope(m) value is ", slope, "|| Intercept(c) value is ", intercept)
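
For intuition, an ordinary least-squares fit of a single feature has a simple closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows because the line passes through the point of means. The sketch below applies that formula to four toy points (toy_x, toy_y, m and c are illustrative names, and the points are not NASA data). It is a hand calculation for understanding, not a description of scikit-learn's internals.

```python
import numpy as np

# Toy points that lie exactly on y = 2x + 1 (illustrative, not NASA data).
toy_x = np.array([0.0, 1.0, 2.0, 3.0])
toy_y = np.array([1.0, 3.0, 5.0, 7.0])

x_mean = toy_x.mean()
y_mean = toy_y.mean()

# Slope m: sum of products of deviations divided by sum of squared x deviations.
m = np.sum((toy_x - x_mean) * (toy_y - y_mean)) / np.sum((toy_x - x_mean) ** 2)

# Intercept c: the fitted line passes through (x_mean, y_mean).
c = y_mean - m * x_mean

print("m =", m, "c =", c)
```

Because the toy points sit exactly on a line, the formula recovers that line's slope and intercept.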
NASA Surface Temperature Index (Celsius) dataset from 1975.
Year Index
1975 0.02
1976 0.04
1977 0.07
1978 0.12
1979 0.16
1980 0.20
1981 0.21
1982 0.22
1983 0.21
1984 0.21
1985 0.22
1986 0.24
1987 0.27
1988 0.31
1989 0.33
1990 0.33
1991 0.33
1992 0.33
1993 0.33
1994 0.34
1995 0.36
1996 0.40
1997 0.42
1998 0.44
1999 0.47
2000 0.50
2001 0.52
2002 0.55
2003 0.58
2004 0.61
2005 0.62
2006 0.62
2007 0.63
2008 0.64
2009 0.64
2010 0.65
2011 0.67
2012 0.70
2013 0.74
2014 0.79
2015 0.83
2016 0.88
2017 0.91
2018 0.93
2019 0.94
2020 0.95
2021 0.98
2022 1.00
2023 1.02
Identifying the Slope and Y-intercept for straight line equation.
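
As a cross-check that does not need the CSV file, the values listed in the table above can be typed in directly and fitted with NumPy's polyfit. This is a sketch that assumes the table is exactly the data passed to the regression; the names years, index_values, fit_slope and fit_intercept are introduced here for illustration.

```python
import numpy as np

# Years 1975 through 2023 and the Index values copied from the table above.
years = np.arange(1975, 2024)
index_values = np.array([
    0.02, 0.04, 0.07, 0.12, 0.16, 0.20, 0.21, 0.22, 0.21, 0.21,
    0.22, 0.24, 0.27, 0.31, 0.33, 0.33, 0.33, 0.33, 0.33, 0.34,
    0.36, 0.40, 0.42, 0.44, 0.47, 0.50, 0.52, 0.55, 0.58, 0.61,
    0.62, 0.62, 0.63, 0.64, 0.64, 0.65, 0.67, 0.70, 0.74, 0.79,
    0.83, 0.88, 0.91, 0.93, 0.94, 0.95, 0.98, 1.00, 1.02,
])

# A degree-1 polynomial fit returns the coefficients as [slope, intercept].
fit_slope, fit_intercept = np.polyfit(years, index_values, 1)

print("slope:", fit_slope)
print("intercept:", fit_intercept)
```

The result should agree closely with the slope and intercept hard-coded in the line() function of the next section.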

Generating future trend based on the Slope and Y-Intercept:

def line(x):
    # Apply y = mx + c with the slope (m) and intercept (c) printed by the regression above.
    return 0.0196163 * x + (-38.71344489795919)

x_pred = range(1974, 2051) # Years 1974 through 2050 (the end of range is exclusive).

# Each additional year adds the slope to the predicted index.
y_pred = [line(i) for i in x_pred]

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(gw1974['Year'], gw1974['Index'], color='g') # Observed index values
# Shaded area under the fitted trend line; stackplot takes its fill colours via colors=.
ax.stackplot(x_pred, y_pred, colors=['y'], linewidth=2.5, linestyle='--', alpha=0.40)
ax.set_xlabel('Year')
ax.set_ylabel('Surface Temperature Index')
ax.set_title('Global Warming Trend')
plt.show()

Based on the straight-line equation, we have projected the future trend and found that the global surface temperature index is on course to cross 1.4 degrees Celsius within the next few decades.
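
The crossing year can also be read off the equation directly by solving y = mx + c for x, which gives x = (y - c) / m. The sketch below reuses the slope and intercept hard-coded in line(); the helper crossing_year and the constant names are introduced here for illustration, and the result holds only under the assumption that the linear trend continues unchanged.

```python
# Slope and intercept as hard-coded in the article's line() function.
SLOPE = 0.0196163
INTERCEPT = -38.71344489795919

def crossing_year(target_index):
    """Return the year at which the fitted line reaches target_index."""
    # Rearranging y = m*x + c gives x = (y - c) / m.
    return (target_index - INTERCEPT) / SLOPE

year_140 = crossing_year(1.40)
year_150 = crossing_year(1.50)

print(round(year_140, 1), round(year_150, 1))
```

Under this straight-line assumption the index reaches 1.40 around 2045 and 1.50 around 2050.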

Inferences from the analysis of surface temperature over the years:

  1. If carbon emissions are not controlled or reduced significantly, the surface temperature index will rise beyond 1.40 degrees Celsius within the next 24 years.
  2. If carbon emissions increase, the surface temperature index will probably cross 1.50 degrees Celsius before the year 2050.
  3. Sustainable energy (solar power, wind power, hydroelectric power, bioenergy) will be the only solution going forward to control global warming.

After this analysis, global warming looks far more dangerous than it is commonly perceived to be. If carbon emissions are not reduced, there will be catastrophic climate change across the globe.
