Heatwave in Bangalore: A Statistical Odyssey

A father-daughter team navigates rising April temperatures through data analysis, strengthening bonds amid data exploration.

Rajesh R.
ILLUMINATION
7 min read · Apr 24, 2024

In the quiet of their Bangalore home, amidst the stifling embrace of the impending summer, Ryka, a soon-to-be college student with a penchant for numbers, observed the world beyond her window with a contemplative gaze. “Daddy,” she uttered, her voice tinged with concern, “Bangalore feels hotter every passing year. Look at this, April 2024, and it’s already 37 degrees.”

Arvind, a seasoned technocrat weathered by years of deciphering digital intricacies, strode purposefully to her side, his gaze as keen as the lines of code he’d forged through the annals of time. Fixing his eyes on the world beyond, a silent witness to the sweltering rise in temperature, he spoke with measured assurance, “Let’s not rush our judgment just yet,” his voice steady, betraying none of the turmoil within. “First, we must lay the groundwork with a hypothesis.”

Ryka raised an eyebrow, intrigued. “A what?”

“A hypothesis,” Arvind repeated, his voice firm. “We need to test if this heat is just another Bangalore summer or something more.”

Ryka nodded, eager to understand. “So, what are we testing?”

Arvind’s eyes gleamed with purpose. “We’ll start with the null hypothesis: 37 degrees in April is normal for Bangalore. The alternative? Well, that’s where it gets interesting.”

Ryka leaned in, hungry for answers. “And what’s the alternative?”

Arvind smiled, a glint of challenge in his eyes. “The alternative suggests this heat is different. It’s not your usual April sizzle. There’s something new in the air.”

Ryka nodded, the pieces starting to fall into place. “And how do we find out?”

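“We compare this year’s reading with the historical April data. If 37 degrees sits comfortably within that history, we’d fail to reject the null hypothesis,” Arvind explained succinctly. “But if it’s markedly different, we might have something worth investigating further.”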

Ryka nodded, the concept beginning to crystallize. “So, it’s all about interpreting the data to see if there’s a meaningful pattern.”

“Exactly,” Arvind affirmed, a hint of satisfaction in his voice. “It’s about using statistics to uncover the truth hidden in the numbers.”

Arvind’s gaze turned to the tools of his trade. “With numbers, my dear,” he said, his voice low. “We’ll let the data speak for itself. And if there’s a story to tell, we’ll uncover it together.”

In their modest home, Arvind sat hunched over his laptop, the glow of the screen casting long shadows in the dimly lit room. Ryka, his daughter, stood nearby, her presence a silent but eager witness to her father’s digital pursuit.

“Daddy, what’s all this about?” Ryka’s voice cut through the quiet, her curiosity piqued by the intensity of Arvind’s focus.

“Just downloading the dataset,” Arvind replied tersely, his fingers dancing across the keyboard with practiced efficiency. “April temperatures, ’90 to ’22.”

Ryka nodded, her eyes fixed on the screen as Arvind navigated the web interface with swift precision. The dataset materialized before them, a grid of numbers and dates filling the digital canvas.

With a decisive click, Arvind initiated the download, the silent dance of data flowing into his hard drive. Ryka leaned in closer, her gaze fixed on the unfolding tableau of information, each pixel a revelation waiting to be deciphered.

“Got it,” Arvind muttered, a brief flicker of satisfaction crossing his otherwise stoic expression. “Time to crunch some numbers.”

With a deft stroke of Python, Arvind filtered the DataFrame, isolating April temperatures from the sea of data. Missing values vanished into the digital ether, leaving behind a pristine collection of numerical artifacts. Ryka watched, her gaze following Arvind’s every move, as he meticulously prepared the dataset for analysis.

import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import shapiro, mannwhitneyu

# Load data from the CSV file into a DataFrame
df = pd.read_csv('Temprature_Data_1990_2022_BangaloreCity.csv')

# Convert the 'time' column to datetime format
df['time'] = pd.to_datetime(df['time'], format='%d-%m-%Y')

# Filter the DataFrame to include only April temperatures and remove NaN values
april_temperatures = df[df['time'].dt.month == 4]['tmax'].dropna()
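To see what that filter does without the full CSV, here is a minimal sketch on a handful of made-up rows. The column names 'time' and 'tmax' follow the snippet above; the values themselves are hypothetical and are not taken from the Bangalore dataset.

```python
import pandas as pd

# Hypothetical rows mimicking the assumed layout: 'time' as dd-mm-YYYY, 'tmax' in °C
sample = pd.DataFrame({
    "time": ["31-03-1990", "01-04-1990", "02-04-1990", "15-04-1991", "01-05-1991"],
    "tmax": [33.1, 34.5, None, 35.2, 33.8],
})
sample["time"] = pd.to_datetime(sample["time"], format="%d-%m-%Y")

# Keep April rows only, then drop the missing reading
april_sample = sample.loc[sample["time"].dt.month == 4, "tmax"].dropna()
print(april_sample.tolist())  # [34.5, 35.2]
```

The March and May rows fall outside the month filter, and the April row with no reading is dropped by dropna().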

Arvind’s gaze lingered on the screen, his fingers poised above the keyboard. The rhythmic tap of keys echoed in the dimly lit room as he delved into the heart of the data. Ryka, perched nearby, watched with keen interest, her youthful curiosity piqued by the solemnity of the moment.

# Perform Shapiro-Wilk test for normality
stat_shapiro, p_shapiro = shapiro(april_temperatures)

As the results unfolded before them, Arvind’s brow furrowed in contemplation. The Shapiro-Wilk test, a silent arbiter of truth, cast its judgment with unwavering certainty. With a statistic of 0.9632 and a p-value that rounded to 0.0000 (smaller than 0.00005), the verdict was clear: the data departed significantly from a normal distribution. “It seems,” Arvind remarked, “we’re dealing with a non-normal distribution here.”
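How such a verdict is usually read can be sketched on stand-in data. The example below is an illustration only: it uses a synthetic, deliberately skewed sample (not the Bangalore temperatures), and the 0.05 threshold is an assumed convention rather than something the test itself dictates.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)
# Synthetic stand-in: right-skewed values, NOT the real April data
synthetic = 30 + rng.exponential(scale=2.0, size=500)

stat, p = shapiro(synthetic)
alpha = 0.05  # assumed significance level

# A small p-value means the sample is unlikely to come from a normal distribution
is_normal_plausible = p >= alpha
print(f"W = {stat:.4f}, p = {p:.3g}, normality plausible: {is_normal_plausible}")
```

Printing the p-value with a format such as `.3g` avoids the misleading 0.0000 that fixed four-decimal rounding produces for very small values.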

Figure 1: Shapiro-Wilk Test Results

“Daddy,” Ryka inquired, “what does normality in data imply and what does its absence imply?”

Arvind pondered for a moment, his brow furrowing slightly. “Well, sweetheart,” he began, “if the data had shown a normal distribution, our approach would have been different. We could have employed parametric tests, such as the t-test or ANOVA, to compare means or variances between groups.”

Ryka nodded, absorbing the information. “So, the type of test we choose depends on the characteristics of our data?”

“Exactly,” Arvind confirmed with a nod. “It’s crucial to select the appropriate statistical method based on the nature of the data and the hypothesis being tested. That’s the essence of sound statistical analysis. Parametric tests, like the ones I mentioned, assume that the data follows a specific distribution, such as the normal distribution. They provide powerful tools for inference when these assumptions are met.”
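For contrast, here is a rough sketch of what a parametric route could look like if the April data had passed the normality check. It is not the two-sample t-test or ANOVA Arvind names; because the story compares a single reading against history, the sketch swaps in a t statistic for one new observation against a normal sample. The seven temperatures are hypothetical.

```python
import numpy as np
from scipy.stats import t

# Hypothetical April highs in °C, assumed here to be roughly normal
historical = np.array([33.0, 34.0, 35.0, 34.0, 33.0, 35.0, 34.0])
latest = 37.0

n = historical.size
mean = historical.mean()
sd = historical.std(ddof=1)  # sample standard deviation

# t statistic for one new observation against a normal sample:
# the extra 1 under the square root accounts for the new reading's own variability
t_stat = (latest - mean) / (sd * np.sqrt(1 + 1 / n))
p_value = 2 * t.sf(abs(t_stat), df=n - 1)  # two-sided p-value
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

This calculation leans entirely on the normality assumption, which is exactly what the Shapiro-Wilk result called into question for the real data.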

Ryka, ever perceptive, sensed the weight of her father’s discovery. The room hung heavy with the implications of their findings, the air thick with anticipation.

Arvind’s voice cut through the silence, his words laden with purpose. “Not to worry, Ryka,” he reassured, his tone unwavering. “We have an alternative.”

With a decisive keystroke, Arvind propelled their analysis forward. The latest April temperature observation, a solitary figure in a sea of data, loomed large at 37 degrees Celsius. Ryka’s eyes remained fixed on the screen, her anticipation palpable as Arvind initiated the Mann-Whitney U test with practiced precision.

For a moment, the room hung in suspended animation, the soft glow of the computer screen casting long shadows across the walls. Ryka held her breath as the test unfolded before them.

As the screen flickered with results, Arvind’s demeanor remained steadfast, his gaze penetrating the data with unyielding resolve. “At a 5% significance level,” he declared with measured certainty, “we fail to reject the null hypothesis.” His voice, like granite, betrayed no hint of hesitation. “With a p-value of 0.0988, above our 0.05 threshold,” he continued, “we have no statistically significant evidence that the latest April 2024 temperature differs from Bangalore’s historical April temperatures.”

Ryka let out a sigh of relief, the tension dissolving like morning mist. She nodded at her father, gratitude filling her. Together, they had weathered uncertainty’s storm, emerging unscathed.

# Latest temperature observation in April 2024
latest_observation = 37

# Perform Mann-Whitney U test to compare the latest observation with historical April temperatures
stat_mannwhitneyu, p_mannwhitneyu = mannwhitneyu(april_temperatures, [latest_observation])

The Mann-Whitney U test had spoken, its p-value echoing through the room like a whisper of validation, affirming their journey through the labyrinth of statistical analysis.

Figure 2: Mann-Whitney U Test Results

“Daddy,” Ryka queried, “why did we opt for the Mann-Whitney test instead?”

Arvind leaned back, considering Ryka’s question thoughtfully. “Well, my dear,” he began, “the Mann-Whitney U test is a non-parametric test. Unlike parametric tests, it doesn’t assume a specific distribution of the data. It’s more robust and applicable in situations where the data deviates from normality, like ours did.”

Ryka nodded, absorbing the explanation. “So, it’s like a versatile tool that can handle different types of data?”

“Exactly,” Arvind affirmed with a smile. “The Mann-Whitney test allows us to compare the distributions of two groups without making assumptions about the underlying distribution. It’s a valuable tool in our statistical toolkit, especially when dealing with real-world data that may not always conform to idealized distributions.”
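When the second group is a single reading, as it is here, the test reduces to a question of rank: where does that one value fall among the historical ones? The sketch below illustrates this on hypothetical temperatures (not the real dataset), chosen so that no value ties with the latest reading.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical historical April highs in °C (distinct values, none equal to 37)
historical = np.array([31.2, 32.5, 33.1, 33.8, 34.4, 35.0, 35.6, 36.1, 36.9, 37.4, 38.2])
latest = 37.0

stat, p = mannwhitneyu(historical, [latest], alternative="two-sided")

# With one value in the second sample, U counts the historical readings above it
above = int(np.sum(historical > latest))
print(f"U = {stat}, readings above {latest}: {above}, p = {p:.3f}")
```

Because everything hinges on one rank, a single reading gives the test little power to detect a shift, which is worth keeping in mind when reading a non-significant result.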

“Daddy, the test says this April’s temperature isn’t significantly different. What does that mean?” Ryka asked. Arvind chuckled, “It means, kiddo, that by this test, this year’s temperature isn’t special.”

“Ah, so it’s no big deal?” Ryka queried. Arvind explained, “Well, it’s like fishing in a pond. Got a fish, but not the big one you wanted.”

“Got it, Dad. So, we need more than just the test?” Ryka wondered. “Exactly, dear. We need the whole story,” Arvind confirmed. “Not just bits and pieces.”

In the quietude of their Bangalore abode, Ryka, stirred by their statistical voyage, felt an insatiable urge to plunge deeper into these enigmatic tests, yearning to unearth their veiled mysteries.

As nightfall descended, the cool zephyr whispered relief from the day’s fervor. Father Arvind and Ryka, seated side by side, witnessed a kaleidoscope of reflections dancing across their minds, mirroring the intricate tapestry woven by statistical theory. The scorching blaze of the day had kindled a pilgrimage into the essence of statistics, unfurling the cadences pulsating beneath the veneer of raw data.

Amidst the tranquil dusk, they discovered not only sanctuary from the oppressive heat but also a shared enchantment with the harmony bestowed by statistical tenets upon their comprehension of the cosmos. What once loomed as an arduous summer’s day now emerged as a fertile canvas for their expedition, binding them together in awe and inquisitiveness.

Yet, as they nestled into the serenity, an abrupt intrusion shattered the tranquility. A news bulletin flashed upon Arvind’s mobile phone screen, injecting a spark of anticipation into the dimly illuminated chamber. “Rain forecasted in Bengaluru,” it heralded, a pledge of succor for the parched terrain and a testament to nature’s capricious allure.

  1. #Python
  2. #Statistics
  3. #MachineLearning
  4. #DataVisualization
  5. #DataScience
  6. #DataAnalysis
  7. #StatisticalAnalysis
  8. #DataInsights
  9. #DataStorytelling
  10. #DataAnalytics
  11. #DataResearch
  12. #StatisticalMethods
  13. #DataExploration
  14. #ResearchInsights
  15. #StatisticalModeling
  16. #DataDrivenDiscovery
  17. #ShortStory

Rajesh R.
ILLUMINATION

Engineer, Ph.D. Scholar, and writer. I blend technical expertise and storytelling to explore science and creativity. Happy reading!