**100 Days of Data Science: Day 3 — Unmasking Data’s Intricacies**

Murad Pitafi
2 min read · Aug 17, 2023


📊 Welcome back to Day 3 of my exhilarating 100 Days of Data Science journey! Today, we’re embarking on a thrilling quest to unmask the hidden intricacies of our dataset. Join me as we explore the power of `df.isnull()`, `columns`, and duplicates, shedding light on data’s enigmatic facets.

**Unveiling the Unknown: Harnessing `df.isnull()`, `columns`, and Duplicates**

In the captivating realm of data science, understanding the nuances of your dataset is like deciphering an intricate code. Today, we delve into the heart of those intricacies with three potent tools: `df.isnull()` for missing values, `columns` for feature names, and `df.duplicated()` for repeated rows.

🔍 **`df.isnull()`: Spotlight on Missing Values**
The `df.isnull()` method serves as a guiding beacon. It returns a DataFrame of the same shape as `df`, with `True` wherever a value is missing (`NaN`, `None`, or `NaT`) and `False` everywhere else. Every missing value signals a gap in the story. Chaining `.sum()` counts those gaps per column, so we can pinpoint where our data’s narrative is incomplete.
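
To make that concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are purely illustrative assumptions, not the dataset from this series:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not the dataset used in this series
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [28, np.nan, 35, np.nan],
    "city": ["Lahore", "Karachi", "Quetta", "Multan"],
})

mask = df.isnull()                     # boolean DataFrame, True where a value is missing
missing_per_column = mask.sum()        # number of missing values in each column
missing_share = mask.mean()            # fraction of missing values in each column
rows_with_gaps = df[mask.any(axis=1)]  # rows that have at least one missing value

print(missing_per_column)
print(missing_share)
print(rows_with_gaps)
```

The per-column share is often more telling than the raw count, since a handful of gaps matters far more in a short column than in a long one.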

📋 **`columns`: A Glimpse of Feature Universe**
The `columns` attribute, accessed as `df.columns`, is a window into our dataset’s feature space. With a single command, it returns an `Index` holding the label of every column, in order. It’s the first step in decoding data’s language: understanding what each column brings to the table.
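
A short sketch of a few things you can do with `df.columns`, again on a hypothetical toy DataFrame (the column names are assumptions chosen for illustration):

```python
import pandas as pd

# Hypothetical toy data; not the dataset used in this series
df = pd.DataFrame({
    "name": ["Ana", "Ben"],
    "age": [28, 31],
    "city": ["Lahore", "Karachi"],
})

print(df.columns)                    # an Index object holding the column labels

column_names = df.columns.tolist()   # plain Python list of the labels
has_age = "age" in df.columns        # membership check for a single column
numeric_columns = df.select_dtypes(include="number").columns.tolist()  # numeric columns only

print(column_names)
print(has_age)
print(numeric_columns)
```

Checking for a column before using it is a cheap way to avoid a `KeyError` later in a notebook.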

🔑 **Duplicates: Unraveling Repetition’s Tale**
Duplication can distort insights in plain sight: repeated rows inflate counts and skew averages. `df.duplicated()` flags every row that exactly repeats an earlier one, and `df.drop_duplicates()` removes those repeats. By identifying and addressing duplicates, we can distinguish true patterns from mere repetition.
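
Here is a minimal sketch of finding and removing duplicates, assuming a made-up DataFrame with one fully repeated row. The data and the choice of `name` as a key column are illustrative assumptions:

```python
import pandas as pd

# Hypothetical toy data; not the dataset used in this series
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Ana", "Dee", "Ben"],
    "age": [28, 31, 28, 40, 33],
})

dup_mask = df.duplicated()                  # True for rows that repeat an earlier row
n_duplicates = int(dup_mask.sum())          # number of fully duplicated rows
all_copies = df[df.duplicated(keep=False)]  # every copy, including the first occurrence
n_name_dups = int(df.duplicated(subset=["name"]).sum())  # repeats judged on "name" only
deduped = df.drop_duplicates().reset_index(drop=True)    # keep the first copy of each row

print(n_duplicates)
print(all_copies)
print(n_name_dups)
print(deduped)
```

Whether a repeat is an error depends on the data. Two people can share a name, so it is worth inspecting `all_copies` before dropping anything.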

Let’s watch these tools in action:

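```python
import pandas as pd

# Assumes the dataset lives in a CSV file; "data.csv" is a placeholder path
df = pd.read_csv("data.csv")

print(df.isnull().sum())      # missing values per column
print(df.columns)             # column labels
print(df.duplicated().sum())  # number of fully duplicated rows
```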

Through these tools, I’ve begun to unravel the layers of my dataset, exposing gaps, understanding features, and disentangling repetitions. Every NaN, every column, every duplicate — they’re all threads that weave the tapestry of data insights.

🧩 **The Journey Continues…**
Our exploration is an ongoing saga, each day revealing new layers of data’s mystery. Tomorrow, we embark on yet another chapter, delving deeper into the labyrinth of data science. As always, your thoughts, queries, and discoveries are welcome on this voyage of knowledge.

Join me as we unveil the captivating stories hidden within our data and journey toward data mastery!

#100DaysOfDataScience #DataIntricacies #DataAnalysis #PythonDataScience #Pandas #DataInsights #LearningInProgress #DataJourney

Connect with me on Twitter: @muradpitafi1
