numpy.isnan in Python

Amit Yadav


1. Understanding numpy.isnan: What It Does

Have you ever been told, “You can’t find what you don’t know exists?”

When working with data, this couldn’t be more accurate — especially when dealing with missing values. numpy.isnan is here to save the day.

It identifies NaN (Not a Number) values in your NumPy arrays, returning a Boolean array that pinpoints their exact location.

Think of it as a detective that highlights all the “missing persons” in your dataset. Let’s break this down step by step:

  1. What is NaN?
    NaN stands for "Not a Number." It is a special floating-point value that often represents missing or undefined entries in your data, such as the result of 0/0 or a gap in a measurement.
  2. How does numpy.isnan work?
    It scans through your NumPy array and checks each element. If it encounters a NaN, it returns True for that position; otherwise, it returns False.
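Why not just compare against np.nan directly? Because NaN never compares equal to anything, including itself. The short sketch below illustrates this (the variable names eq_mask and nan_mask are only for illustration):

```python
import numpy as np

# NaN is a float value that never compares equal to anything, itself included
print(np.nan == np.nan)        # False

arr = np.array([1.0, np.nan, 3.0])

# An equality test therefore misses the NaN entirely
eq_mask = arr == np.nan
print(eq_mask)                 # [False False False]

# np.isnan finds it
nan_mask = np.isnan(arr)
print(nan_mask)                # [False  True False]

# It also accepts a plain scalar
print(np.isnan(np.nan))        # True
```

This is the main reason to reach for np.isnan instead of the == operator when hunting for missing values.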

Here’s a simple example to help you see it in action:

import numpy as np

# Example array with NaN values
arr = np.array([1, 2, np.nan, 4, np.nan, 6])

# Using numpy.isnan to detect NaN values
result = np.isnan(arr)
print("Boolean Array:", result)

When you run this code, here’s what you’ll see:

Boolean Array: [False False  True False  True False]

What’s happening here?

  • Each True represents a NaN in the original array.
  • For example, the True at index 2 corresponds to the NaN at the same position in arr.

This might seem like a small trick, but trust me, once you start dealing with messy data, numpy.isnan becomes one of your best friends.

2. Practical Applications of numpy.isnan

“Knowing is not enough; we must apply.” This couldn’t be truer when working with numpy.isnan.

You’ve learned what it does, but now, let’s put it to work in real-world scenarios.

Whether you’re cleaning messy data or counting missing values, this function is a go-to tool in any programmer’s toolkit.

Replacing NaN Values

Imagine you’re working on a dataset, and missing values are messing with your calculations. What can you do? Replace them! Here’s how:

import numpy as np

# Example array with NaN values
arr = np.array([1, 2, np.nan, 4, np.nan, 6])

# Replace NaN values with a specific number, e.g., 0
arr_cleaned = np.where(np.isnan(arr), 0, arr)
print("Array after replacing NaN values:", arr_cleaned)

Output:

Array after replacing NaN values: [1. 2. 0. 4. 0. 6.]

What’s happening here?

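  • np.isnan(arr) builds a Boolean mask that is True wherever arr holds a NaN.
  • np.where then builds a new array, taking 0 (or any number you specify) where the mask is True and the original value everywhere else.
    The original arr is left unchanged, so you can compare the data before and after the replacement.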
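Zero is not always the right stand-in. Here is a minimal sketch of a few other ways to fill the gaps, assuming the same example array. The names filled_zero, filled_mean, and in_place are illustrative only, and which fill value makes sense depends on your data. Note that np.nan_to_num also replaces infinities by default, which does not matter here because the array has none.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])

# Option 1: nan_to_num returns a copy with NaN swapped for the given value
filled_zero = np.nan_to_num(arr, nan=0.0)

# Option 2: fill with the mean of the non-NaN entries (np.nanmean skips NaN)
mean_value = np.nanmean(arr)          # (1 + 2 + 4 + 6) / 4 = 3.25
filled_mean = np.where(np.isnan(arr), mean_value, arr)

# Option 3: overwrite on a copy, using the Boolean mask as an index
in_place = arr.copy()
in_place[np.isnan(in_place)] = 0

print(filled_zero)
print(filled_mean)
print(in_place)
```

Options 1 and 2 leave arr untouched. Option 3 modifies whatever array you index into, which is why the sketch works on a copy.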

Counting NaN Values

You might be wondering: “How many NaN values are lurking in my dataset?" Counting them is quick and easy:

# Count total NaN values in the array
total_nan = np.sum(np.isnan(arr))
print("Total NaN values:", total_nan)

Output:

Total NaN values: 2

Knowing how many values are missing matters when you decide how to handle them. A handful of NaNs in a large dataset can usually be filled in, while a column or row that is mostly NaN may be better dropped entirely.
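For a 2-D array, the same mask can be summed along an axis to see where the gaps are concentrated. A small sketch, using a made-up 3x3 table (the data values are hypothetical):

```python
import numpy as np

# Hypothetical 3x3 table: rows are records, columns are features
data = np.array([
    [1.0, np.nan, 3.0],
    [4.0, 5.0, np.nan],
    [7.0, np.nan, 9.0],
])

mask = np.isnan(data)

total = np.count_nonzero(mask)   # same result as np.sum(mask)
per_column = mask.sum(axis=0)    # NaN count in each column
per_row = mask.sum(axis=1)       # NaN count in each row
share_missing = mask.mean()      # fraction of all entries that are NaN

print("Total:", total)             # Total: 3
print("Per column:", per_column)   # Per column: [0 2 1]
print("Per row:", per_row)         # Per row: [1 1 1]
```

Here the second column holds two of the three NaNs, which is the kind of detail a single total would hide.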

Removing NaN Values

Sometimes, you may want to ditch NaN values entirely. Here’s how you can filter them out:

# Remove NaN values using Boolean indexing
arr_no_nan = arr[~np.isnan(arr)]
print("Array without NaN values:", arr_no_nan)

Output:

Array without NaN values: [1. 2. 4. 6.]

Why is this useful?
This technique is perfect for when you need clean, ready-to-use data without any placeholder values.
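One caveat worth knowing: on a 2-D array, Boolean indexing with a full mask flattens the result. If you want to keep the table shape, a common approach is to drop whole rows instead. A minimal sketch with a hypothetical 3x3 array:

```python
import numpy as np

data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
])

# Indexing a 2-D array with a 2-D mask flattens the result to 1-D
flat = data[~np.isnan(data)]
print(flat.shape)            # (8,)

# To keep the table shape, drop every row that contains at least one NaN
rows_with_nan = np.isnan(data).any(axis=1)
complete_rows = data[~rows_with_nan]
print(complete_rows.shape)   # (2, 3)
```

Switching to any(axis=0) and indexing with data[:, ~mask] would drop columns instead of rows.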

These practical examples are just the tip of the iceberg. You’ll find yourself reaching for numpy.isnan whenever missing data throws a wrench in your workflows.

3. FAQs

You’ve mastered the basics of numpy.isnan, but I know there are still a few lingering questions.

Let’s tackle them one by one and ensure you leave with no doubts!

Q: Can numpy.isnan handle non-numeric data?
This might disappoint you: No, numpy.isnan cannot handle non-numeric data. It’s specifically designed for numeric types like integers and floats.

If you try to use it on strings or mixed-type arrays, it will raise a TypeError faster than you can say "debugging nightmare."

If you’re ever unsure, stick with purely numeric arrays. Here’s a quick example to illustrate:

import numpy as np

# Mixing numbers and strings makes NumPy store every element as a string
arr = np.array([1, 2, 'a', np.nan])

# np.isnan has no implementation for string arrays, so it raises TypeError
try:
    result = np.isnan(arr)
except TypeError as e:
    print("Error:", e)

Output:

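Error: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''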
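If you are stuck with mixed data and want to stay in plain NumPy, one possible workaround is to check each element yourself. This is only a sketch: is_nan_value is a hypothetical helper, not a NumPy function, and it assumes the array was created with dtype=object so that each element keeps its original Python type.

```python
import numpy as np

# dtype=object keeps each element as its original Python object
mixed = np.array([1, 2, "a", np.nan], dtype=object)

def is_nan_value(x):
    """True only for float NaN; strings, ints, and None give False."""
    # NaN is the only float that is not equal to itself
    return isinstance(x, float) and x != x

mask = np.array([is_nan_value(x) for x in mixed], dtype=bool)
print(mask)    # [False False False  True]
```

This loops in Python, so it is slower than np.isnan on a numeric array. For large datasets, converting the column to floats first is usually the better route.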

Q: How do I remove NaN values from an array?
If you’ve been waiting for the simplest way to clean up your data, here it is. Use Boolean indexing: np.isnan(arr) marks the NaN positions, the ~ operator flips that mask, and indexing with the result keeps only the real numbers.

Here’s an example:

# Example array with NaN values
arr = np.array([1, 2, np.nan, 4, np.nan, 6])

# Remove NaN values using Boolean indexing
arr_no_nan = arr[~np.isnan(arr)]
print("Array without NaN values:", arr_no_nan)

Output:

Array without NaN values: [1. 2. 4. 6.]

You might be thinking, “That’s it?” Yes! Cleaning data can be this simple.

Q: What’s the difference between numpy.isnan and pandas.isna?
This might surprise you: While they seem similar, numpy.isnan and pandas.isna serve different ecosystems.

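  • numpy.isnan: Works on NumPy arrays (and on scalars or lists that convert to numeric arrays) and only understands numeric data.
  • pandas.isna: Designed for Pandas objects like Series and DataFrames, though it also accepts scalars and arrays. It handles mixed data types and treats None and NaT as missing too, which makes it more versatile for tabular data.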

Here’s a quick comparison:

import numpy as np
import pandas as pd

# Example DataFrame: None in column A is stored as NaN
df = pd.DataFrame({
    'A': [1, 2, None],
    'B': [np.nan, 4, 5]
})

# Using pandas.isna
print("Using pandas.isna:\n", pd.isna(df))

Output:

Using pandas.isna:
         A      B
0  False   True
1  False  False
2   True  False

If you’re working with Pandas DataFrames, stick with pandas.isna—it’s built for the job.
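To see the difference side by side, here is a small sketch that feeds the same object array to both functions. The array contents and the names numpy_ok and pandas_mask are made up for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([1.5, None, "text", np.nan], dtype=object)

# np.isnan rejects object arrays with a TypeError
try:
    np.isnan(values)
    numpy_ok = True
except TypeError:
    numpy_ok = False

# pd.isna accepts the same array and flags both None and NaN
pandas_mask = pd.isna(values)

print(numpy_ok)       # False
print(pandas_mask)    # [False  True False  True]
```

So pd.isna is also a reasonable fallback for a plain NumPy array whenever the data is not purely numeric.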
