Back to Basics in ML

Essential NumPy Data Types: A Must-Know Guide

A Complete Refresher on NumPy Data Types for Data Science Enthusiasts

Priti Oli
3 min read · Dec 22, 2023

NumPy extends the range of available numerical types well beyond those built into Python (str, int, float, bool, complex, and so on). In this section, we’ll explore the data types NumPy supports and learn how to change the data type of an array to suit specific needs.

  • boolean '?' or bool_: Boolean (True or False) stored as a byte
  • (signed) byte 'b' or int8: integers in the range -128 to 127
  • unsigned byte 'B' or uint8: unsigned integers in the range 0 to 255
  • (signed) integer 'i' or int32: 32-bit signed integers ('i' maps to the C int type, which is 32-bit on common platforms)
  • unsigned integer 'u' (for example 'u4') or uint32: 32-bit unsigned integers
  • floating-point 'f' or float32: 32-bit floating-point numbers; float64 is written 'f8' or 'd'
  • complex floating-point 'c' family: complex128 is written 'c16' (two 64-bit floats); 'c' on its own is not a complex dtype string
  • timedelta 'm' or timedelta64: differences between two dates or times
  • datetime 'M' or datetime64: dates and times stored as 64-bit values
  • (Python) objects 'O' or object: references to arbitrary Python objects
  • Unicode string 'U' or str_: fixed-length Unicode strings
  • raw data 'V' or void: useful for structured arrays
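As a quick check of the codes above, the following sketch passes a few dtype strings to np.dtype and prints what they resolve to. The variable names (codes, resolved, bare_f) are only illustrative, and the sized forms such as 'i4' and 'f8' are used so the results do not depend on the platform.

```python
import numpy as np

# Map a few dtype strings to the NumPy dtypes they resolve to.
codes = ['?', 'b', 'B', 'i4', 'u4', 'f4', 'f8', 'c16', 'U5', 'O']
resolved = {code: np.dtype(code) for code in codes}

for code, dt in resolved.items():
    print(f"{code!r:>6} -> {dt.name:<10} kind={dt.kind!r} itemsize={dt.itemsize}")

# 'f' on its own resolves to single precision, not float64.
bare_f = np.dtype('f')
print(bare_f)  # float32
```

The number after the letter is the size in bytes, so 'f8' is an 8-byte (64-bit) float and 'c16' is a 16-byte (128-bit) complex number.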

A solid understanding of NumPy dtypes empowers practitioners to make informed decisions, ensuring efficiency and precision across a spectrum of scientific computing tasks.

Python’s built-in float is a 64-bit double on virtually all platforms, equivalent to np.float64. When extra precision is needed, NumPy also provides np.longdouble, whose actual precision depends on the platform and compiler.
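To make the precision point concrete, here is a small sketch that compares a Python float with np.float64 and np.float32, then prints the limits NumPy reports through np.finfo. The value 0.1 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Python's float and np.float64 share the same 64-bit representation.
x = 0.1
same_value = np.float64(x) == x

# Storing the same value as float32 loses some digits.
as_float32 = np.float32(x)
rounding_error = abs(float(as_float32) - x)
print(same_value, rounding_error)

# np.finfo reports the bit width, decimal precision, and machine epsilon.
for dtype in (np.float16, np.float32, np.float64):
    info = np.finfo(dtype)
    print(dtype.__name__, info.bits, info.precision, info.eps)
```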

NumPy Data Types by Bit Length

Integer Types:

  • int8, int16, int32, int64: Signed integers with varying bit lengths.
  • uint8, uint16, uint32, uint64: Unsigned integers.

Floating-Point Numbers:

  • float16, float32, float64: Half-, single-, and double-precision floating-point numbers.

Complex Numbers:

  • complex64, complex128: Complex numbers made of two 32-bit or two 64-bit floats (real and imaginary parts).
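The bit length determines both the range of values a type can hold and how much memory an array takes. The sketch below uses np.iinfo to print integer ranges and compares the memory footprint of two arrays; the array size n is an arbitrary choice for illustration.

```python
import numpy as np

# Value ranges for a few integer types.
for dtype in (np.int8, np.uint8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(f"{dtype.__name__:>6}: {info.min} to {info.max}")

# Same number of elements, very different memory use.
n = 1_000_000
small = np.zeros(n, dtype=np.int8)
large = np.zeros(n, dtype=np.int64)
print(small.nbytes, large.nbytes)  # 1000000 8000000
```

Choosing the smallest type that safely covers your data can reduce memory use considerably for large arrays.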

How to check the data type of a NumPy array?

You can check the data type of a NumPy array using the dtype attribute. Here's a simple example:

import numpy as np

# Create a NumPy array of integers
my_array = np.array([1, 2, 3, 4, 5])

# Check the data type of the array
print(my_array.dtype)  # int64 (the default integer type is platform-dependent)

# Create a NumPy array of strings
fruits = np.array(['apple', 'bananas', 'cherry'])

# Check the data type of the array
print(fruits.dtype)  # <U7 (Unicode; the longest string has 7 characters)
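Beyond printing the dtype, the dtype object exposes a few attributes that are handy when writing checks. The following is a minimal sketch; the array values and variable names are illustrative only.

```python
import numpy as np

values = np.array([1.5, 2.0, 3.25])
dt = values.dtype

print(dt.name)      # float64
print(dt.kind)      # f
print(dt.itemsize)  # 8 (bytes per element)

# np.issubdtype tests which family a dtype belongs to.
is_float = np.issubdtype(dt, np.floating)
is_int = np.issubdtype(dt, np.integer)
print(is_float, is_int)  # True False
```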

How to create NumPy arrays with a defined data type?

You can create NumPy arrays with defined data types using the dtype parameter during array creation. Here are examples with different data types:

import numpy as np

# Create an unsigned integer array with the specified data type (uint32)
uint_array = np.array([1, 2, 3], dtype=np.uint32)

# Create a signed integer array with the specified data type (int64)
int_array = np.array([1, 0, -1, -2, -3], dtype=np.int64)

# Create a floating-point array with the specified data type (float64)
float_array = np.array([1.0, 2.5, 3.7], dtype=np.float64)

# Create a boolean array (np.bool_ works across NumPy versions; np.bool was removed in some releases)
bool_array = np.array([True, False, True], dtype=np.bool_)

# Create a complex array with the specified data type (complex128)
complex_array = np.array([1 + 2j, 3 - 4j], dtype=np.complex128)

print(uint_array)     # [1 2 3]
print(int_array)      # [ 1  0 -1 -2 -3]
print(float_array)    # [1.  2.5 3.7]
print(bool_array)     # [ True False  True]
print(complex_array)  # [1.+2.j 3.-4.j]
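The dtype parameter is not limited to np.array. As a small sketch, other common constructors accept it as well; the shapes and variable names here are arbitrary.

```python
import numpy as np

zeros = np.zeros(3, dtype=np.int16)                     # three 16-bit zeros
ramp = np.arange(5, dtype=np.float32)                   # 0.0 to 4.0 in single precision
flags = np.ones((2, 2), dtype=bool)                     # 2x2 array of True
grid = np.linspace(0.0, 1.0, num=5, dtype=np.float32)   # 5 evenly spaced points

for arr in (zeros, ramp, flags, grid):
    print(arr.dtype, arr)
```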

Note: If the values passed to a function cannot be converted to the requested data type, NumPy raises a ValueError. In the following example, trying to create an int64 array from a mix of integers and a string ('three') raises a ValueError.

import numpy as np

data_input = [1, 2, 'three', 4]

# Attempt to create a NumPy array with the provided data
result_array = np.array(data_input, dtype=np.int64) # Raises ValueError
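In practice you may want to handle this error rather than let it stop the program. The sketch below wraps the same conversion in try/except and, for contrast, shows what happens when no dtype is given; the fallback variable is introduced here only for illustration.

```python
import numpy as np

data_input = [1, 2, 'three', 4]

try:
    result_array = np.array(data_input, dtype=np.int64)
except ValueError as exc:
    # 'three' cannot be parsed as an integer.
    result_array = None
    print(f"Conversion failed: {exc}")

# Without an explicit dtype, NumPy does not raise; it builds a string array instead.
fallback = np.array(data_input)
print(fallback.dtype.kind)  # U
```

Passing dtype explicitly is therefore a cheap way to catch unexpected values early, instead of silently ending up with an array of strings.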

How to convert from one data type to another in a NumPy array

To change the data type of an existing NumPy array, use the astype() method. It returns a copy of the array in the data type you specify and leaves the original unchanged. In the example below, the original array is int64 (NumPy’s default integer type is platform-dependent; on most modern 64-bit systems it is int64). Passing float converts it to float64, because Python’s float is 64-bit. Passing the code 'i' converts it to int32 on common platforms, since 'i' maps to the C int type.

import numpy as np

original_array = np.array([1, 2, 3])
new_array = original_array.astype(float)
old_array = new_array.astype('i')

print(original_array.dtype) # int64
print(new_array.dtype) # float64
print(old_array.dtype) # int32
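One detail worth knowing when converting floats to integers: astype() drops the fractional part rather than rounding. The sketch below shows the difference, using np.rint as one possible way to round first; the sample values are arbitrary.

```python
import numpy as np

measurements = np.array([1.7, -1.7, 2.5])

# astype truncates toward zero.
truncated = measurements.astype(np.int64)
print(truncated)  # [ 1 -1  2]

# Round first if that is what you want (np.rint rounds halves to the nearest even value).
rounded = np.rint(measurements).astype(np.int64)
print(rounded)

# The original array keeps its dtype because astype returns a copy.
print(measurements.dtype)  # float64
```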

Priti Oli

Computer Science Ph.D. student exploring the frontiers of Artificial Intelligence and Human Computer Interaction