A comprehensive guide to get you started with NumPy

Learn the basics of NumPy through this guide.

Devansh Sheth
Nerd For Tech
6 min read · Mar 10, 2023

Photo by Ali Yılmaz on Unsplash

What is NumPy and why use it?

NumPy stands for Numerical Python. It is a Python library for working with numerical data efficiently. It supports large, multi-dimensional arrays and matrices, along with a large collection of mathematical functions that operate on these arrays efficiently. Pandas, scikit-learn, seaborn and many other data science libraries use NumPy.

NumPy arrays can have one or more dimensions. Unlike a Python list, all the elements of an array need to be of the same data type, referred to as the dtype of the array. Because every element has the same type and size, NumPy can run optimized code over the array, and the array uses much less memory than an equivalent list.
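To make the single-dtype idea concrete, here is a minimal sketch. The dtype is set explicitly (an assumption made for illustration) so that the byte counts do not depend on the platform default.

```python
import numpy as np

# Explicit dtype so the numbers below do not depend on the platform default.
arr = np.array([1, 2, 3], dtype=np.int64)
print(arr.dtype)     # int64
print(arr.itemsize)  # 8 bytes per element
print(arr.nbytes)    # 24 bytes for the three elements

# Mixing ints and floats upcasts everything to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```

Every element takes the same number of bytes, which is what lets NumPy store the values compactly.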

How to install NumPy

If you don’t have Python installed, I suggest using Anaconda, which is a Python distribution. With Anaconda, there is no need to worry about installing NumPy, pandas and other data science libraries, as they are already included. There is also Miniconda, a smaller installer, if you are running low on resources.

Installing NumPy using Anaconda (if it is not there for some reason):

conda install numpy

Installing NumPy using pip:

pip install numpy
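A quick way to check that the installation worked is to import the library and print its version. This is only a small sanity check, and the version printed will depend on your environment.

```python
import numpy as np

# If the import succeeds, NumPy is installed; the string shows which release.
print(np.__version__)
```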

Creating a NumPy array

To create a NumPy array, we can use the np.array() function. We pass it a list of values, either written inline or stored in an existing Python list.

import numpy as np
a = np.array([1, 3, 5])
b = [2, 4, 6]
c = np.array(b)
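The same function also accepts nested lists and an optional dtype argument. The following sketch uses the illustrative names m and f, which are not part of the examples above.

```python
import numpy as np

# A list of equal-length lists becomes a 2D array (rows x columns).
m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.shape)  # (2, 3)

# The dtype can be requested explicitly instead of being inferred.
f = np.array([1, 3, 5], dtype=np.float64)
print(f)        # [1. 3. 5.]
```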

Now, we will look at the built-in functions for creating NumPy arrays.

The first of these is np.arange(), which creates an array of evenly spaced values within the specified range. We pass start, stop and step, and it works like Python's range function, so the stop value is not included.

>>> np.arange(6) # if not specified, step is assumed to be 1.
array([0, 1, 2, 3, 4, 5])
>>> np.arange(0, 6)
array([0, 1, 2, 3, 4, 5])
>>> np.arange(0, 6, 2)
array([0, 2, 4])
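As with range, the step can be negative, and unlike range it can also be a float. A small sketch follows; the variable names are illustrative.

```python
import numpy as np

# Counting down: start is larger than stop and step is negative.
down = np.arange(10, 0, -2)
print(down)      # [10  8  6  4  2]

# A float step is allowed; stop (1) is still excluded.
quarters = np.arange(0, 1, 0.25)
print(quarters)  # [0.   0.25 0.5  0.75]
```

With non-integer steps, rounding can make the number of elements hard to predict, so np.linspace() is usually the safer choice when the element count matters.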

To create an array filled with zeros or ones, we have np.zeros() and np.ones(). We only need to pass the shape of the array we want. The elements are floats by default.

>>> np.zeros(5)
array([0., 0., 0., 0., 0.])
>>> np.ones(5)
array([1., 1., 1., 1., 1.])
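Both functions also accept a tuple for multi-dimensional shapes and a dtype argument. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# A tuple gives the shape: 2 rows and 3 columns of float zeros.
grid = np.zeros((2, 3))
print(grid.shape)  # (2, 3)

# dtype=int produces integer ones instead of the default floats.
ints = np.ones((2, 2), dtype=int)
print(ints)
# [[1 1]
#  [1 1]]
```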

To create an array of elements linearly spaced over a specified range, we have np.linspace(). We need to pass start, stop and the number of elements. The only thing to keep in mind is that, by default, the stop value is included in the range.

>>> np.linspace(0, 10, 10)
array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])
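The endpoint and retstep parameters of np.linspace() control whether stop is included and whether the spacing is returned. A short sketch with illustrative variable names:

```python
import numpy as np

# Default: stop is included, so 5 points span 0..10 in steps of 2.5.
closed = np.linspace(0, 10, 5)
print(closed)     # [ 0.   2.5  5.   7.5 10. ]

# endpoint=False leaves stop out, much like arange.
half_open = np.linspace(0, 10, 5, endpoint=False)
print(half_open)  # [0. 2. 4. 6. 8.]

# retstep=True also returns the spacing between consecutive elements.
values, step = np.linspace(0, 10, 5, retstep=True)
print(step)       # 2.5
```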

Array Attributes in NumPy

Shape of a NumPy array

ndarray.shape gives the number of elements along each dimension of the array in the form of a tuple.

>>> a = np.array([[0,1,2,3],[4,5,6,7],[8,9,10,11]])
>>> a.shape
(3, 4)

This shows that there are 3 rows and 4 columns.

We can also reshape the array in place by assigning a new tuple to shape. The new shape must hold the same total number of elements.

>>> a.shape = (4,3)
>>> a
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
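A commonly used alternative to assigning to shape is the reshape() method, which leaves the original array's shape untouched and returns a new array object. A minimal sketch:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

b = a.reshape(4, 3)      # new array object with shape (4, 3)
print(a.shape, b.shape)  # (3, 4) (4, 3)

c = a.reshape(-1, 6)     # -1 asks NumPy to infer that dimension
print(c.shape)           # (2, 6)
```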

ndarray.ndim gives the number of dimensions of the array.

>>> arr1 = np.arange(12)
>>> arr1.ndim
1
# After reshaping it along 2 dimensions
>>> arr1.shape = (3,4)
>>> arr1.ndim
2

ndarray.size gives the total number of elements in the array.

>>> arr1.size
12

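ndarray.dtype gives the data type of the elements in the array. The default integer type depends on the platform, so the output may differ from the int64 shown here.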

>>> arr1.dtype
dtype('int64')
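If we need a different data type, the astype() method returns a converted copy. In this sketch the starting dtype is set explicitly so that the result is the same on every platform.

```python
import numpy as np

arr = np.arange(12, dtype=np.int64)
as_float = arr.astype(np.float32)  # converted copy; arr itself is unchanged

print(arr.dtype, as_float.dtype)   # int64 float32
print(as_float.itemsize)           # 4 bytes per element
```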

Indexing and Slicing

Indexing

Indexing is used to select single or multiple elements from the array based on the specified index or range of indexes.

# The following example shows how to do it in 1D.
>>> a = np.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> a[3]
3

Now let’s see some examples to understand indexing in 2D.

# Examples for 2D.
>>> a = np.array([[0,1,2,3],[4,5,6,7],[8,9,10,11]])
>>> a[1]
array([4, 5, 6, 7])
# To access a specific element from a row,
# we pass the row index and the column index separated by a comma.
>>> a[1,1]
5
# We can also use 2 square brackets for row and column.
>>> a[1][1]
5
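Negative indexes count from the end, just like with Python lists. A short sketch on the same 3 x 4 array:

```python
import numpy as np

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])

last_row = a[-1]       # last row
print(last_row)        # [ 8  9 10 11]

corner = a[-1, -1]     # last row, last column
print(corner)          # 11

second_last = a[0, -2] # first row, second-to-last column
print(second_last)     # 2
```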

Slicing

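Slicing helps us get parts of the array based on the indexes we pass. We pass the start index, end index and step like this: [start : end : step]. The element at the end index is not included. If step is not specified, it defaults to 1. If start is omitted, the slice begins at index 0, and if end is omitted, the slice runs through the last element of the array. This is not the same as writing -1, which would stop before the last element.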

# Slicing in 1D.
>>> a = np.arange(12)
>>> a[6:9]
array([6, 7, 8])
>>> a[6:]
array([ 6,  7,  8,  9, 10, 11])
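The defaults for start, end and step are easiest to see side by side. This sketch also shows that leaving out end keeps the last element, while an explicit -1 stops before it.

```python
import numpy as np

a = np.arange(12)

head = a[:4]           # start omitted: begins at index 0
print(head)            # [0 1 2 3]

every_third = a[::3]   # start and end omitted, step of 3
print(every_third)     # [0 3 6 9]

to_end = a[6:]         # end omitted: includes the last element
print(to_end)          # [ 6  7  8  9 10 11]

before_last = a[6:-1]  # explicit -1: stops before the last element
print(before_last)     # [ 6  7  8  9 10]

backwards = a[::-1]    # negative step reverses the array
print(backwards[:3])   # [11 10  9]
```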

Slicing in 2D takes a start, end and step for both rows and columns, in the format [rows, columns]. If only the row slice is given, all columns are returned. To slice only the columns, pass a bare colon for the rows, for example a[:, 1:3].

# Slicing in 2D.
>>> a = np.array([[0,1,2,3],[4,5,6,7],[8,9,10,11]])
>>> a[1:] # This will give us rows 1 and 2 and all columns.
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
# Now we will get rows 1 and 2 and column 0.
>>> a[1:, 0]
array([4, 8])
# Now let's get rows 1 and 2 and columns 1 and 2.
>>> a[1:, 1:3]
array([[ 5,  6],
       [ 9, 10]])
# Now let's get alternate rows and columns.
>>> a[::2, ::2]
array([[ 0,  2],
       [ 8, 10]])
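One thing worth knowing about slices like these is that they are views onto the original data rather than copies. The sketch below shows the effect and how copy() avoids it. The names view and safe are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

view = a[1:, 1:3]         # basic slicing returns a view onto a's data
view[0, 0] = 99
print(a[1, 1])            # 99: the original array changed too

safe = a[1:, 1:3].copy()  # copy() gives independent data
safe[0, 0] = -1
print(a[1, 1])            # still 99
```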

Filtering

Filtering helps us get the values from the array that satisfy a given condition.

Let’s say we have integers from 1 to 10 and we want all the integers divisible by 3. Applying the condition to the array gives a boolean array (often called a mask), and we can use that mask to get the values we want.

>>> a = np.arange(1,11)
>>> filter_a = a % 3 == 0
# This filter array that we created can be used on the array a
# itself, and that gives us the required values.
>>> a[filter_a]
array([3, 6, 9])
# We can also directly use the filter condition on the array
# without having to store it in another array.
>>> a[a%3 == 0]
array([3, 6, 9])

Now, we will see how to filter using multiple conditions.

# We will create one more filter to use along with the 
# filter we created earlier i.e., filter_a.
>>> filter_a1 = a > 5
# Now we will use this filter along with filter_a
>>> a[filter_a & filter_a1]
array([6, 9])
# We can also use this directly like shown above.
>>> a[(a%3 == 0) & (a > 5)]
array([6, 9])
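Besides & for "and", conditions can be combined with | for "or" and inverted with ~ for "not". Python's own and, or and not keywords do not work element-wise on arrays and raise a ValueError here. A short sketch with illustrative variable names:

```python
import numpy as np

a = np.arange(1, 11)

# "or": divisible by 3, or greater than 8.
either = a[(a % 3 == 0) | (a > 8)]
print(either)       # [ 3  6  9 10]

# "not": everything that is not divisible by 3.
not_div_by_3 = a[~(a % 3 == 0)]
print(not_div_by_3) # [ 1  2  4  5  7  8 10]
```

The parentheses around each condition are needed because &, | and ~ bind more tightly than comparison operators.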

Arithmetic operations on NumPy arrays

We can perform arithmetic operations between a NumPy array and a scalar value, or between NumPy arrays whose shapes are the same or compatible under NumPy's broadcasting rules.

>>> a = np.array([1, 2, 3])
>>> b = 3
>>> a+b
array([4, 5, 6])

Here the value of b is added to every element of array a.

>>> a*b
array([3, 6, 9])

Here every element of array a is multiplied by the value of b.

We can also perform arithmetic operations with another NumPy array.

>>> c = np.array([4, 5, 6])
>>> a+c
array([5, 7, 9])
>>> a*c
array([ 4, 10, 18])
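The other arithmetic operators work the same way. A small sketch using the same two arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
c = np.array([4, 5, 6])

diff = c - a     # element-wise subtraction
print(diff)      # [3 3 3]

ratio = c / a    # true division always gives floats
print(ratio)     # [4.  2.5 2. ]

squares = a ** 2 # element-wise power
print(squares)   # [1 4 9]
```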

In the above examples, the arithmetic operation is performed element-wise. So the first element of array a is combined with the first element of array c, and so on. That is why the two NumPy arrays need to have the same shape, or shapes that NumPy can broadcast against each other.

Now, let’s understand how this works in 2D.

>>> a1 = np.array([[1,2,3],[4,5,6]])
>>> b = 3
>>> a1+b
array([[4, 5, 6],
       [7, 8, 9]])
>>> a1*b
array([[ 3,  6,  9],
       [12, 15, 18]])
# The value of b is combined with every value of the array a1.
# First, let's see our arrays a1 and c and then perform operations on them.
>>> a1
array([[1, 2, 3],
       [4, 5, 6]])
>>> c
array([4, 5, 6])

>>> a1+c
array([[ 5,  7,  9],
       [ 8, 10, 12]])
# c is combined with each row of a1: a1[:, 0] with c[0], a1[:, 1] with c[1],
# and so on. Multiplication is done the same way.
>>> a1*c
array([[ 4, 10, 18],
       [16, 25, 36]])
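This stretching of a smaller array to match a larger one is what NumPy calls broadcasting. The sketch below shows a column-shaped array being applied to every column, and a shape that cannot be broadcast. The names col and broadcast_ok are illustrative.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
col = np.array([[10], [20]])           # shape (2, 1)

# col is stretched across the 3 columns: one value per row.
print(a1 + col)
# [[11 12 13]
#  [24 25 26]]

# A length-2 array does not line up with the 3 columns of a1.
try:
    a1 + np.array([1, 2])
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("cannot broadcast:", err)
```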

Thanks for reading!

In the next article, I will explain the important functions of NumPy and how to use them. Make sure to follow me to read upcoming articles on NumPy, Pandas, SQL and all things related to data science.

Connect with me: LinkedIn
Check out my other projects: GitHub
