Master NumPy

Om Rastogi
Published in Analytics Vidhya
5 min read · Dec 23, 2019

Many of you are getting into machine learning, AI, computer vision and the like, and have stumbled across NumPy along the way. Others just want to learn it for numerical computation.

What is NumPy?

NumPy, short for Numerical Python, is a library for Python that is used to work with large arrays and multi-dimensional matrices. It also provides many high-level mathematical functions, from basic trigonometric functions to the far more involved Fourier transform.
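As a small illustration of that range, here is a minimal sketch that applies a trigonometric function to a whole array and then takes a discrete Fourier transform. The variable names (`angles`, `signal`, `spectrum`) are made up for this example.

```python
import numpy as np

# Trigonometric functions are applied element-wise to the whole array
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)
print(sines)

# Discrete Fourier transform of a constant signal:
# everything ends up in the zero-frequency bin
signal = np.ones(4)
spectrum = np.fft.fft(signal)
print(spectrum)
```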

Why NumPy?

Building and manipulating multi-dimensional arrays with nested Python lists is cumbersome. NumPy is also much faster than plain Python at large numerical computations. NumPy is written in both Python and C, which is one reason it is fast. The main reasons, though, are its fixed element type and its contiguous memory layout. In layman's terms, a single value in a NumPy array carries far less baggage than in a list: in a list, every value is a full Python object with its own size, reference count, object type and object value.

Contiguous memory mapping
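The fixed type and contiguous layout can be inspected directly. The sketch below is only an illustration, assuming CPython and a 1000-element `int64` array; the exact size of a Python integer object depends on the interpreter.

```python
import sys
import numpy as np

py_list = list(range(1000))
arr = np.arange(1000, dtype=np.int64)

# Every element of the array has the same fixed type and size
print(arr.dtype, arr.itemsize)

# Total size of the raw data: number of elements * bytes per element
print(arr.nbytes)

# The data sits in one contiguous block of memory
print(arr.flags['C_CONTIGUOUS'])

# A single Python int is a full object and needs more than 8 bytes
print(sys.getsizeof(py_list[-1]))
```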

You may not appreciate NumPy if you are a beginner. But when you are working on something and things get messy, with a task like processing an array of 44 thousand elements through multiple stages, NumPy comes in really handy. Suddenly it is the best library you have known so far.

You can install NumPy with pip. Make sure you have an internet connection, then do this:

1. Open cmd

2. Run pip install numpy

(I have already installed NumPy, so pip reports that the requirement is already satisfied.)
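A quick way to confirm that the installation worked is to import the library and print its version. This is a minimal check, and the version string you see will depend on what pip installed.

```python
import numpy as np

# If the import succeeds, NumPy is installed; the version string varies
version = np.__version__
print(version)
```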

How to master NumPy in 45 Mins?

Check this YouTube playlist on NumPy.

To master anything, practice is the key. However, to learn in the least time, you need to learn smart. Here are two tried and tested strategies to learn very fast:

  1. Keep a cheat sheet. There are many functions and operations in this library, and memorizing them all is very inefficient, so keep a cheat sheet at hand. Here is a link to an awesome cheat sheet by DataCamp: https://www.datacamp.com/community/blog/python-numpy-cheat-sheet
Cheat sheet by DataCamp
  2. The second strategy is smart practice. No one can master anything without practice, but smart practice decreases the time and effort. Running the main functions takes about 5 to 10 minutes (you just have to copy and paste).

a. First install NumPy and try most of the operations in the cheat sheet mentioned above, so that you get an idea of the basic functions of NumPy.

b. Once you have that basic experience, go on to solve problems. I have hand-picked some problems for you, going from easy to difficult. These problems come with solutions, so you can go through them. These questions take only about 30 to 35 minutes.

Exercises:

1. Create a Boolean array.

import numpy as np
arr = np.ones((3,3), dtype=bool)
print(arr)

2. Create an array with all elements equal to zero.

import numpy as np
arr = np.zeros((2,2,2))
print(arr)

3. Write a NumPy program to create a 3x3 identity matrix

import numpy as np
x = np.eye(3) #'eye' sounds similar to 'I'
print (x)

4. Write a NumPy program to convert numpy dtypes to native python types.

import numpy as np
x = np.float32(0) # This creates an np type float32
print (type(x))
pyval = x.item()
print (type(pyval))
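For a whole array rather than a single scalar, `tolist()` performs the same conversion element by element. A minimal sketch with a made-up `float32` array:

```python
import numpy as np

arr = np.array([1.5, 2.5], dtype=np.float32)

# tolist() returns a plain Python list of native Python floats
py_list = arr.tolist()

print(type(arr[0]))      # NumPy scalar type
print(type(py_list[0]))  # native Python type
```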

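5. Write a NumPy program to reverse an array.

import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9]) # make it a NumPy array, not a plain list
x = x[::-1] # a slice with step -1 walks the array backwards
print(x)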

6. Write a NumPy program to compute the sum of all elements, the sum of each column and the sum of each row of a given array.

import numpy as np
x = np.array([[0,1],[2,3]])
print("Original array:",x)
print("Sum of all elements:",np.sum(x))
print("Sum of each column:",np.sum(x, axis=0))
print("Sum of each row:",np.sum(x, axis=1))

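7. How to extract a particular column from a 1D array of records?

import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data' # URL of the iris dataset (CSV)
iris_1d = np.genfromtxt(url, delimiter=',', dtype=None) # 1D array, one record per row
print(iris_1d.shape)
# Extract the fifth column (the species) from every record
species = np.array([row[4] for row in iris_1d])
print(species[:5])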

8. Write a NumPy program to compute the cross product of two given vectors.

import numpy as np
p = [[1, 0], [0, 1]]
q = [[1, 2], [3, 4]]
result1 = np.cross(p, q)
result2 = np.cross(q, p)
print("cross product of the said two vectors(p, q):",result1)
print("cross product of the said two vectors(q, p):",result2)

9. Write a NumPy program to compute the inverse of a given matrix.

import numpy as np
m = np.array([[1,2],[3,4]])
print("Original matrix:",m)
result = np.linalg.inv(m)
print("Inverse of the said matrix:",result)
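One way to sanity-check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. The sketch below uses `np.allclose` rather than `==` because floating-point rounding usually leaves tiny errors.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
inv = np.linalg.inv(m)

# m @ inv should be the 2x2 identity matrix, up to rounding error
product = m @ inv
is_identity = np.allclose(product, np.eye(2))
print(is_identity)
```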

10. Write a NumPy program to sort a given 2D array along the first axis, along the last axis and as a flattened array.

import numpy as np
a = np.array([[10,40],[30,20]])
print("Original array:")
print(a)
print("Sort the array along the first axis:")
print(np.sort(a, axis=0))
print("Sort the array along the last axis:")
print(np.sort(a))
print("Sort the flattened array:")
print(np.sort(a, axis=None))

11. How to do probabilistic sampling in numpy?

import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Solution
# Get the species column
species = iris[:, 4]
# Approach 1: Generate probabilistically
np.random.seed(100)
a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
species_out = np.random.choice(a, 150, p=[0.5, 0.25, 0.25])
# Approach 2: Probabilistic sampling (preferred)
np.random.seed(100)
probs = np.r_[np.linspace(0, 0.500, num=50), np.linspace(0.501, .750, num=50), np.linspace(.751, 1.0, num=50)]
index = np.searchsorted(probs, np.random.random(150))
species_out = species[index]
print(np.unique(species_out, return_counts=True))

12. How to find the most frequent value in a numpy array?

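import numpy as np
# lst is assumed to be an existing 2D NumPy array; column 2 is examined here
vals, counts = np.unique(lst[:, 2], return_counts=True)
print(vals[np.argmax(counts)]) # the value with the highest count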
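The snippet for exercise 12 relies on an array that is defined elsewhere. Here is a self-contained sketch of the same idea on a small made-up 1D array: `np.unique` returns the distinct values with their counts, and `np.argmax` picks the position of the largest count. In case of a tie, this returns the smallest of the tied values.

```python
import numpy as np

values = np.array([3, 1, 3, 2, 1, 3])  # made-up sample data

# Distinct values (sorted) and how often each one occurs
vals, counts = np.unique(values, return_counts=True)

# The value whose count is the largest
most_frequent = vals[np.argmax(counts)]
print(most_frequent)
```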

13. How to replace all values greater than a given value with a given cutoff?

import numpy as np
np.random.seed(100)
a = np.random.uniform(1,50, 20)
# Solution 1: Using np.clip
print(np.clip(a, a_min=10, a_max=30))
# Solution 2: Using np.where
print(np.where(a < 10, 10, np.where(a > 30, 30, a)))

14. How to compute the min-by-max for each row of a 2D NumPy array?

import numpy as np
np.random.seed(100) # seed() returns None, so do not assign its result
a = np.random.randint(1,10, [5,3])
print(np.apply_along_axis(lambda x: np.min(x)/np.max(x), arr=a, axis=1))
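The same ratio can also be computed without a Python-level lambda by taking the row-wise minimum and maximum directly. This is an alternative sketch, not the solution given above, and it checks that both approaches agree.

```python
import numpy as np

np.random.seed(100)
a = np.random.randint(1, 10, [5, 3])

# Row-wise min and max, then an element-wise division
ratio_vectorized = a.min(axis=1) / a.max(axis=1)

# The apply_along_axis version from the exercise, for comparison
ratio_apply = np.apply_along_axis(lambda x: np.min(x) / np.max(x), arr=a, axis=1)

print(np.allclose(ratio_vectorized, ratio_apply))
```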

15. Write a NumPy program to create a 2d array with 1 on the border and 0 inside of size n*n.

import numpy as np
n=4
x = np.ones((n,n))
print("Original array:",x)
print("1 on the border and 0 inside in the array")
x[1:-1,1:-1] = 0
print(x)
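Another way to get the same result, sketched here as an alternative, is to start from the inner block of zeros and surround it with a border of ones using `np.pad`. This assumes `n` is at least 2.

```python
import numpy as np

n = 4

# Inner (n-2) x (n-2) block of zeros
inner = np.zeros((n - 2, n - 2))

# Add a one-element-wide border filled with 1 on every side
padded = np.pad(inner, pad_width=1, mode='constant', constant_values=1)
print(padded)
```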
