# 100 Days of DevOps: Day 94 - Introduction to NumPy for Data Analysis

May 16 · 6 min read

Welcome to Day 94 of 100 Days of DevOps. Today’s focus is an introduction to NumPy for data analysis.

NumPy is a linear algebra library for Python. It is important because most libraries in the PyData ecosystem rely on NumPy as a main building block.

Installing NumPy

```
# pip2 install numpy
Collecting numpy
  Using cached numpy-1.12.1-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: numpy
Successfully installed numpy-1.12.1
```

It’s highly recommended to install Python using the Anaconda distribution, so that all underlying dependencies (such as linear algebra libraries) stay in sync when NumPy is installed with conda.

`conda install numpy`

NumPy arrays are the main reason we use NumPy, and they most commonly come in two flavors:

• Vectors (1-D arrays)
• Matrices (2-D arrays)

```python
# 1-D array
>>> test = [1, 2, 3]
>>> import numpy as np
# We got the array
>>> np.array(test)
array([1, 2, 3])
>>> arr = np.array(test)
```

Let’s take a look at a 2-D array

```python
>>> test1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> test1
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> np.array(test1)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
```

But the most common way to generate a NumPy array is the arange function (similar to range in Python).

```python
# Similar to range(start, stop, step): stop is not included, and indexing starts at zero
>>> np.arange(0, 10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```
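The step argument mentioned in the comment is not exercised above, so here is a minimal sketch of it. The variable names are made up for illustration.

```python
import numpy as np

# Every second number from 0 up to (but not including) 10
evens = np.arange(0, 10, 2)
# A single argument is treated as stop; start defaults to 0
first_five = np.arange(5)
# A negative step counts downwards; stop (0) is still excluded
countdown = np.arange(5, 0, -1)

print(evens)       # [0 2 4 6 8]
print(first_five)  # [0 1 2 3 4]
print(countdown)   # [5 4 3 2 1]
```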

If we need an array pre-filled with a specific value, such as zeros:

```python
>>> np.zeros(3)
array([ 0.,  0.,  0.])
# We pass a tuple where the first value is the number of rows and the second the number of columns
>>> np.zeros((3, 2))
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
```

Similarly for ones

```python
>>> np.ones(2)
array([ 1.,  1.])
>>> np.ones((2, 2))
array([[ 1.,  1.],
       [ 1.,  1.]])
```

Now let’s take a look at linspace

```python
# Returns 9 evenly spaced points between 0 and 3, both endpoints included (the result is a 1-D vector)
>>> np.linspace(0, 3, 9)
array([ 0.   ,  0.375,  0.75 ,  1.125,  1.5  ,  1.875,  2.25 ,  2.625,  3.   ])
>>> np.linspace(0, 10, 3)
array([  0.,   5.,  10.])
```
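To see where the spacing comes from, this small sketch asks linspace to return the step it computed, (stop - start) / (num - 1), and compares the result with arange, which excludes the stop value. The variable names are illustrative only.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 3, 9, retstep=True)
print(step)         # 0.375, i.e. (3 - 0) / (9 - 1)
print(points[-1])   # 3.0, the stop value is included

# arange with the same step stops before 3, so it yields one point fewer
half_open = np.arange(0, 3, step)
print(len(points), len(half_open))  # 9 8
```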

Let’s create an identity matrix (a 2-D square matrix, where the number of rows equals the number of columns, with ones on the diagonal and zeros everywhere else).
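A minimal sketch of this step, assuming `np.eye` (NumPy's identity-matrix constructor) is the function intended here. The size 4 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# 4x4 identity matrix: ones on the diagonal, zeros elsewhere (float by default)
identity = np.eye(4)
print(identity)

# Multiplying any 4x4 matrix by the identity leaves it unchanged
m = np.arange(16).reshape(4, 4)
unchanged = np.dot(m, identity)
print((unchanged == m).all())  # True
```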

To create an array of random numbers

```python
# 1-D: random samples uniformly distributed over [0, 1)
>>> np.random.rand(3)
array([ 0.87169008,  0.51446765,  0.65027072])
# 2-D
>>> np.random.rand(3, 3)
array([[ 0.4217015 ,  0.86314141,  0.14976093],
       [ 0.4348433 ,  0.68860693,  0.88575823],
       [ 0.56613179,  0.56030069,  0.51783999]])
```

Now if I want random integers

```python
# Returns one random integer from 1 up to, but not including, 50
>>> np.random.randint(1, 50)
27
# If we need 10 random integers
>>> np.random.randint(1, 50, 10)
array([39, 34, 30, 21, 18, 30,  3,  6, 37, 11])
```
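Random output differs on every run, which makes examples hard to reproduce. One common approach, sketched here with an arbitrary seed of 42, is to seed the generator first; the same sketch also checks that the upper bound is excluded.

```python
import numpy as np

# Seeding makes the "random" sequence repeatable (42 is an arbitrary choice)
np.random.seed(42)
first = np.random.randint(1, 50, 10)

np.random.seed(42)
second = np.random.randint(1, 50, 10)

print((first == second).all())             # True: same seed, same numbers
print(first.min() >= 1, first.max() < 50)  # True True: 50 is never drawn
```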

We can reshape our existing array

```python
>>> np.arange(25)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])
>>> arr = np.arange(25)
>>> arr
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])
>>> arr.reshape(5, 5)
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
```
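Reshape only works when the new shape holds exactly as many elements as the array. The sketch below (with a 12-element array chosen for illustration) shows the failure case, plus the `-1` shorthand that lets NumPy work out one dimension for us.

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension: 12 elements / 3 rows = 4 columns
grid = arr.reshape(3, -1)
print(grid.shape)  # (3, 4)

# A 5x5 shape needs 25 elements, so reshaping 12 elements raises ValueError
reshape_failed = False
try:
    arr.reshape(5, 5)
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```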

Let’s take a look at some other methods

```python
>>> np.random.randint(0, 50, 10)
array([10, 40, 18, 30,  6, 40, 49, 23,  3, 18])
>>> ranint = np.random.randint(0, 50, 10)
>>> ranint
array([18, 49,  6, 28, 30, 10, 46, 11, 40, 16])
# Maximum value of the array
>>> ranint.max()
49
# Minimum value
>>> ranint.min()
6
>>> ranint
array([18, 49,  6, 28, 30, 10, 46, 11, 40, 16])
# Index position of the minimum and of the maximum
>>> ranint.argmin()
2
>>> ranint.argmax()
1
```
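The positions returned by argmin and argmax can be fed straight back into the array as indexes. Here is a small sketch that reuses the values shown above as a fixed array, so the result is repeatable; `sum` and `mean` are included as two further methods of the same kind.

```python
import numpy as np

# Same values as the random sample above, hard-coded for repeatability
ranint = np.array([18, 49, 6, 28, 30, 10, 46, 11, 40, 16])

max_pos = ranint.argmax()
min_pos = ranint.argmin()
print(max_pos, ranint[max_pos])  # 1 49
print(min_pos, ranint[min_pos])  # 2 6

# Other aggregate methods follow the same pattern
total = ranint.sum()
average = ranint.mean()
print(total, average)  # 254 25.4
```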

To find out the shape of an array

```python
>>> arr.shape
(25,)
>>> arr.reshape(5, 5)
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
>>> arr = arr.reshape(5, 5)
>>> arr.shape
(5, 5)
```

To find out the datatype

```python
>>> arr.dtype
dtype('int64')
```
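The datatype can also be chosen explicitly or converted afterwards. A short sketch, assuming 64-bit floats are wanted; note that the default integer type (int64 above) can differ between platforms.

```python
import numpy as np

ints = np.arange(5)                      # default integer type of the platform
floats = np.arange(5, dtype=np.float64)  # request floats at creation time
converted = ints.astype(np.float64)      # astype returns a converted copy

print(ints.dtype, floats.dtype, converted.dtype)
print(converted)  # [0. 1. 2. 3. 4.]
```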

Indexing NumPy arrays

```python
>>> import numpy as np
>>> arr = np.arange(0, 11)
>>> arr
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> arr[0]
0
>>> arr[0:4]
array([0, 1, 2, 3])
```

NumPy arrays differ from Python lists in their ability to broadcast: a single value can be assigned to a whole slice at once.

```python
>>> arr[:] = 20
>>> arr
array([20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20])
# Reset the array, then try to slice it
>>> arr = np.arange(0, 11)
>>> arr1 = arr[0:5]
>>> arr1
array([0, 1, 2, 3, 4])
>>> arr1[:] = 50
>>> arr1
array([50, 50, 50, 50, 50])
# Side effect: the original array changes too (the data is not copied; the slice is just a view of the original array)
>>> arr
array([50, 50, 50, 50, 50,  5,  6,  7,  8,  9, 10])
# To avoid this, copy the array first and then broadcast on the copy
>>> arr2 = arr.copy()
>>> arr2
array([50, 50, 50, 50, 50,  5,  6,  7,  8,  9, 10])
>>> arr2[6:10] = 100
>>> arr2
array([ 50,  50,  50,  50,  50,   5, 100, 100, 100, 100,  10])
>>> arr
array([50, 50, 50, 50, 50,  5,  6,  7,  8,  9, 10])
```
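If you are unsure whether two arrays share data, `np.shares_memory` can tell you. A minimal sketch of the view-versus-copy behaviour described above; the names `view` and `copied` are illustrative.

```python
import numpy as np

arr = np.arange(0, 11)
view = arr[0:5]            # a slice is a view onto the same data
copied = arr[0:5].copy()   # an explicit copy owns its own data

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, copied))  # False
print(view.base is arr)               # True: the view points back at arr

view[:] = 50     # writes through to arr
copied[:] = -1   # leaves arr untouched
print(arr)       # [50 50 50 50 50  5  6  7  8  9 10]
```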

Indexing 2-D arrays (matrices)

```python
>>> arr = ([1, 2, 3], [4, 5, 6], [7, 8, 9])
>>> arr
([1, 2, 3], [4, 5, 6], [7, 8, 9])
>>> arr1 = np.array(arr)
>>> arr1
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> arr[1]
[4, 5, 6]
# To grab 5 (indexing starts at zero)
>>> arr1[1][1]
5
# Shorter notation
>>> arr1[1, 1]
5
```

To grab a block of elements from a 2-D array

```python
>>> arr1
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
# Rows 0 and 1 (everything before row index 2), and columns from index 1 to the end
>>> arr1[:2, 1:]
array([[2, 3],
       [5, 6]])
```
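The `[rows, columns]` notation also covers whole rows, whole columns and other sub-blocks. A short sketch on the same 3x3 matrix, with illustrative variable names:

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

second_row = mat[1]          # a single row index returns that row
second_col = mat[:, 1]       # ":" means every row; 1 picks the column
bottom_right = mat[1:, 1:]   # rows 1-2 and columns 1-2

print(second_row)    # [4 5 6]
print(second_col)    # [2 5 8]
print(bottom_right)
```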

Conditional selection: comparing an array with a value returns an array of booleans

```python
>>> arr = np.arange(1, 11)
>>> arr
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> arr > 5
array([False, False, False, False, False,  True,  True,  True,  True,  True], dtype=bool)
# We can save this boolean array and use it to select elements
>>> my_arr = arr > 5
>>> my_arr
array([False, False, False, False, False,  True,  True,  True,  True,  True], dtype=bool)
>>> arr[my_arr]
array([ 6,  7,  8,  9, 10])
# Or, more simply
>>> arr[arr > 5]
array([ 6,  7,  8,  9, 10])
>>> arr[arr < 5]
array([1, 2, 3, 4])
```
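Conditions can be combined as well. The sketch below uses the element-wise operators `&` (and), `|` (or) and `~` (not); each condition needs its own parentheses, because these operators bind more tightly than the comparisons.

```python
import numpy as np

arr = np.arange(1, 11)

middle = arr[(arr > 3) & (arr < 8)]   # both conditions must hold
edges = arr[(arr < 3) | (arr > 8)]    # either condition may hold
odd = arr[~(arr % 2 == 0)]            # negate the "is even" mask

print(middle)  # [4 5 6 7]
print(edges)   # [ 1  2  9 10]
print(odd)     # [1 3 5 7 9]
```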

Operations

```python
# The operators are the same as in plain Python, but they are applied element by element
>>> arr = np.arange(0, 10)
>>> arr
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Addition
>>> arr + arr
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
# Subtraction
>>> arr - arr
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
# Multiplication
>>> arr * arr
array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])
# Broadcast (adds, subtracts or multiplies 100 with each element)
>>> arr + 100
array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])
>>> arr - 100
array([-100,  -99,  -98,  -97,  -96,  -95,  -94,  -93,  -92,  -91])
>>> arr * 100
array([  0, 100, 200, 300, 400, 500, 600, 700, 800, 900])
```
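It is worth seeing how this differs from a Python list, where `+` and `*` concatenate and repeat instead of doing arithmetic. A small sketch with made-up values, which also broadcasts a 1-D array across the rows of a 2-D array:

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

list_doubled = values * 2    # list repetition: [1, 2, 3, 1, 2, 3]
arr_doubled = arr * 2        # element-wise: [2 4 6]
list_sum = values + values   # list concatenation: [1, 2, 3, 1, 2, 3]
arr_sum = arr + arr          # element-wise: [2 4 6]

# Broadcasting also works across dimensions: arr is added to each row
grid = np.zeros((2, 3)) + arr
print(grid)
```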

In plain Python, dividing by zero raises a ZeroDivisionError exception

```python
>>> 0/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
# OR
>>> 1/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
```

In NumPy, dividing zero by zero does not raise an exception; it issues a RuntimeWarning and returns nan (not a number)

```python
# No exception is raised, only a warning
>>> arr/arr
__main__:1: RuntimeWarning: invalid value encountered in true_divide
array([ nan,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.])
```

and dividing one by zero returns infinity (inf)

```python
>>> 1/arr
array([        inf,  1.        ,  0.5       ,  0.33333333,  0.25      ,
        0.2       ,  0.16666667,  0.14285714,  0.125     ,  0.11111111])
```
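In a real analysis you usually want to control these warnings and clean up the nan/inf values. One possible approach, sketched here under the assumption of Python 3 (where `/` is true division), uses `np.errstate` together with `np.isnan`, `np.isinf` and `np.isfinite`. Replacing non-finite values with 0.0 is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.arange(0, 5)

# Silence the warnings inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr    # 0/0 gives nan
    inverse = 1 / arr    # 1/0 gives inf

print(np.isnan(ratio))    # True only at index 0
print(np.isinf(inverse))  # True only at index 0

# Replace non-finite entries before further analysis
cleaned = np.where(np.isfinite(inverse), inverse, 0.0)
print(cleaned)

# Or ask NumPy to raise an exception, as plain Python would
raised = False
try:
    with np.errstate(divide="raise"):
        1 / arr
except FloatingPointError:
    raised = True
print(raised)  # True
```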

Universal array functions

```python
# Square root
>>> np.sqrt(arr)
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])
# Exponential
>>> np.exp(arr)
array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,
         8.10308393e+03])
# Maximum
>>> np.max(arr)
9
# Minimum
>>> np.min(arr)
0
# Natural logarithm
>>> np.log(arr)
__main__:1: RuntimeWarning: divide by zero encountered in log
array([       -inf,  0.        ,  0.69314718,  1.09861229,  1.38629436,
        1.60943791,  1.79175947,  1.94591015,  2.07944154,  2.19722458])
```
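These functions work on the whole array at once, so no explicit loop is needed. The sketch below compares `np.sqrt` with a loop over `math.sqrt`, and shows the optional `axis` argument of `np.max` on a small 2-D array. The sample values are made up.

```python
import math
import numpy as np

values = [0, 1, 4, 9]

# Plain Python: one call per element
from_loop = [math.sqrt(v) for v in values]
# NumPy: one call for the whole array
from_ufunc = np.sqrt(np.array(values))
print(from_loop)   # [0.0, 1.0, 2.0, 3.0]
print(from_ufunc)  # [0. 1. 2. 3.]

# On a 2-D array, axis selects the direction of the aggregation
grid = np.arange(6).reshape(2, 3)    # [[0 1 2], [3 4 5]]
col_max = np.max(grid, axis=0)       # maximum of each column
row_max = np.max(grid, axis=1)       # maximum of each row
print(col_max, row_max)              # [3 4 5] [2 5]
```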

I look forward to having you join this journey: spend at least an hour every day for the next 100 days on DevOps work, and post your progress using any of the mediums below.
