NumPy for aspiring data scientists

Saroj Humagain
Published in ml.careers
Dec 21, 2019

Python is a go-to programming language for many developers because it has libraries for almost everything. Before getting into data science, machine learning, or deep learning, we need to know several of these libraries. In this blog, we are going to talk about one of them: NumPy.

NumPy

NumPy is a core library for scientific computing in Python. Basically, it is a general-purpose array processing package designed to efficiently manipulate large multi-dimensional arrays of arbitrary records without sacrificing too much speed for small multi-dimensional arrays.
[via https://pypi.org/project/numpy/]

Creating NumPy arrays:

import numpy as np
a = np.array([9,4,6,7]) # one-dimensional array
b = np.array([(1,2,3,4),(9,6,5,4)]) # two-dimensional array
print(a)
print(b)
output
[9 4 6 7]
[[1 2 3 4]
 [9 6 5 4]]

But why do we prefer NumPy arrays over lists? For the following reasons:

1. Less memory: A NumPy array occupies less memory than a list holding the same values. Let's prove our point, shall we? We can compare the memory occupied by a list and by a NumPy array of the same length.

The result of that comparison is 28000 for the list and 8000 for the NumPy array. Check it yourself and compare the results.
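The two figures above can be reproduced with a sketch along these lines. It assumes 1,000 integers and a 64-bit CPython build, where a small int object typically takes 28 bytes. The names SIZE, list_bytes, and array_bytes are illustrative, and the list figure counts only the integer objects, not the list's own pointer storage.

```python
import sys
import numpy as np

SIZE = 1000  # number of elements (an assumption for this sketch)

# Python list: every element is a full Python int object
py_list = list(range(SIZE))
list_bytes = sys.getsizeof(5) * len(py_list)  # size of one int object * count

# NumPy array: elements are stored as raw 8-byte integers
np_array = np.arange(SIZE, dtype=np.int64)
array_bytes = np_array.size * np_array.itemsize  # element count * bytes per element

print(list_bytes)
print(array_bytes)
```

The dtype is set to np.int64 explicitly so that the per-element size does not depend on the platform's default integer type.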

2. Fast: NumPy is much faster than a list. To prove this, let's time the same element-wise addition done both ways.

The output is 50.9343147277832 for the list and 5.445718765258789 for NumPy. Exact timings vary from machine to machine, but the result shows NumPy is far faster than the list.
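A timing comparison of this kind might look like the sketch below. It assumes one million elements and uses time.perf_counter; the variable names are illustrative, and this is not necessarily the exact code that produced the numbers above. It prints both timings in milliseconds, list first.

```python
import time
import numpy as np

SIZE = 1_000_000  # number of elements (an assumption for this sketch)

list_a = list(range(SIZE))
list_b = list(range(SIZE))
array_a = np.arange(SIZE)
array_b = np.arange(SIZE)

# Lists: add element by element with an explicit loop (here a comprehension)
start = time.perf_counter()
list_sum = [x + y for x, y in zip(list_a, list_b)]
list_ms = (time.perf_counter() - start) * 1000

# NumPy: the + operator adds the two arrays element-wise in one call
start = time.perf_counter()
array_sum = array_a + array_b
array_ms = (time.perf_counter() - start) * 1000

print(list_ms)
print(array_ms)
```

Both approaches compute the same sums; only the time taken differs.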

3. Convenient: As the comparison above shows, with the NumPy library you can add two arrays using just the + operator, but with lists you have to write a loop to add every element.

Operations you can perform using NumPy library

  • Find the dimension of the array:
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.ndim)
output
2
  • Find the size in bytes of each element
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.itemsize)
output
8
  • Find the data type of the element
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.dtype)
output
int64
  • Find the size (total number of elements) of an array
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.size)
output
6
  • Find the shape of an array
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.shape)
output
(2, 3)
  • Reshaping
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(a.reshape(3,2))
output
[[2 3]
 [4 4]
 [5 6]]
  • Slicing: We generally use slicing to retrieve a collection of values from a NumPy array
import numpy as np
a = np.array([3, 5, 5, 3, 4, 5, 8, 23, 98])
print(a[3:6]) # slices from index 3 up to, but not including, index 6, i.e. indices 3, 4 and 5. Here 3 is the starting index and 6 is the stopping index
print(a[3:]) # slices from index 3 to the very last index
print(a[:]) # gives all elements present in the array
print(a[:5]) # slices from index 0 to index 4
print(a[-5]) # gives the fifth element counting from the end; here index -5 is the same as index 4, whose value is 4
output
[3 4 5]
[ 3  4  5  8 23 98]
[ 3  5  5  3  4  5  8 23 98]
[3 5 5 3 4]
4

In a similar way, you can slice a 2-dimensional array.
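As a minimal sketch of 2-dimensional slicing (the 3x4 array and the variable names below are made up for illustration), each axis gets its own index or slice, separated by a comma:

```python
import numpy as np

a = np.array([(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)])

first_row = a[0]        # the whole first row: 1, 2, 3, 4
third_column = a[:, 2]  # every row, column index 2: 3, 7, 11
block = a[0:2, 1:3]     # rows 0 and 1, columns 1 and 2: [[2, 3], [6, 7]]

print(first_row)
print(third_column)
print(block)
```

The part before the comma selects rows and the part after it selects columns; as with 1-dimensional slices, the stopping index is excluded.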

  • Linspace: The following example shows the purpose of linspace.
import numpy as np
print(np.linspace(1,3,5)) # gives 5 evenly spaced numbers from 1 to 3, both ends included
output
[1.  1.5 2.  2.5 3. ]
  • Finding the maximum, minimum, and sum of an array: np.max() gives the largest element in the array; similarly, np.min() gives the smallest and np.sum() gives the sum of all elements.
  • Finding the square root
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(np.sqrt(a))
output
[[1.41421356 1.73205081 2.        ]
 [2.         2.23606798 2.44948974]]
  • Finding the standard deviation
import numpy as np
a = np.array([(2,3,4),(4,5,6)])
print(np.std(a))
output
1.2909944487358056
  • Arithmetic operations: Arithmetic operations include addition, subtraction, multiplication, and division. We can perform them with the respective operators (+, -, *, /), which act element by element on the two operand arrays.
  • Concatenation: There are two common ways to concatenate arrays in NumPy: vertical stacking and horizontal stacking. We can use np.vstack((a,b)) for vertical stacking and np.hstack((a,b)) for horizontal stacking.
  • Special functions: NumPy also supports some special functions, such as sine, cosine, exponential, and natural log. We can use np.sin(x), np.cos(x), np.exp(a), and np.log(a), respectively.
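The operations in the list above that have no code of their own (maximum, minimum and sum, arithmetic, stacking, and the special functions) can be sketched together as follows. The second array b and the input values are made up for illustration.

```python
import numpy as np

a = np.array([(2, 3, 4), (4, 5, 6)])
b = np.array([(1, 1, 1), (2, 2, 2)])  # illustrative second operand

# Maximum, minimum, and sum over all elements of a
largest = np.max(a)   # 6
smallest = np.min(a)  # 2
total = np.sum(a)     # 24

# Arithmetic operators work element by element
added = a + b
subtracted = a - b
multiplied = a * b
divided = a / b  # true division, so the result holds floats

# Stacking: vstack adds rows, hstack adds columns
vertical = np.vstack((a, b))    # shape (4, 3)
horizontal = np.hstack((a, b))  # shape (2, 6)

# Special functions are also applied element by element
x = np.array([0.0, np.pi / 2])
sines = np.sin(x)
cosines = np.cos(x)
exponentials = np.exp(np.array([0.0, 1.0]))
logs = np.log(np.array([1.0, np.e]))  # natural log

print(largest, smallest, total)
print(added)
print(vertical.shape, horizontal.shape)
print(sines, logs)
```

Note that the arithmetic operators and the stacking functions expect compatible shapes: here both arrays are 2x3.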
