# NumPy in 50 Cells of Notebook

Originally posted here. All the code is in a Jupyter notebook that can be downloaded here.

In this post I will introduce the NumPy package and show how to use some of its most common features, functions and attributes. I will describe each feature with an example.

This tutorial consists of the following parts:

• What is NumPy?
• How to create NumPy arrays
• Indexing and Fancy Indexing
• Slicing
• Universal Functions (Ufuncs)
• Broadcasting
• Masking, Comparing and Sorting
• Further learning

### 1- What is NumPy?

NumPy is the basis of Pandas and many other packages. What makes NumPy such an incredible package is its data type, ndarray. ndarray stands for n-dimensional array, and at first glance it looks like a Python list. However, it is a lot faster than a regular Python list. A Python list can hold different kinds of objects at once, such as integers, strings, Booleans (True, False) and even other lists. A NumPy array, on the other hand, holds only one type of data, so NumPy does not have to check the type of every single element while it is doing computations. This feature makes NumPy a great tool for data science research and projects.
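To make the "one type of data" point concrete, here is a small sketch (the variable names are only illustrative). It shows NumPy converting a mixed Python list to a single common dtype when the array is built.

```python
import numpy as np

# A Python list can mix an int, a float and a Boolean.
mixed_list = [1, 2.5, True]

# Building an array forces one common dtype for all elements.
arr = np.array(mixed_list)
print(arr.dtype)        # float64: every element was converted to a float

# A list that contains only integers keeps an integer dtype.
ints = np.array([1, 2, 3])
print(ints.dtype.kind)  # 'i' means signed integer
```

Because every element shares the same dtype, NumPy can run a computation over the whole array without inspecting the elements one by one.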

Before we get started, let’s check the version of NumPy and Python.

`# Import NumPy`
`import numpy as np`

`# sys is imported to check the Python version`
`import sys`

`# Check the versions of NumPy and Python`
`print('NumPy version:', np.__version__)`
`print('Python version', sys.version)`
`>>> NumPy version: 1.12.1`
`>>> Python version 3.6.1 |Anaconda custom (64-bit)| (default, Mar 22 2017, 20:11:04) [MSC v.1900 64 bit (AMD64)]`

### 2- How to create NumPy arrays

There are many ways to create arrays in NumPy. Let’s take a look at a few of them here.

`# Create a one-dimensional NumPy array`
`np.array([1, 2, 3])`
`>>> array([1, 2, 3])`

`# Array of zeros`
`np.zeros(3)`
`>>> array([ 0.,  0.,  0.])`

`# Array of ones`
`np.ones(3)`
`>>> array([ 1.,  1.,  1.])`

`# Array of 3 random integers from 1 to 9 (the upper bound, 10, is exclusive)`
`np.random.randint(1, 10, 3)`
`>>> array([4, 8, 4])`

`# Create a linearly spaced array: 5 values from 0 to 10, both ends included`
`np.linspace(0, 10, 5)`
`>>> array([  0. ,   2.5,   5. ,   7.5,  10. ])`

`# Create a 2-dimensional array`
`np.array([[1, 2, 3],`
`          [4, 5, 6],`
`          [7, 8, 9]])`
`>>> array([[1, 2, 3],`
`           [4, 5, 6],`
`           [7, 8, 9]])`

`# Create a 3x4 array of random values between 0 and 1`
`np.random.random((3, 4))`
`>>> array([[ 0.85957774,  0.90323213,  0.08000421,  0.45366519],`
`           [ 0.15077925,  0.57901453,  0.72878536,  0.88573099],`
`           [ 0.51431053,  0.46266243,  0.54166614,  0.72836133]])`
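The random examples above give different values on every run. If you want reproducible output, one option is a seeded generator. This is a minimal sketch that assumes NumPy 1.17 or newer, where `np.random.default_rng` is available; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers.
rng = np.random.default_rng(seed=42)
draws = rng.integers(1, 10, size=3)        # 3 integers from 1 to 9

rng_again = np.random.default_rng(seed=42)
same_draws = rng_again.integers(1, 10, size=3)

print(np.array_equal(draws, same_draws))   # True
```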

Let’s create 1-Dimensional and 2-Dimensional arrays.

`a = np.array([1, 2, 3])`
`b = np.random.randint(0, 10, (3, 3))`

`print(a)`
`print(b)`
`>>> [1 2 3]`
`>>> [[5 9 0]`
`     [5 7 8]`
`     [0 1 9]]`

`# Add a new value to the array`
`a = np.append(a, 4)`
`a`
`>>> array([1, 2, 3, 4])`

`# Print the shape and dimension of the arrays`
`print("Shape of a:", np.shape(a))`
`print("Shape of b:", np.shape(b))`
`print('Dimension of a:', np.ndim(a))`
`print('Dimension of b:', np.ndim(b))`
`>>> Shape of a: (4,)`
`>>> Shape of b: (3, 3)`
`>>> Dimension of a: 1`
`>>> Dimension of b: 2`

`# Number of elements in the arrays`
`print('Number of elements in a:', np.size(a))`
`print('Number of elements in b:', np.size(b))`
`>>> Number of elements in a: 4`
`>>> Number of elements in b: 9`
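The same information is also available as attributes of the array itself. The sketch below uses its own small arrays (not the random ones above) so the output is predictable. It also shows that `np.append` returns a new array instead of changing the original.

```python
import numpy as np

a = np.array([1, 2, 3])
a_longer = np.append(a, 4)       # a new array; `a` itself is unchanged
print(a.size, a_longer.size)     # 3 4

b = np.arange(9).reshape(3, 3)   # the values 0..8 arranged as 3 rows x 3 columns
print(b.shape, b.ndim, b.size)   # (3, 3) 2 9
```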

### 3- Indexing and Fancy Indexing

Indexing allows us to access the elements of an array, and it is the simplest way to do so. There are other ways too, such as fancy indexing, slicing and masking.

`# a is the 1D array we created before`
`a`
`>>> array([1, 2, 3, 4])`

`# b is the 2D array we created before`
`b`
`>>> array([[5, 9, 0],`
`           [5, 7, 8],`
`           [0, 1, 9]])`

`# Get the first element of a`
`# These two print statements generate the same result`
`print(a[0])`
`print(a[-4])`
`>>> 1`
`>>> 1`

`# Get the last element of a`
`# These two print statements generate the same result`
`print(a[-1])`
`print(a[3])`
`>>> 4`
`>>> 4`

`# Get the first row of b`
`# These two print statements generate the same result`
`print(b[0])`
`print(b[0, :])`
`>>> [5 9 0]`
`>>> [5 9 0]`

`# Get the second column of b`
`b[:, 1]`
`>>> array([9, 7, 1])`

Fancy indexing allows us to pick several specific values from an array at once by passing a list or array of indices.

`# To understand fancy indexing better we will create two new arrays.`
`x = np.array(['a', 'b', 'c'])`
`y = np.array([['d', 'e', 'f'],`
`              ['g', 'h', 'k']])`

`print(x)`
`print(y)`
`>>> ['a' 'b' 'c']`
`>>> [['d' 'e' 'f']`
`     ['g' 'h' 'k']]`

`# Fancy indexing on a 1-D array`
`# Get the value c in array x`
`ind = [2]`
`x[ind]`
`>>> array(['c'], dtype='<U1')`

`# Fancy indexing on a 2-D array`
`# Get the values e and h in array y: rows [0, 1] paired with columns [1, 1]`
`# A tuple is used here because recent NumPy versions reject the nested list [[0,1],[1]] as an index`
`ind2 = ([0, 1], [1, 1])`
`y[ind2]`
`>>> array(['e', 'h'], dtype='<U1')`
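Fancy indexing becomes more useful when several positions are picked at once. The following sketch redefines the same `x` and `y` arrays so that it runs on its own; the names `picked`, `corners` and `rows_swapped` are only illustrative.

```python
import numpy as np

x = np.array(['a', 'b', 'c'])
y = np.array([['d', 'e', 'f'],
              ['g', 'h', 'k']])

# Pick the first and last elements of x in one step.
picked = x[[0, 2]]
print(picked)        # ['a' 'c']

# Row and column index lists are paired up: y[0, 0] and y[1, 2].
corners = y[[0, 1], [0, 2]]
print(corners)       # ['d' 'k']

# A single index list selects whole rows, in the order given.
rows_swapped = y[[1, 0]]
print(rows_swapped)
```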

### 4- Slicing

Slicing is the way to choose a range of values in the array. We use a colon (:) inside square brackets.

This is the structure of slicing in NumPy: [Start : Stop : Step]. The element at Start is included, the element at Stop is not.

`# Create an array of integers from 1 to 10`
`X = np.arange(1, 11, dtype=int)`
`X`
`>>> array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])`

`# Get the first two elements of X`
`X[:2]`
`>>> array([1, 2])`

`# Get the numbers 3, 4 and 5`
`X[2:5]`
`>>> array([3, 4, 5])`

`# Get odd numbers`
`X[::2]`
`>>> array([1, 3, 5, 7, 9])`

`# Get even numbers`
`X[1::2]`
`>>> array([ 2,  4,  6,  8, 10])`

`# Create a 2-D array`
`Y = np.arange(1, 10).reshape(3, 3)`
`Y`
`>>> array([[1, 2, 3],`
`           [4, 5, 6],`
`           [7, 8, 9]])`

`# Get the first and second rows`
`Y[:2, :]`
`>>> array([[1, 2, 3],`
`           [4, 5, 6]])`

`# Get the second and third columns`
`Y[:, 1:]`
`>>> array([[2, 3],`
`           [5, 6],`
`           [8, 9]])`

`# Get the elements 5 and 6`
`Y[1, 1:]`
`>>> array([5, 6])`
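Two details are worth adding to the slicing examples above: the step can be negative, and a slice is a view of the original data rather than a copy. The sketch below illustrates both; the variable names are only illustrative.

```python
import numpy as np

X = np.arange(1, 11)

# Negative indices count from the end; a negative step walks backwards.
last_three = X[-3:]
print(last_three)        # [ 8  9 10]
print(X[::-1][:3])       # [10  9  8]

# A slice is a view: writing to it changes the original array.
view = X[:3]
view[0] = 100
print(X[0])              # 100

# Use .copy() when an independent array is needed.
independent = X[:3].copy()
independent[1] = -1
print(X[1])              # 2 (unchanged)
```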

### 5- Universal Functions (Ufuncs)

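Universal functions (ufuncs) apply an operation to every element of an array, which makes them useful for mathematical and statistical work. They are usually much faster than the equivalent plain Python code because the loop over the elements runs in compiled code rather than in the Python interpreter. Strictly speaking, a few of the functions used below (np.max, np.mean and np.split) are regular NumPy functions rather than ufuncs, but they are called in the same way.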
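As a rough illustration of what a ufunc replaces, the sketch below computes the same squares with an explicit Python loop and with `np.square`. It also checks which functions are actual `np.ufunc` objects. The array `X` is redefined here so the block runs on its own.

```python
import numpy as np

X = np.arange(1, 11)

# Plain Python: visit the elements one at a time.
squares_loop = [int(value) ** 2 for value in X]

# Ufunc: one call handles the whole array.
squares_ufunc = np.square(X)

print(squares_ufunc.tolist() == squares_loop)   # True

# np.square is a ufunc object; np.mean is an ordinary function.
print(isinstance(np.square, np.ufunc))          # True
print(isinstance(np.mean, np.ufunc))            # False
```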

To see everything available in the NumPy namespace, including the ufuncs, press Tab after typing np. in a notebook. For example: np.{TAB}

`# Find the maximum element of X`
`np.max(X)`
`>>> 10`

`# Mean of the values in X`
`np.mean(X)`
`>>> 5.5`

`# Get the 4th power of each element in X`
`np.power(X, 4)`
`>>> array([    1,    16,    81,   256,   625,  1296,  2401,  4096,  6561, 10000])`

`# Trigonometric functions`
`print(np.sin(X))`
`print(np.tan(X))`
`>>> [ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155`
`      0.6569866   0.98935825  0.41211849 -0.54402111]`
`>>> [ 1.55740772 -2.18503986 -0.14254654  1.15782128 -3.38051501 -0.29100619`
`      0.87144798 -6.79971146 -0.45231566  0.64836083]`

`# sin(x)^2 + cos(x)^2 = 1, the famous trigonometric identity`
`np.square(np.sin(X)) + np.square(np.cos(X))`
`>>> array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])`

`# Array Y created before`
`Y`
`>>> array([[1, 2, 3],`
`           [4, 5, 6],`
`           [7, 8, 9]])`

`# The same rules apply for 2-D arrays`
`np.multiply(Y, 2)`
`>>> array([[ 2,  4,  6],`
`           [ 8, 10, 12],`
`           [14, 16, 18]])`

`# Split Y into 3 subarrays`
`np.split(Y, 3)`
`>>> [array([[1, 2, 3]]), array([[4, 5, 6]]), array([[7, 8, 9]])]`

### 6- Broadcasting

Broadcasting makes it possible to use ufuncs and many other operations on arrays of different shapes. NumPy follows a set of rules to decide whether two shapes can be combined. I won't go into the details here, but I will point to a tutorial below.

`# Add 5 to each element of X`
`X + 5`
`>>> array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15])`

`# Or`
`np.add(X, 5)`
`>>> array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15])`

`# Create a new array Z with shape (3, 1)`
`Z = np.arange(3)[:, np.newaxis]`
`Z`
`>>> array([[0],`
`           [1],`
`           [2]])`

`# Multiply Y and Z`
`np.multiply(Y, Z)`
`>>> array([[ 0,  0,  0],`
`           [ 4,  5,  6],`
`           [14, 16, 18]])`
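In short, NumPy compares the two shapes starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below shows a few shape combinations; the names `row`, `col` and `shapes_compatible` are only illustrative.

```python
import numpy as np

Y = np.arange(1, 10).reshape(3, 3)     # shape (3, 3)
row = np.array([10, 20, 30])           # shape (3,)
col = np.arange(3)[:, np.newaxis]      # shape (3, 1)

# (3, 3) with (3,): the row is reused for every row of Y.
print((Y + row).shape)                 # (3, 3)

# (3,) with (3, 1): both are stretched to (3, 3).
outer = row + col
print(outer)

# Shapes (3,) and (4,) are neither equal nor 1, so this fails.
try:
    np.ones(3) + np.ones(4)
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)               # False
```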

### 7- Masking, Comparing and Sorting

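Masking is another very useful feature of NumPy arrays: a comparison produces a Boolean array, and indexing with that array keeps only the elements where it is True.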

`# Create an array of 10 random integers from 1 to 4 (the upper bound, 5, is exclusive)`
`x = np.random.randint(1, 5, 10)`
`x`
`>>> array([3, 3, 4, 4, 4, 2, 1, 2, 3, 4])`

`# Create a (3, 3) array of random integers from 1 to 4`
`y = np.random.randint(1, 5, (3, 3))`
`y`
`>>> array([[2, 1, 4],`
`           [4, 3, 2],`
`           [3, 2, 4]])`

`# Sort the elements in array x`
`np.sort(x)`
`>>> array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])`

`# Sort the values within each column (axis=0 runs down the rows)`
`np.sort(y, axis=0)`
`>>> array([[2, 1, 2],`
`           [3, 2, 4],`
`           [4, 3, 4]])`

`# Sort the values within each row (axis=1 runs across the columns)`
`np.sort(y, axis=1)`
`>>> array([[1, 2, 4],`
`           [2, 3, 4],`
`           [2, 3, 4]])`

`# == , !=, < , >, >=, <= operations on arrays`
`# This returns a Boolean array`
`x > 3`
`>>> array([False, False,  True,  True,  True, False, False, False, False,  True], dtype=bool)`

`# Use the mask to get the values that satisfy the comparison`
`x[x > 3]`
`>>> array([4, 4, 4, 4])`

`# Another example`
`x[(x <= 3) & (x > 1)]`
`>>> array([3, 3, 2, 2, 3])`
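A mask can do more than select values. The sketch below uses a fixed array with the same values as `x` above, so the results are predictable. It counts matches, assigns through a mask, labels elements with `np.where`, and gets the sorting order with `np.argsort`. The names `capped`, `labels` and `order` are only illustrative.

```python
import numpy as np

x = np.array([3, 3, 4, 4, 4, 2, 1, 2, 3, 4])
mask = x > 3

# True counts as 1, so summing a mask counts the matches.
count = int(mask.sum())
print(count)                    # 4

# Assigning through a mask changes only the selected elements.
capped = x.copy()
capped[mask] = 3
print(capped.max())             # 3

# np.where picks one of two values per element based on a condition.
labels = np.where(x > 2, 'high', 'low')
print(labels[:3])

# np.argsort returns the indices that would sort the array.
order = np.argsort(x)
print(np.array_equal(x[order], np.sort(x)))   # True
```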

### 8- Further Learning

All the code in this tutorial is on my GitHub. There is also a Jupyter notebook that can be downloaded. I highly recommend retyping the code and trying it on your own.

I have a list of resources for NumPy on my blog, which includes courses, tutorials, articles, etc.