# NumPy in 50 Cells of Notebook

*Originally posted **here**. All the code is in a Jupyter notebook that can be downloaded **here**.*

In this post I will introduce the NumPy package and show how to use some of its most common features, functions and attributes. I will describe each feature with an example.

This tutorial consists of the following parts:

- What is NumPy?
- How to create NumPy arrays
- Indexing and Fancy Indexing
- Slicing
- Universal Functions (Ufuncs)
- Broadcasting
- Masking, Comparing and Sorting
- Further learning

### 1- What is NumPy?

NumPy is the foundation of Pandas and many other packages. What makes NumPy so powerful is its core data type, the ndarray. ndarray stands for n-dimensional array, which looks much like a Python list but is a lot faster. A Python list can hold elements of different types at once, such as integers, strings, Booleans (True, False) and even other lists. A NumPy array, on the other hand, holds only one type of data, so NumPy doesn’t have to check the type of every single element while it is doing computations. This makes NumPy a great tool for data science research and projects.
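To make the single-type rule concrete, here is a minimal sketch. The variable names are illustrative and not part of the original notebook. It shows that a list keeps each element's own type, while NumPy converts all elements of an array to one common dtype.

```python
import numpy as np

# A Python list keeps each element's own type.
mixed_list = [1, 2.5, True]
list_types = [type(item).__name__ for item in mixed_list]
print(list_types)  # ['int', 'float', 'bool']

# A NumPy array converts every element to one common dtype.
int_array = np.array([1, 2, 3])
float_array = np.array([1, 2.5, 3])  # the integers are upcast to float

print(int_array.dtype.kind)  # 'i' (an integer type; the width depends on the platform)
print(float_array.dtype)     # float64
```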

Before we get started, let’s check the version of NumPy and Python.

#import NumPy

import numpy as np

# sys was imported to check the python version

import sys

# check the version of python and NumPy

print('NumPy version:', np.__version__)

print('Python version',sys.version)

>>> NumPy version: 1.12.1

>>> Python version 3.6.1 |Anaconda custom (64-bit)| (default, Mar 22 2017, 20:11:04) [MSC v.1900 64 bit (AMD64)]

### 2- How to create NumPy arrays

There are many ways to create arrays in NumPy. Let’s take a look at a few of them here.

# Create one dimensional NumPy array

np.array([1, 2, 3])

>>> array([1, 2, 3])

# Array of zeros

np.zeros(3)

>>> array([ 0., 0., 0.])

# Array of 1s

np.ones(3)

>>> array([ 1., 1., 1.])

# Array of 3 random integers from 1 to 9 (the upper bound, 10, is exclusive)

np.random.randint(1,10, 3)

>>> array([4, 8, 4])

# Create linearly spaced array

np.linspace(0, 10, 5 )

>>> array([ 0. , 2.5, 5. , 7.5, 10. ])

# Create 2-dimensional array

np.array([[1,2,3],

[4,5,6],

[7,8,9]])

>>> array([[1, 2, 3],

[4, 5, 6],

[7, 8, 9]])

# Create a 3x4 array of random values in the interval [0, 1)

np.random.random((3,4))

>>> array([[ 0.85957774, 0.90323213, 0.08000421, 0.45366519],

[ 0.15077925, 0.57901453, 0.72878536, 0.88573099],

[ 0.51431053, 0.46266243, 0.54166614, 0.72836133]])
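Because these functions draw random numbers, the outputs shown in this post will differ from what you get on your own machine. The sketch below is an addition to the original notebook. It assumes the legacy `np.random.seed` interface to make the draws repeatable, and it checks that the upper bound of `np.random.randint` is exclusive.

```python
import numpy as np

# Seeding the legacy global generator makes the "random" values repeatable.
np.random.seed(42)
first_draw = np.random.randint(1, 10, 3)

np.random.seed(42)
second_draw = np.random.randint(1, 10, 3)

# Same seed, same numbers.
print(np.array_equal(first_draw, second_draw))  # True

# The upper bound of randint is exclusive: values fall in 1..9.
many_draws = np.random.randint(1, 10, 1000)
print(many_draws.min() >= 1, many_draws.max() <= 9)  # True True
```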

Let’s create 1-Dimensional and 2-Dimensional arrays.

a = np.array([1,2,3])

b = np.random.randint(0,10, (3,3))

print(a)

print(b)

>>> [1 2 3]

>>> [[5 9 0]

[5 7 8]

[0 1 9]]

# Add a new value to the array (np.append returns a new array)

a = np.append(a, 4)

a

>>> array([1, 2, 3, 4])

# Print the shape and dimension of arrays

print("Shape of a:", np.shape(a))

print("Shape of b:", np.shape(b))

print('Dimension of a:', np.ndim(a))

print('Dimension of b:', np.ndim(b))

>>> Shape of a: (4,)

>>> Shape of b: (3, 3)

>>> Dimension of a: 1

>>> Dimension of b: 2

# Number of elements in the arrays

print('Number of elements in a:', np.size(a))

print('Number of elements in b:', np.size(b))

>>> Number of elements in a: 4

>>> Number of elements in b: 9
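Two details from the cells above are worth spelling out, sketched here with illustrative names that are not in the original notebook: `np.append` returns a new array instead of changing the one passed in, and the shape, dimension and size are also available as attributes of the array itself.

```python
import numpy as np

a = np.array([1, 2, 3])

# np.append returns a new array; the original is left unchanged.
a_extended = np.append(a, 4)
print(a.tolist())           # [1, 2, 3]
print(a_extended.tolist())  # [1, 2, 3, 4]

# The same information is available as attributes of the array.
print(a_extended.shape)  # (4,)
print(a_extended.ndim)   # 1
print(a_extended.size)   # 4
```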

### 3- Indexing and Fancy Indexing

Indexing is the simplest way to access the elements of an array. There are other ways too, such as fancy indexing, slicing and masking.

# a is the 1D array we created before

a

>>> array([1, 2, 3, 4])

# b is the 2D array we created before (its values are random, so they change on every run)

b

>>> array([[4, 7, 4],

[7, 1, 0],

[9, 8, 6]])

# Get the first element of a

# These two print statements generate the same result

print(a[0])

print(a[-4])

>>> 1

>>> 1

# Get the last element of a

# These two print statements generate the same result

print(a[-1])

print(a[3])

>>> 4

>>> 4

# Get the first row of b

# These two print statements generate the same result

print(b[0])

print(b[0,:])

>>> [4 7 4]

>>> [4 7 4]

# Get the second column of b

b[:,1]

>>> array([7, 1, 8])

Fancy indexing allows us to pick several specific values from an array at once by passing a list (or array) of indices instead of a single index.

# To understand the fancy indexing better we will create two new arrays.

x = np.array(['a', 'b', 'c'])

y = np.array([['d', 'e', 'f'],

['g', 'h', 'k']])

print(x)

print(y)

>>> ['a' 'b' 'c']

>>> [['d' 'e' 'f']

['g' 'h' 'k']]

# Fancy indexing on 1-D array

# Get the value of c in array x

ind = [2]

x[ind]

>>> array(['c'], dtype='<U1')

# Fancy indexing on 2D array

# Get the values e, h in array y

# Pass a tuple of (row indices, column indices); a nested list fails in recent NumPy versions

ind2 = ([0, 1], [1, 1])

y[ind2]

>>> array(['e', 'h'], dtype='<U1')
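As a minimal sketch of how the index lists are paired up (the names `rows`, `cols`, `picked_1d` and `picked_2d` are illustrative), the example below picks two elements from the 1-D array and then the values e and h from the 2-D array, one (row, column) pair at a time.

```python
import numpy as np

x = np.array(['a', 'b', 'c'])
y = np.array([['d', 'e', 'f'],
              ['g', 'h', 'k']])

# 1-D: a list of positions returns the elements at those positions.
picked_1d = x[[0, 2]]
print(picked_1d.tolist())  # ['a', 'c']

# 2-D: one list of row indices and one list of column indices,
# paired element by element: (0, 1) -> 'e' and (1, 1) -> 'h'.
rows = [0, 1]
cols = [1, 1]
picked_2d = y[rows, cols]
print(picked_2d.tolist())  # ['e', 'h']
```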

### 4- Slicing

Slicing is the way to choose a range of values in the array. We use a colon (:) in square brackets.

The general form of a slice in NumPy is **[start : stop : step]**. The start index is included, the stop index is excluded, and any of the three parts can be left out.

# Create an array of integers from 1 to 10

X = np.arange(1, 11, dtype=int)

X

>>> array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Get the first two elements of X

X[:2]

>>> array([1, 2])

# Get the numbers 3, 4 and 5

X[2:5]

>>> array([3, 4, 5])

# Get odd numbers

X[::2]

>>> array([1, 3, 5, 7, 9])

# Get even numbers

X[1::2]

>>> array([ 2, 4, 6, 8, 10])

# Create 2-D array

Y= np.arange(1,10).reshape(3,3)

Y

>>> array([[1, 2, 3],

[4, 5, 6],

[7, 8, 9]])

# Get the first and second row

Y[:2,:]

>>> array([[1, 2, 3],

[4, 5, 6]])

# Get the second and third column

Y[:, 1:]

>>> array([[2, 3],

[5, 6],

[8, 9]])

# Get the elements 5 and 6

Y[1,1:]

>>> array([5, 6])
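One behaviour that the cells above do not show is that a NumPy slice is a view of the original data, not a copy. The sketch below, with illustrative variable names, demonstrates this along with a negative step.

```python
import numpy as np

X = np.arange(1, 11)

# A negative step walks the array backwards.
reversed_values = X[::-1].tolist()
print(reversed_values)  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# A slice is a view: writing to it changes the original array.
first_three = X[:3]
first_three[0] = 100
print(X[0])  # 100

# Use .copy() when an independent array is needed.
independent = X[:3].copy()
independent[1] = -1
print(X[1])  # 2
```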

### 5- Universal Functions (Ufuncs)

Universal functions (ufuncs) apply a mathematical operation to every element of a NumPy array. They run in compiled code, so they are significantly faster than the explicit loop the same operation would need in plain Python. This section also uses aggregations such as np.max and np.mean, which reduce an array to a single value.

In a Jupyter notebook, you can browse the functions NumPy offers by pressing Tab after np. For example: np.{TAB}

# Find the maximum element of X

np.max(X)

>>> 10

# Mean of values in the X

np.mean(X)

>>> 5.5

# Get the 4th power of each element in X

np.power(X, 4)

>>> array([ 1, 16, 81, 256, 625, 1296, 2401, 4096, 6561, 10000])

# Trigonometric functions

print(np.sin(X))

print(np.tan(X))

>>> [ 0.84147098 0.90929743 0.14112001 -0.7568025 -0.95892427 -0.2794155

0.6569866 0.98935825 0.41211849 -0.54402111]

>>> [ 1.55740772 -2.18503986 -0.14254654 1.15782128 -3.38051501 -0.29100619

0.87144798 -6.79971146 -0.45231566 0.64836083]

# sin²(x) + cos²(x) = 1, the famous trigonometric identity

np.square(np.sin(X)) + np.square(np.cos(X))

>>> array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

# array Y created before

Y

>>> array([[1, 2, 3],

[4, 5, 6],

[7, 8, 9]])

# The same rules apply for 2-D arrays

np.multiply(Y, 2)

>>> array([[ 2, 4, 6],

[ 8, 10, 12],

[14, 16, 18]])

# Split Y into 3 subarrays

np.split(Y, 3)

>>> [array([[1, 2, 3]]), array([[4, 5, 6]]), array([[7, 8, 9]])]
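To illustrate the point about loops, here is a small sketch that computes the same squares with a Python loop and with a ufunc, and checks that they agree. It also shows one way to tell whether a NumPy function is a ufunc in the strict sense. The variable names are illustrative.

```python
import numpy as np

X = np.arange(1, 11)

# The loop version visits one element at a time in Python.
squares_loop = [value ** 2 for value in X]

# The ufunc version applies the same operation to the whole array at once.
squares_ufunc = np.square(X)

print(np.array_equal(squares_loop, squares_ufunc))  # True

# np.square is a ufunc object ...
print(isinstance(np.square, np.ufunc))  # True

# ... while np.max is an ordinary function built on the np.maximum ufunc.
print(isinstance(np.max, np.ufunc))  # False
largest = np.maximum.reduce(X)
print(largest)  # 10
```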

### 6- Broadcasting

Broadcasting makes it possible to apply ufuncs and many other operations to arrays of different shapes. NumPy follows a set of rules to decide whether two shapes are compatible. I won’t go into the details here, but I point to further resources at the end of this post.

# Add 5 to each element of X

X + 5

>>> array([ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])

# Or

np.add(X, 5)

>>> array([ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])

# Create new array Z

Z = np.arange(3)[:, np.newaxis]

Z

>>> array([[0],

[1],

[2]])

# Multiply Y and Z

np.multiply(Y, Z)

>>> array([[ 0, 0, 0],

[ 4, 5, 6],

[14, 16, 18]])
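In short, NumPy compares the two shapes from the last dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below is an illustration added to the original cells (the `row` array is hypothetical). It shows two shape combinations that work and one that does not.

```python
import numpy as np

Y = np.arange(1, 10).reshape(3, 3)   # shape (3, 3)
Z = np.arange(3)[:, np.newaxis]      # shape (3, 1)
row = np.array([10, 20, 30])         # shape (3,)

# (3, 3) with (3, 1): Z's single column is stretched across 3 columns.
product = Y * Z
print(product.shape)  # (3, 3)

# (3, 1) with (3,): both are stretched to (3, 3).
total = Z + row
print(total.tolist())  # [[10, 20, 30], [11, 21, 31], [12, 22, 32]]

# Shapes that cannot be aligned raise a ValueError.
try:
    Y + np.arange(4)    # (3, 3) with (4,) does not match
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```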

### 7- Masking, Comparing and Sorting

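Masking is another very useful feature of NumPy arrays. A mask is a Boolean array, usually the result of a comparison, that selects only the elements where it is True.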

# Create an array of 10 random integers from 1 to 4 (the upper bound, 5, is exclusive)

x = np.random.randint(1,5, 10)

x

>>> array([3, 3, 4, 4, 4, 2, 1, 2, 3, 4])

# Create a (3,3) array of random integers from 1 to 4

y = np.random.randint(1,5, (3,3))

y

>>> array([[2, 1, 4],

[4, 3, 2],

[3, 2, 4]])

# Sort elements in array x

np.sort(x)

>>> array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

# Sort along axis 0 (down the rows), so each column ends up sorted

np.sort(y, axis=0)

>>> array([[2, 1, 2],

[3, 2, 4],

[4, 3, 4]])

# Sort along axis 1 (across the columns), so each row ends up sorted

np.sort(y, axis=1)

>>> array([[1, 2, 4],

[2, 3, 4],

[2, 3, 4]])
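The axis argument is easy to mix up, so here is a self-contained check that uses the same values of y as shown above, written out by hand instead of drawn at random. The names `sorted_columns` and `sorted_rows` are illustrative.

```python
import numpy as np

y = np.array([[2, 1, 4],
              [4, 3, 2],
              [3, 2, 4]])

# axis=0: each column is sorted independently (top to bottom).
sorted_columns = np.sort(y, axis=0)
print(sorted_columns.tolist())  # [[2, 1, 2], [3, 2, 4], [4, 3, 4]]

# axis=1: each row is sorted independently (left to right).
sorted_rows = np.sort(y, axis=1)
print(sorted_rows.tolist())     # [[1, 2, 4], [2, 3, 4], [2, 3, 4]]

# np.sort returns a sorted copy; y itself is unchanged.
print(y[0].tolist())            # [2, 1, 4]
```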

# == , !=, < , >, >=, <= operations on arrays

# This returns a Boolean array with one True/False value per element

x > 3

>>> array([False, False, True, True, True, False, False, False, False, True], dtype=bool)

# Use masking feature to get the values of comparisons

x[x>3]

>>> array([4, 4, 4, 4])

# Another example

x[(x <= 3) & ( x > 1 )]

>>> array([3, 3, 2, 2, 3])
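A mask can do more than select values. The sketch below reuses the values of x shown above (typed in by hand so the result is repeatable) to count matches, find their positions with `np.where`, and assign through the mask. The names `mask`, `positions` and `capped` are illustrative.

```python
import numpy as np

x = np.array([3, 3, 4, 4, 4, 2, 1, 2, 3, 4])

mask = x > 3

# True counts as 1, so summing a mask counts the matching elements.
match_count = int(mask.sum())
print(match_count)  # 4

# np.where with one argument returns the positions of the True values.
positions = np.where(mask)[0]
print(positions.tolist())  # [2, 3, 4, 9]

# A mask can also be used to assign values (on a copy, to keep x intact).
capped = x.copy()
capped[mask] = 3
print(capped.tolist())  # [3, 3, 3, 3, 3, 2, 1, 2, 3, 3]
```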

### 8- Further Learning

All the code in this tutorial is on my GitHub, along with a Jupyter notebook that can be downloaded. I highly recommend rewriting the code and trying it on your own.

I also keep a list of NumPy resources on my blog, including courses, tutorials and articles.