Practical NumPy

Ja X
9 min read · Dec 17, 2023

What is NumPy?

NumPy is a Python library that helps you work with arrays (lists of numbers) in a smart and efficient way. It provides tools for performing mathematical operations, making it easier to work with data in tasks like calculations and analysis.

The first step in using NumPy is to import it in Python.

# if NumPy is not installed, first install it from a TERMINAL (not from Python):
#   pip install numpy
# then you can import it in Python
import numpy as np

1. Creating NumPy Arrays

An array is a data structure that stores values of the same data type. Python lists can hold values of different data types, but with large sets of numerical data they may not provide the performance required. NumPy arrays overcome this limitation by storing values of a single, uniform data type.
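
To make the "uniform data type" point concrete, here is a minimal sketch. The variable names are made up for this illustration; it only shows that NumPy converts a mixed list to one common type.

```python
import numpy as np

# A Python list can mix integers and floats freely
mixed_list = [1, 2.5, 3]

# NumPy picks one data type that can represent every element
arr_mixed = np.array(mixed_list)
print(arr_mixed)         # [1.  2.5 3. ]
print(arr_mixed.dtype)   # float64: the integers were converted to floats

# A list of integers stays an integer array
arr_int = np.array([5, 4, 6, 7, 3])
print(arr_int.dtype.kind)   # 'i' (signed integer; the exact width depends on the platform)
```

The dtype attribute is a quick way to check which type NumPy chose for an array.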

1.1 Defining NumPy arrays with np.array(list)

# defining a list of string elements
list_string = ['Audi', 'Tesla', 'Toyota', 'Mercedes', 'Ford']

# defining a list of numbers
list_number = [5, 4, 6, 7, 3]
# converting the list list_string to a NumPy array
arr_str = np.array(list_string)

# converting the list list_number to a NumPy array
arr_num = np.array(list_number)

1.2 Creating NumPy Matrix with np.array([[],[],[]])

A matrix is a two-dimensional data structure where elements are arranged into rows and columns. A matrix can be created by using a list of lists.

# we simply pass a list of lists to the np.array function
matrix = np.array([[1,2,1],[4,5,9],[1,8,9]])
print(matrix) # which prints the 3x3 matrix
[[1 2 1]
[4 5 9]
[1 8 9]]

1.3 Different ways to create NumPy arrays using the functions available in the NumPy library

The np.arange() function returns an array of evenly spaced values within a given interval. The interval is half-open: the start value is included, while the stop value is not. The function takes the following parameters:

  • start: the start of the interval, with a default value of 0.
  • stop: the end of the interval (not included).
  • step: the spacing between values, with a default step size of 1.
arr_1  = np.arange(start = 0, stop = 10) 
# which will create [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# or without mentioning start & stop keywords
arr_1 = np.arange(0,10)


# adding a step size
arr_step = np.arange(start = 0, stop = 20, step = 5)
# which will create [0, 5, 10, 15]

The np.linspace() function returns numbers evenly spaced over a given interval. In this case, both the start and stop values are included. The function takes the following parameters:

  • start: the start of the interval (required, there is no default).
  • stop: the end of the interval (included by default).
  • num: the number of samples to generate, with a default of 50.
matrix_even = np.linspace(0,5) 
# by default 50 evenly spaced values will be generated between 0 and 5

# OUT
array([0. , 0.10204082, 0.20408163, 0.30612245, 0.40816327,
0.51020408, 0.6122449 , 0.71428571, 0.81632653, 0.91836735,
1.02040816, 1.12244898, 1.2244898 , 1.32653061, 1.42857143,
1.53061224, 1.63265306, 1.73469388, 1.83673469, 1.93877551,
2.04081633, 2.14285714, 2.24489796, 2.34693878, 2.44897959,
2.55102041, 2.65306122, 2.75510204, 2.85714286, 2.95918367,
3.06122449, 3.16326531, 3.26530612, 3.36734694, 3.46938776,
3.57142857, 3.67346939, 3.7755102 , 3.87755102, 3.97959184,
4.08163265, 4.18367347, 4.28571429, 4.3877551 , 4.48979592,
4.59183673, 4.69387755, 4.79591837, 4.89795918, 5. ])

# 0: Start and 5: Stop included

Where do these values come from? The step size, i.e. the difference between consecutive elements, is given by the following formula:

step = (stop - start) / (num - 1)

For the example above: step = (5 - 0) / (50 - 1) = 0.10204082
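
The formula can be checked against NumPy itself. The sketch below relies on the retstep=True option of np.linspace, which is not used elsewhere in this article; it makes the function return the step size alongside the values.

```python
import numpy as np

start, stop, num = 0, 5, 50

# step size predicted by the formula
expected_step = (stop - start) / (num - 1)

# retstep=True makes np.linspace also return the step it actually used
values, step = np.linspace(start, stop, num, retstep=True)

print(expected_step)
print(step)

# every gap between neighbouring values equals that step
print(np.allclose(np.diff(values), expected_step))   # True
```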

1.4 Creating matrices using the functions available in the NumPy library

The np.zeros() and np.ones() functions return an array of the requested shape filled with zeros or ones, respectively. They take the following parameters:

  • shape: the shape of the result, e.g. the number of rows and columns of a matrix.
  • dtype: the data type of the elements in the matrix, with float as the default.
matrix_zeros = np.zeros([3,5])
#OUT
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])

matrix_ones = np.ones([3,5])
#OUT
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
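
The examples above rely on the default dtype. As a small sketch of the dtype parameter (the variable names are illustrative only):

```python
import numpy as np

# default: the elements are floats
zeros_float = np.zeros((2, 3))
print(zeros_float.dtype)   # float64

# request integers instead of floats
ones_int = np.ones((2, 3), dtype=int)
print(ones_int)
print(ones_int.dtype.kind)   # 'i' (integer; the exact width depends on the platform)
```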

The np.eye() function returns a matrix with ones along the main diagonal and zeros elsewhere. For a square shape, this is the identity matrix.

matrix_eye = np.eye(5) # creates a 5x5 matrix with ones on the diagonal
#OUT
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])

2. NumPy Functions

The shape of an array tells us how many dimensions the array has and how many elements lie along each dimension. Reshaping a NumPy array changes this arrangement: we can add or remove dimensions and change the number of elements in each dimension, as long as the total number of elements stays the same.
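
Before reshaping anything, it helps to inspect an array. The sketch below uses the shape, ndim and size attributes; the arrays are arbitrary examples.

```python
import numpy as np

arr = np.arange(0, 10)                    # 1D array with 10 elements
mat = np.array([[1, 2, 1], [4, 5, 9]])    # 2D array: 2 rows, 3 columns

# shape: elements per dimension, ndim: number of dimensions, size: total elements
print(arr.shape, arr.ndim, arr.size)   # (10,) 1 10
print(mat.shape, mat.ndim, mat.size)   # (2, 3) 2 6
```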

2.1 Reshaping arrays with array.reshape(shape)

To reshape an array in NumPy, we call the reshape method on it: array.reshape(shape). The shape parameter is a tuple (a list also works) whose values determine the new shape of the array.

# defining an array with values 0 to 9
arr_10 = np.arange(0,10)

arr_re10 = arr_10.reshape([10,1])

#OUT
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
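
A few more reshapes of the same array, as a sketch. Using -1 for one dimension is a standard NumPy feature that this article does not otherwise cover: it asks NumPy to infer that dimension from the total number of elements.

```python
import numpy as np

arr_10 = np.arange(0, 10)

# 10 elements fit into 2 rows of 5
print(arr_10.reshape(2, 5))

# -1 asks NumPy to work out that dimension from the others
print(arr_10.reshape(-1, 2).shape)   # (5, 2)

# 10 elements cannot fill a 3x4 grid, so this raises a ValueError
try:
    arr_10.reshape(3, 4)
except ValueError as err:
    print('Cannot reshape:', err)
```

The last case shows the one hard rule of reshaping: the new shape must hold exactly as many elements as the original array.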

2.2 Mathematical operations

NumPy provides a wide range of mathematical functions. Some of the operations it supports are:

  • Trigonometric functions: np.sin(), np.cos(), np.tan()
print('Sine Function:',np.sin(4))
print('Cosine Function:',np.cos(4))
print('Tan Function:',np.tan(4))
#out
Sine Function: -0.7568024953079282
Cosine Function: -0.6536436208636119
Tan Function: 1.1578212823495775
  • Exponents and Logarithmic functions: np.log() & np.exp()
arr_exp = np.array([2,4,6])
np.exp(arr_exp)
#OUT
array([  7.3890561 ,  54.59815003, 403.42879349])

# np.log() is the natural logarithm, i.e. the base of the log is e
np.log(2)
#OUT
0.6931471805599453

# log with base 10
np.log10(8)
#OUT
0.9030899869919435
  • Functions tailored for arithmetic operations between arrays and matrices
# we can +-*/ arrays together
# defining two arrays
arr_arc1 = np.arange(1,6)# -->[1, 2, 3, 4, 5]
arr_arc2 = np.arange(3,8)# -->[3, 4, 5, 6, 7]

print('Addition: ',arr_arc1+arr_arc2)
print('Subtraction: ',arr_arc2-arr_arc1)
print('Multiplication:' , arr_arc1*arr_arc2)
print('Division:', arr_arc1/arr_arc2)
print('Inverse:', 1/arr_arc1)
print('Powers:', arr_arc1**arr_arc2) # in Python, powers are written with **, NOT ^ (^ is the bitwise XOR operator)
Addition:  [ 4  6  8 10 12]
Subtraction: [2 2 2 2 2]
Multiplication: [ 3 8 15 24 35]
Division: [0.33333333 0.5 0.6 0.66666667 0.71428571]
Inverse: [1. 0.5 0.33333333 0.25 0.2 ]
Powers: [ 1 16 243 4096 78125]
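
To see how different ** and ^ really are, here is a short sketch on a small array chosen for this illustration:

```python
import numpy as np

arr = np.array([1, 2, 3])

print(arr ** 2)   # element-wise power: [1 4 9]
print(arr ^ 2)    # element-wise bitwise XOR with 2: [3 0 1]
```

Because ^ works on integer arrays without raising an error, using it by mistake gives a wrong result rather than a visible failure.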

2.3 Operations on Matrices: Addition, Subtraction, Division, Multiplication

Multiplication: * → element-wise, @ → dot product

# defining two matrices
matrix_op1 = np.arange(1,10).reshape(3,3)
matrix_op2 = np.eye(3)

print('Addition: \n', matrix_op1+matrix_op2)
print('Subtraction: \n ', matrix_op1-matrix_op2)
print('Multiplication: \n', matrix_op1*matrix_op2) # elementwise multiplication
print('Division: \n', matrix_op1/matrix_op2)
print('Multiplication: \n', matrix_op1 @ matrix_op2) # dot product
Addition: 
[[ 2. 2. 3.]
[ 4. 6. 6.]
[ 7. 8. 10.]]
Subtraction:
[[0. 2. 3.]
[4. 4. 6.]
[7. 8. 8.]]
Multiplication:
[[1. 0. 0.]
[0. 5. 0.]
[0. 0. 9.]]
Division:
[[ 1. inf inf]
[inf 5. inf]
[inf inf 9.]]
Multiplication:
[[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
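<ipython-input-10-09fcc0462f7f>:4: RuntimeWarning: divide by zero encountered in true_divide
print('Division: \n', matrix_op1/matrix_op2)

Note: dividing by the zeros in matrix_op2 does not raise an error. NumPy fills those positions with inf and emits the RuntimeWarning shown above.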
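
With the identity matrix, the @ product simply returns the first matrix, so the difference between * and @ is easier to see with other values. The sketch below uses two small matrices made up for this purpose. It also shows one possible way to avoid dividing by zero, using the out and where arguments of np.divide; this is an addition to the article, not something the examples above rely on.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

print(a * b)   # element-wise product, rows: [0 2] and [3 0]
print(a @ b)   # matrix product, rows: [2 1] and [4 3]

# divide only where b is non-zero; all other positions keep the 0 stored in `out`
safe = np.divide(a, b, out=np.zeros(a.shape), where=(b != 0))
print(safe)    # rows: [0. 2.] and [3. 0.]
```

Whether 0 is a sensible fill value depends on the task, so treat it as a placeholder.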

Transpose of a matrix: np.transpose() & matrix.T

matrix_tr=np.arange(1,10).reshape(3,3) 
# --> [[1, 2, 3],[4, 5, 6],[7, 8, 9]] 3x3 matrix
np.transpose(matrix_tr)
# another way of taking the transpose: matrix_tr.T
array([[1, 4, 7],
[2, 5, 8],
[3, 6, 9]])

Functions to find min & max values: np.min() & np.max()

print('Minimum value: ',np.min(matrix_tr))
print('Maximum value: ',np.max(matrix_tr))
Minimum value:  1
Maximum value: 9
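
np.min() and np.max() also accept an axis argument, which is not used above. A minimal sketch on the same matrix_tr:

```python
import numpy as np

matrix_tr = np.arange(1, 10).reshape(3, 3)

# axis=0 works down the rows, giving one result per column
print(np.min(matrix_tr, axis=0))   # [1 2 3]

# axis=1 works across the columns, giving one result per row
print(np.max(matrix_tr, axis=1))   # [3 6 9]
```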

2.4 Random arrays

With the np.random.rand(), np.random.randn() and np.random.randint() functions, we can create random NumPy arrays.

The np.random.rand() function returns a NumPy array whose elements are drawn randomly from the uniform distribution over [0, 1), i.e. 0 is included and 1 is excluded.

rand_mat = np.random.rand(5,5) # uniform random variable
[[0.12063588 0.68031218 0.42592134 0.40614188 0.4570759 ]
[0.48997867 0.02714887 0.74045474 0.95157556 0.7136172 ]
[0.26598623 0.01139987 0.98883076 0.04344252 0.28805938]
[0.85931833 0.58980712 0.60560621 0.4768394 0.65345593]
[0.103575 0.06878198 0.65133941 0.85464114 0.67654822]]

The np.random.randn() function returns a NumPy array whose samples are drawn randomly from the standard normal distribution (mean 0 and standard deviation 1).

rand_mat_2 = np.random.randn(500,500) 
# 500x500 matrix with randomly generated values
# with the mean of 0 and standard deviation of 1
[[-0.22879696 -0.13240114 -0.53843843 ... -1.33360506  1.25969348
0.36599144]
[ 0.16947813 -0.88735285 -0.50307878 ... 2.08665558 0.55770563
1.17782884]
[ 0.10176379 0.65248014 -0.72947478 ... 1.04553077 -1.18763607
1.17649933]
...
[-0.78185318 -0.82361677 -0.11692107 ... 0.83967156 -0.69478517
0.5027159 ]
[ 1.63081401 0.36600112 -0.88259662 ... 1.8726322 0.8693838
0.46626896]
[ 0.79361904 1.64996816 -0.85906287 ... -0.21340331 1.21179073
0.76308993]]
# Let's check the mean and standard deviation of rand_mat_2
print('Mean:',np.mean(rand_mat_2))
print('Standard Deviation:',np.std(rand_mat_2))
Mean: 0.00146278332980865
Standard Deviation: 1.0002637494266116

# We observe that the mean is very close to 0 and standard deviation is very close to 1.

The np.random.randint() function returns a NumPy array whose elements are random integers drawn from low (inclusive) to high (exclusive), with the given size. In the example below, low is 1, high is 5 and size is 10.

# Generating random values in an array
rand_mat3 = np.random.randint(1,5,10)
[4 2 1 2 4 4 2 4 3 1]
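
Random values differ on every run, so the outputs shown above will not match yours. If repeatable numbers are needed, one option is to fix the seed first. Seeding is not covered in this article; the sketch below uses np.random.seed() with an arbitrary seed of 42.

```python
import numpy as np

np.random.seed(42)                      # fix the state of the random generator
first = np.random.randint(1, 5, 10)

np.random.seed(42)                      # same seed again
second = np.random.randint(1, 5, 10)

# both draws are identical because they started from the same seed
print(np.array_equal(first, second))    # True

# low is inclusive, high is exclusive: values lie between 1 and 4
print(first.min() >= 1, first.max() <= 4)   # True True
```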

3. Accessing the entries of a NumPy Array

Accessing NumPy arrays involves retrieving specific elements or slices of the array for manipulation or analysis. This process enables users to interact with individual elements, extract subsets, and perform various operations on the array data efficiently.

3.1 Accessing 1D arrays

Accessing one element of an array: array[]

array_ac = np.array([3, 2, 1, 7, 4])

# accessing the 4th entry of array_ac
print(array_ac[3]) # --> 7

Accessing multiple elements from an array: array[x:y]

print(array_ac[2:4]) # index 4 is exclusive
[1 7]

We can also access multiple non-consecutive entries: array[[index1, index2, index3...indexn]]

print(array_ac[[0, 3, 4]]) # accessing index 0, 3 & 4
[3 7 4]

Accessing arrays with logical operators

# accessing all the values of array_ac -> [3, 2, 1, 7, 4] which are greater than 2
# array_ac>2 returns [True, False, False, True, True]
# indexing with this mask keeps only the entries where it is True
print('Values greater than 2: ',array_ac[array_ac>2])
Values greater than 2:  [3 7 4]
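
The same selection can be written step by step, which also makes it easy to combine several conditions. A minimal sketch (the name mask is chosen for this illustration):

```python
import numpy as np

array_ac = np.array([3, 2, 1, 7, 4])

# the comparison produces one True/False value per element
mask = array_ac > 2
print(mask)             # [ True False False  True  True]
print(array_ac[mask])   # [3 7 4]

# combine conditions with & (and) or | (or); the parentheses are required
print(array_ac[(array_ac > 2) & (array_ac < 7)])   # [3 4]
```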

3.2 Accessing the entries of a Matrix: matrix[], matrix[x:y, a:b]

# let's generate a 5x5 matrix with random values
rand_mat = np.random.randn(5,5)
[[-1.59931688 -2.52926564  1.78215135 -0.95267563  0.56794215]
[ 0.65166576 1.25902454 -1.81236327 -0.65665584 0.49652784]
[ 0.37523745 0.08434494 -0.00866272 1.10613862 -0.57719509]
[-1.37111873 -0.09441499 0.61385843 0.65905847 0.16490901]
[ 0.28394456 0.95233247 1.06297449 -0.31275799 -0.44637039]]
rand_mat[1] # accessing the second row of the matrix
[ 0.65166576  1.25902454 -1.81236327 -0.65665584  0.49652784]

We can access a specific element by giving its row and column index.

# accessing the third element of the second row
print(rand_mat[1][2])
# or
print(rand_mat[1,2])
-1.81236327
-1.81236327

Accessing multiple elements of matrix

# accessing first two rows with second and third column
print(rand_mat[0:2,1:3])
[[-2.52926564  1.78215135 ]
[1.25902454 -1.81236327 ]]

Note: We can utilise a logical operator to access elements in matrices as well. Please refer to section 3.1.
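
As a sketch of what that looks like on a matrix (a small fixed matrix is used here so the results are predictable):

```python
import numpy as np

mat = np.arange(1, 10).reshape(3, 3)

# element mask: the matching values come back as a flat 1D array
print(mat[mat > 5])          # [6 7 8 9]

# row mask: keep the rows whose first column is greater than 1
print(mat[mat[:, 0] > 1])    # rows [4 5 6] and [7 8 9]
```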

4. Modifying entries of arrays

Modifying entries in NumPy arrays allows users to change individual values within the array or matrix, providing flexibility in adapting the data to different needs or updating information.

rand_mat[0:2,1:3] = [[1, 2],[3, 4]]
[[-1.59931688  1.         2.          -0.95267563  0.56794215]
[ 0.65166576 3. 4. -0.65665584 0.49652784]
[ 0.37523745 0.08434494 -0.00866272 1.10613862 -0.57719509]
[-1.37111873 -0.09441499 0.61385843 0.65905847 0.16490901]
[ 0.28394456 0.95233247 1.06297449 -0.31275799 -0.44637039]]

Using the access methods from section 3, it is possible to update any value of an array.

# setting all negative values of the original matrix to 0
rand_mat[rand_mat<0]=0
[[ 0           1           2           0           0.56794215]
[ 0.65166576 3 4 0 0.49652784]
[ 0.37523745 0.08434494 0 1.10613862 0 ]
[ 0 0 0.61385843 0.65905847 0.16490901]
[ 0.28394456 0.95233247 1.06297449 0 0 ]]

It’s important to note that a sub-matrix obtained by slicing is a view of the original: modifying it in place automatically updates the corresponding values in the original matrix.

sub_mat=rand_mat[0:2,0:3]
print(sub_mat)

# assign in place through the view; a plain sub_mat=-1 would only rebind the name
sub_mat[:]=-1

print(rand_mat)
[[ 0           1           2 ]
[ 0.65166576 3 4 ]]
----------------------------------------------------------------

[[-1 -1 -1 0 0.56794215]
[-1 -1 -1 0 0.49652784]
[ 0.37523745 0.08434494 0 1.10613862 0 ]
[ 0 0 0.61385843 0.65905847 0.16490901]
[ 0.28394456 0.95233247 1.06297449 0 0 ]]

Although we never assigned to rand_mat directly, its values changed because the slice sub_mat shares its data with the original matrix.

To prevent this behaviour, we need to use the .copy() method when we assign sub_mat.

sub_mat = rand_mat[0:2,0:3].copy()
# after this, any change to sub_mat will not affect the original matrix
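
The difference between a slice and a copy can be checked directly. The sketch below uses np.shares_memory(), which is not mentioned elsewhere in this article, on a small matrix made up for the illustration.

```python
import numpy as np

mat = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

view = mat[0:1, 0:2]          # slice: shares its data with mat
copy = mat[0:1, 0:2].copy()   # independent copy

print(np.shares_memory(mat, view))   # True
print(np.shares_memory(mat, copy))   # False

view[:] = -1      # in-place write through the view: mat changes too
copy[:] = 99      # mat is not affected
view = 0          # rebinds the name only: mat is not affected

print(mat)        # first row is now [-1 -1  2], second row is unchanged
```

Note the last assignment: writing view = 0 without [:] replaces the variable rather than the data, so it never touches mat.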

4.1 Saving and Loading NumPy arrays: np.save(), np.savez(), np.load(), np.savetxt(), np.loadtxt()

# np.save stores a single array in .npy format (the extension is added automatically)
np.save('/path/to/file/saved_file_name',np_array)

# we can save multiple arrays / matrices in one .npz file using savez
# the keyword names are the keys used to retrieve the arrays later
np.savez('/path/to/file/saved_file_name',np_array=np_array, np_array1=np_array1)

# to load arrays
loaded_arr = np.load('/path/to/file/saved_file_name.npy')

loaded_multi = np.load('/path/to/file/saved_file_name.npz')

# after loading a .npz file we need to ask for a specific array by its key
loaded_multi['np_array1']

# we can also save/load text files, but only one 1D or 2D array per file
np.savetxt('/path/to/file/text_file_name.txt',rand_mat,delimiter=',')
rand_mat_txt = np.loadtxt('/path/to/file/text_file_name.txt',delimiter=',')
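
A complete round trip through the three formats can be sketched as follows. The array names, file names and key names (first, second) are hypothetical, and a temporary directory is used so that nothing is written to a real path.

```python
import os
import tempfile

import numpy as np

arr_a = np.arange(5)
arr_b = np.eye(2)

with tempfile.TemporaryDirectory() as tmp_dir:
    # single array -> .npy
    npy_path = os.path.join(tmp_dir, 'single.npy')
    np.save(npy_path, arr_a)
    loaded_a = np.load(npy_path)

    # several arrays -> .npz; the keyword names become the lookup keys
    npz_path = os.path.join(tmp_dir, 'multi.npz')
    np.savez(npz_path, first=arr_a, second=arr_b)
    with np.load(npz_path) as loaded_multi:
        keys = sorted(loaded_multi.files)
        loaded_b = loaded_multi['second']

    # text file: one 1D or 2D array, values are read back as floats
    txt_path = os.path.join(tmp_dir, 'matrix.txt')
    np.savetxt(txt_path, arr_b, delimiter=',')
    loaded_txt = np.loadtxt(txt_path, delimiter=',')

print(np.array_equal(arr_a, loaded_a))     # True
print(keys)                                # ['first', 'second']
print(np.array_equal(arr_b, loaded_txt))   # True
```

If the arrays are passed to np.savez() without keyword names, NumPy stores them under the keys arr_0, arr_1 and so on.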

Key Functions →

Creating Arrays: np.array(list), np.arange(start, stop, step), np.linspace(start, stop, num), np.zeros() & np.ones(), np.eye(), np.random.rand(), np.random.randn(), np.random.randint()

NumPy Functions: array.reshape(shape), np.sin(), np.cos(), np.tan(), np.log() & np.exp(), np.transpose() & matrix.T, np.min() & np.max(), np.mean(), np.std()

Accessing Elements: array[x:y], array[[index1, index2 ...indexn]], matrix[], matrix[x:y, a:b]

Modifying Elements: array[x:y]=[1,2], matrix[x:y, a:b]=[[1, 2],[3, 4]], matrix[matrix<0]=0

Saving & Loading: np.save(), np.savez(), np.load(), np.savetxt(), np.loadtxt()
