A Complete Guide on NumPy — Part 2
In this article, I will cover the second half of NumPy. You can find the first half at this link (A Complete Guide on NumPy — Part 1), where I talk about the origin of NumPy, tensors, creating arrays, and manipulating arrays.
⚠️ All the images were drawn and all the code was written by me. If you want to use them, please cite me as the source.
📖 I will cover the following topics to help you better understand NumPy.
- Benefits of NumPy (in Part 1)
- What is a Tensor? (in Part 1)
- Key Attributes of Tensors
- Tensor Types
- Creating NumPy Arrays (in Part 1)
- Array Manipulation (in Part 1)
- Mathematical Functions
- Linear Algebra
- Sorting and Searching
- Indexing and Slicing
- Statistics
- Input Output
Mathematical Functions
1. np.add(), np.subtract(), np.divide(), np.multiply()
import numpy as np
x = np.array([[1, 87],
[54, 6]])
y = np.array([[25, 34],
[3, 87]])
print(f'''Add : {np.add(x, y)}
Subtract: {np.subtract(x, y)}
Divide : {np.divide(x, y)}
Multiply: {np.multiply(x, y)}''')
OUT:
Add : [[ 26 121]
[ 57 93]]
Subtract : [[-24 53]
[ 51 -81]]
Divide : [[ 0.04 2.55882353]
[18. 0.06896552]]
Multiply : [[ 25 2958]
[ 162 522]]
2. np.exp()
np.exp() computes e raised to the power of each element of the input, where e is Euler's number (≈ 2.718).
x = np.array([1, 2, 3, 4])
np.exp(x)
OUT:
array([ 2.71828183, 7.3890561 , 20.08553692, 54.59815003])
3. np.power()
This function raises each element of the first argument (x1) to the power given by the second argument (x2).
x = np.array([1, 2, 3, 4])
np.power(x, 3)
OUT:
array([ 1, 8, 27, 64])
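The second argument does not have to be a single number. As a minimal sketch (the names `bases` and `exponents` are illustrative only), passing an array of the same shape raises each element to the exponent at the same position:

```python
import numpy as np

bases = np.array([1, 2, 3, 4])
exponents = np.array([2, 2, 3, 3])

# Element-wise: 1**2, 2**2, 3**3, 4**3
result = np.power(bases, exponents)
print(result)
```

The result holds the values 1, 4, 27 and 64.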
4. np.square(), np.sqrt()
np.square() calculates the square of each element of an array, while np.sqrt() calculates the non-negative square root of each element.
x = np.array([1, 2, 3, 4])
np.square(x)
OUT:
array([ 1, 4, 9, 16])
x = np.array([ 1, 4, 9, 16])
np.sqrt(x)
OUT:
array([1., 2., 3., 4.])
Linear Algebra
NumPy provides several linear algebra functions that operate on NumPy arrays.
i. np.dot()
This Linear Algebra function calculates the dot product of the given arrays.
x1 = np.array([1, 2, 3, 4])
x2 = np.array([ 1, 4, 9, 16])
np.dot(x1, x2)
OUT:
100
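For two 1-D arrays, the dot product is the sum of the element-wise products. The short check below is only a sketch to make that arithmetic visible; the variable `manual` is an illustrative name:

```python
import numpy as np

x1 = np.array([1, 2, 3, 4])
x2 = np.array([1, 4, 9, 16])

# 1*1 + 2*4 + 3*9 + 4*16 = 1 + 8 + 27 + 64
manual = int(np.sum(x1 * x2))
dot = int(np.dot(x1, x2))
print(manual, dot)
```

Both values are 100, matching the output above.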
ii. np.matmul()
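np.matmul() calculates the matrix product of two arrays.
⚠️ For 2-D inputs, the number of columns of the first matrix must equal the number of rows of the second matrix. With two 1-D arrays of the same length, as in the example here, the result is their inner product.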
x1 = np.array([4,5,6])
x2 = np.array([7,8,9])
np.matmul(x1, x2)
OUT:
122
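The shape rule is easier to see with 2-D inputs. The following is a small sketch with made-up matrices `a` and `b`; it also shows that NumPy rejects shapes that do not line up:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
b = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2)

# 3 columns in `a` match 3 rows in `b`, so the product has shape (2, 2).
product = np.matmul(a, b)
print(product.shape)
print(product)

# (2, 3) times (2, 3) violates the rule, so NumPy raises a ValueError.
try:
    np.matmul(a, a)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```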
iii. np.linalg.inv()
The inverse function calculates the multiplicative inverse of the input matrix.
x = np.array([[5, 8, 9],
[6, 3, 7],
[1, 7, 5]])
np.linalg.inv(x)
OUT:
array([[ 11.33333333, -7.66666667, -9.66666667],
[ 7.66666667, -5.33333333, -6.33333333],
[-13. , 9. , 11. ]])
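One way to sanity-check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. This is a sketch, not part of the original example; np.allclose() is used because floating-point results are rarely exactly 0 or 1:

```python
import numpy as np

x = np.array([[5, 8, 9],
              [6, 3, 7],
              [1, 7, 5]])
x_inv = np.linalg.inv(x)

# x multiplied by its inverse should be (approximately) the 3x3 identity.
identity_check = np.allclose(np.matmul(x, x_inv), np.eye(3))
print(identity_check)
```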
Sorting and Searching
Like Python lists, NumPy arrays can be sorted and searched.
a. np.sort()
This function sorts an array along a given axis. By default it sorts along the last axis (axis=-1); with axis=0 it sorts along the first axis. If axis is “None”, np.sort() flattens the array before sorting and returns a sorted 1-D array.
The “kind” parameter of the np.sort() function lets us choose the sorting algorithm. The options are:
- quicksort (default)
- mergesort
- heapsort
- stable
x = np.array([[1, -4, 7],
[-2, 5, -8],
[3, -6, 9]])
np.sort(x)
OUT:
array([[-4, 1, 7],
[-8, -2, 5],
[-6, 3, 9]])
x = np.array(["Google", "Apple", "Facebook"])
np.sort(x)
OUT:
array(['Apple', 'Facebook', 'Google'], dtype='<U8')
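To make the effect of the axis and kind parameters concrete, here is a small sketch that reuses the 2-D array from above. The variable names are illustrative only:

```python
import numpy as np

x = np.array([[1, -4, 7],
              [-2, 5, -8],
              [3, -6, 9]])

# axis=0 sorts each column independently (top to bottom).
by_column = np.sort(x, axis=0)

# axis=None flattens the array first and returns a sorted 1-D array.
flattened = np.sort(x, axis=None)

# kind selects the algorithm; the sorted values are the same as the default.
stable_rows = np.sort(x, axis=-1, kind="stable")

print(by_column)
print(flattened)
print(stable_rows)
```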
b. np.argmax()
The np.argmax() function returns the index of the maximum value of the array. If the axis is “None”, which is the default, the index refers to the flattened array.
If the maximum value occurs more than once, the index of its first occurrence is returned.
x = np.array([0.1, 0.98, 0.85, 0.76, 0.01, 0.6, 0.34, 0.68])
np.argmax(x)
OUT:
1
x = np.array([10, 68, 97, 10, 12, 97, 65, 3, 97])
print(np.argmax(x))
print("Max value is" , x[np.argmax(x)])
OUT:
2
Max value is 97
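For a 2-D array, the default result is an index into the flattened array. The sketch below, using a made-up array, shows how np.unravel_index() can turn that into a (row, column) pair, and how the axis argument returns one index per column or per row:

```python
import numpy as np

x = np.array([[1, 9, 3],
              [7, 2, 8]])

# Default (axis=None): position of the maximum in the flattened array.
flat_index = int(np.argmax(x))

# Convert the flat position back into (row, column) coordinates.
row, col = np.unravel_index(flat_index, x.shape)

# axis=0: row index of the maximum in each column.
per_column = np.argmax(x, axis=0)

# axis=1: column index of the maximum in each row.
per_row = np.argmax(x, axis=1)

print(flat_index, (int(row), int(col)))
print(per_column, per_row)
```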
Indexing and Slicing
There are several ways to index (select) subsets of a NumPy array. For example:
- Comma
- Colons: array[start_row:end_row, start_col:end_col] and array[start:stop:step]
- Negative Indexing: it starts at -1 rather than 0, and -1 means the last row/column.
- Boolean Indexing
- …
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
print("Simple Indexing : ", x[3])
print("Negative Indexing : ", x[-1])
OUT:
Simple Indexing : [13 14 15 16]
Negative Indexing : [13 14 15 16]
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
x[2,2]
OUT:
11
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
x[0:2, 2:]
OUT:
array([[3, 4],
[7, 8]])
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
x[0:2, [0, 2]]
OUT:
array([[1, 3],
[5, 7]])
# Colon Indexing
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
print(f'''Colon Indexing: {x[0:3]}
\nColon Indexing with step: {x[0:3:2]}
''')
OUT:
Colon Indexing : [[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
Colon Indexing with step: [[ 1 2 3 4]
[ 9 10 11 12]]
# Boolean Indexing
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
print(x >= 11)
OUT:
[[False False False False]
[False False False False]
[False False True True]
[ True True True True]]
As in Pandas, we cannot use the “and, or” keywords to combine multiple boolean conditions on arrays. Instead, we use the “&, |” operators, respectively, and wrap each condition in parentheses.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("And: ", ((x < 7) & (x > 3)))
print("\nOr : ", ((x < 2) | (x > 8)))
OUT:
And: [False False False True True True False False False False]
Or : [ True False False False False False False False True True]
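A boolean array like the ones above can be passed back into the square brackets to select only the matching elements. This is a minimal sketch; `mask`, `selected` and `rest` are illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Combine conditions with & and keep each one in parentheses.
mask = (x < 7) & (x > 3)

# Indexing with the mask keeps only the elements where it is True.
selected = x[mask]

# ~ negates the mask, which selects everything else.
rest = x[~mask]

# Using the Python keyword `and` on arrays raises a ValueError.
try:
    (x < 7) and (x > 3)
    keyword_rejected = False
except ValueError:
    keyword_rejected = True

print(selected, rest, keyword_rejected)
```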
Statistics
NumPy also has a variety of statistical functions. Some of them are:
A. np.median(), np.mean(), np.std(), np.var()
x = np.array([[[21, 31, 89], [67, 23, 89], [31, 22, 88]],
[[120, 805, 190], [78, 993, 56], [64, 89, 91]]])
print(f'''Median : {np.median(x)}
Mean : {np.mean(x)}
Standard Deviation: {np.std(x)}
Variance : {np.var(x)}
''')
OUT:
Median : 83.0
Mean : 163.72222222222223
Standard Deviation: 264.8955361302421
Variance : 70169.6450617284
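The calls above reduce the whole array to a single number. As an addition to the original example, the sketch below uses a small made-up array to show two related facts: these functions also accept an axis argument, and the standard deviation is the square root of the variance:

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=0 averages down each column, axis=1 averages across each row.
column_means = np.mean(x, axis=0)
row_means = np.mean(x, axis=1)

# np.std() is the square root of np.var().
std_matches_var = bool(np.isclose(np.std(x), np.sqrt(np.var(x))))

print(column_means, row_means, std_matches_var)
```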
Input Output
We can save and load the NumPy arrays as binary files (.npy, .npz) or text/tabular files.
I. np.save(), np.savez(), np.savetxt()
The save() function saves a single array in .npy format, and the savez() function saves multiple arrays in .npz format. The savetxt() function saves only 1D or 2D NumPy arrays to a text file, which is why the 3D array below is reshaped to 2D before it is passed to savetxt().
x = np.array([[[10, 11, 12], [13, 14, 15]],
[[20, 21, 22], [23, 24, 25]],
[[30, 31, 32], [33, 34, 35]]])
np.save("test1", x)
np.savez("test2", x=x, y=x.T)
np.savetxt("test3.txt", x.reshape(2, 9))
import os
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    print(f)
OUT:
test2.npz
test3.txt
test1.npy
II. np.load(), np.loadtxt()
The np.load() function loads NumPy arrays from files ending in .npy or .npz. The np.loadtxt() function loads data from a text file.
x = np.load("test1.npy")
x
OUT:
array([[[10, 11, 12],
[13, 14, 15]],
[[20, 21, 22],
[23, 24, 25]],
[[30, 31, 32],
[33, 34, 35]]])
x = np.loadtxt("test3.txt")
x
OUT:
array([[10., 11., 12., 13., 14., 15., 20., 21., 22.],
[23., 24., 25., 30., 31., 32., 33., 34., 35.]])
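Loading a .npz file works a little differently from loading a .npy file, because the archive can hold several arrays. The sketch below is self-contained and writes to a temporary directory (an assumption made only so the example does not leave files behind); the keys "x" and "y" are the keyword names passed to np.savez():

```python
import os
import tempfile

import numpy as np

x = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test2.npz")
    np.savez(path, x=x, y=x.T)

    # np.load() on a .npz file returns a dictionary-like object.
    with np.load(path) as data:
        names = sorted(data.files)   # names of the stored arrays
        x_loaded = data["x"]
        y_loaded = data["y"]

print(names)
print(x_loaded.shape, y_loaded.shape)
```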
Conclusion
In this blog, I have tried to cover NumPy both in theory and with practical examples. You may also want to read “A Complete Guide on NumPy — Part 1” to better understand NumPy.
Thank you for reading !!!