5 Powerful NumPy Functions a Beginner Should Know

Aditya Patkar
Published in The Startup
5 min read · Nov 27, 2020

Learn exactly what they do, along with working and breaking examples

NumPy, as the name suggests, is a powerful open source Python library for fast numerical computation. It is an important tool for data science. NumPy lets us create multidimensional arrays and perform simple as well as complex operations on them, such as indexing, slicing, broadcasting and matrix multiplication, to name a few. Today we’ll see how to organize your NumPy arrays better and compute some interesting results using the following functions:

  • numpy.sort
  • numpy.count_nonzero
  • numpy.where
  • numpy.compress
  • numpy.trace
A powerful library every Data Science Enthusiast must know

Function 1 — np.sort

This returns a sorted copy of the array. Please note that the original array is not changed. The arguments the function normally takes are -

  1. The Array to be sorted.
  2. Axis along which to sort.
  3. The sorting algorithm to use, passed as kind= (‘quicksort’, ‘mergesort’, ‘heapsort’ or ‘stable’).

Example 1 (Working example) :

As you can see, the values inside our array were sorted along the rows. The original array ‘a’ is unchanged and we get a sorted copy of the array in ‘a_sorted’.
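
A minimal sketch of such a call could look like this. The array values are illustrative assumptions; only the names a and a_sorted come from the text.

```python
import numpy as np

# Illustrative values; any 2-D array of numbers would do.
a = np.array([[3, 1, 2],
              [9, 7, 8]])

# axis=1 sorts the values inside each row and returns a new array.
a_sorted = np.sort(a, axis=1)

print(a_sorted)  # [[1 2 3]
                 #  [7 8 9]]
print(a)         # the original array is left untouched
```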

Example 2 (Working example) :

As we can see, sort() is useful for finding the median of the data. Here we pass axis=None, which sorts the flattened array.
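
One way to sketch the median idea, assuming a small 1-D array with an odd number of made-up values:

```python
import numpy as np

# Hypothetical data with an odd number of elements.
data = np.array([7, 1, 5, 3, 9, 2, 8, 4, 6])

# axis=None sorts the flattened array (this one is already 1-D).
data_sorted = np.sort(data, axis=None)

# With an odd count, the median is simply the middle element.
median = data_sorted[data_sorted.size // 2]
print(median)  # 5
```

For an even number of elements the median would be the mean of the two middle values, which np.median handles for you.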

Example 3 (Error) :

We get an error saying that the specified axis is out of bounds. We passed a 2-dimensional array, whose only valid axes are 0 and 1. As a third dimension doesn’t exist for the given array, we get the error. To fix it, make sure the axis you specify is within the dimensions of the given array.
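
The failure can be reproduced with a sketch like the following. The array is an assumption; the point is only that axis=2 does not exist for a 2-D array. NumPy's AxisError is a subclass of ValueError, so catching ValueError works across versions.

```python
import numpy as np

# A 2-D array: the only valid axes are 0 and 1 (or -1 and -2).
a = np.array([[3, 1, 2],
              [9, 7, 8]])

error_raised = False
try:
    np.sort(a, axis=2)  # there is no third dimension
except ValueError as err:
    error_raised = True
    print("sort failed:", err)

# The fix: pick an axis that exists.
fixed = np.sort(a, axis=0)
```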

Summary :

Use the sort() function when you have an array of unordered data and your use case requires the data in sorted order.

Function 2 — np.count_nonzero

This function counts the number of nonzero elements in a NumPy array. The arguments commonly taken are -

  1. Array whose nonzero values are to be counted.
  2. Axis along which the nonzero values are to be counted.

Returns the number of nonzero items in the whole array, or along the given axis.

Example 1 (Working example):

We create an identity matrix which has 9 rows and 3 columns. We then found how many nonzero values are present in the given array.
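
A sketch of this example, assuming the matrix is built with np.eye(9, 3), which places ones on the main diagonal of a 9 by 3 array:

```python
import numpy as np

# 9 rows, 3 columns, ones on the main diagonal, zeros elsewhere.
m = np.eye(9, 3)

# Without an axis, the whole array is counted.
total = np.count_nonzero(m)
print(total)  # 3

# With axis=0, we get one count per column.
per_column = np.count_nonzero(m, axis=0)
print(per_column)  # [1 1 1]
```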

Example 2 (Working example) :

This can be a practical use case for using np.count_nonzero.
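
As an illustration of such a use case, here is a sketch with a made-up attendance table, where each row is a student, each column is a day, and 1 means present:

```python
import numpy as np

# Hypothetical attendance: rows are students, columns are days.
attendance = np.array([[1, 1, 0, 1, 1],
                       [0, 1, 1, 1, 1],
                       [1, 0, 0, 1, 0]])

# axis=1 counts the nonzero entries in each row, i.e. days present per student.
days_present = np.count_nonzero(attendance, axis=1)
print(days_present)  # [4 4 2]
```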

Example 3 (Error) :

As we can see here, the axis provided is out of bounds, so we get an error. We should make sure the axis we provide is valid for the number of dimensions of the array.
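
A sketch that triggers this kind of error, assuming a 2-D input and an axis value of 2:

```python
import numpy as np

# 2-D array: valid axes are 0 and 1.
m = np.eye(9, 3)

error_raised = False
try:
    np.count_nonzero(m, axis=2)  # axis 2 does not exist
except ValueError as err:  # NumPy's AxisError subclasses ValueError
    error_raised = True
    print("count_nonzero failed:", err)
```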

Summary :

As we saw, this is another function for organizing data or extracting important information from it, such as the number of days each student was present.

Function 3 — numpy.where

This is a very interesting conditional function: it tests a condition on every element and picks a value from x where the condition holds and from y where it does not. Arguments taken are -

  1. Condition to be checked.
  2. x: values to use where the condition is satisfied.
  3. y: values to use where the condition is not satisfied.

Returns an array with elements taken from x where the condition is True and from y elsewhere.

Example 1 (Working example):

We check for the values that are greater than 10. The values that are greater than 10 are kept as they are. All other values (10 or less) are multiplied by 10.
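
In code, this could be sketched as follows, with illustrative input values:

```python
import numpy as np

a = np.array([5, 12, 8, 20, 3])  # made-up values

# Where a > 10 keep the value from a, otherwise take it from a * 10.
result = np.where(a > 10, a, a * 10)
print(result)  # [50 12 80 20 30]
```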

Example 2 (Working example):

Here we check for the marks which are less than 45 and mark them as failed, and mark others as passed.
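
A sketch of the pass/fail labelling, assuming a small array of made-up marks and the labels "Failed" and "Passed":

```python
import numpy as np

marks = np.array([30, 78, 44, 45, 91])  # hypothetical marks

# x and y may be scalars; they are broadcast against the condition.
status = np.where(marks < 45, "Failed", "Passed")
print(status)  # ['Failed' 'Passed' 'Failed' 'Passed' 'Passed']
```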

Example 3 (Error):

We have provided only the x argument along with the condition. This is ambiguous, as it doesn’t say what to do with the values that do not match the condition. We should provide both the x and y arguments, or neither. When only the condition is provided, the function acts like np.asarray(condition).nonzero().
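
The sketch below shows both situations with illustrative data: passing only x raises a ValueError, while passing the condition alone returns the indices where it is True.

```python
import numpy as np

a = np.array([5, 12, 8, 20, 3])  # made-up values

error_raised = False
try:
    np.where(a > 10, a)  # x without y
except ValueError as err:
    error_raised = True
    print("where failed:", err)

# Condition only: a tuple with one index array per dimension.
indices = np.where(a > 10)
print(indices[0])  # [1 3]
```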

Summary :

This is a really good function for conditional problems where one value is needed for the elements that satisfy a condition and another for those that do not.

Function 4 — numpy.compress

This function returns the slices of an array, along a given axis, that a condition array selects. This function takes the following arguments -

  1. A 1D conditional array or Boolean array.
  2. Array on which the compression is done.
  3. Axis along which the compression is done.

Example 1 (Working example):

This returns a compressed slice of the original array according to the condition.

Here the items of a that correspond to 1 in the condition are selected for the new array.
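
A sketch with a 0/1 condition, assuming a small 2-D array named a and selection along axis 0 (rows):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])  # illustrative values

# 0 drops a row, 1 keeps it.
selected = np.compress([0, 1, 1], a, axis=0)
print(selected)  # [[3 4]
                 #  [5 6]]
```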

Example 2 (Working example):

A Boolean array can also be used here. The values corresponding to True are selected and those corresponding to False are not selected.
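
The same idea with a Boolean condition, again on an illustrative array, once along the rows and once along the columns:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])  # illustrative values

# Keep the first and third rows.
rows = np.compress([True, False, True], a, axis=0)
print(rows)  # [[1 2]
             #  [5 6]]

# Keep only the second column.
cols = np.compress([False, True], a, axis=1)
print(cols)  # [[2]
             #  [4]
             #  [6]]
```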

Example 3 (Error):

Here the Boolean array contains more items than the array a, thus we get an error. We should’ve used only 3 values in the Boolean array as the last “True” doesn’t correspond to anything.
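
A sketch of this mistake, assuming a 1-D array a with three items and a condition with four. The exception is typically an IndexError, because the extra True points past the end of a; ValueError is caught as well, as a precaution.

```python
import numpy as np

a = np.array([10, 20, 30])  # three items (made-up values)

error_raised = False
try:
    # Four flags for three items: the last True has nothing to select.
    np.compress([True, False, True, True], a)
except (IndexError, ValueError) as err:
    error_raised = True
    print("compress failed:", err)

# The fix: one flag per item.
fixed = np.compress([True, False, True], a)
print(fixed)  # [10 30]
```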

Summary :

This is a good function for slicing the array which in turn lets us focus on important parts of the data.

Function 5 — numpy.trace

This is a really cool function that returns the sum of the values along a diagonal of an array. Arguments taken are -

  1. An array.
  2. Offset of the diagonal from the main diagonal, which can be negative or positive.
  3. Axes (axis1 and axis2, the two axes that define the 2-D sub-arrays from which the diagonals are taken).

Returns a single sum for a 2-dimensional array (a float for a float array, an integer for an integer array); if more than 2 dimensions exist, it returns an array holding the sum along the diagonal of each 2-D sub-array.

Example 1 (Working example):

We created an identity matrix with 9 rows and columns and then calculated the sum along its diagonal.
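
This example can be sketched with np.eye, which builds the identity matrix:

```python
import numpy as np

identity = np.eye(9)  # 9 x 9, ones on the main diagonal

# Nine ones on the diagonal add up to 9.0 (np.eye produces floats).
diagonal_sum = np.trace(identity)
print(diagonal_sum)  # 9.0
```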

Example 2 (Working example):

The sums along the 9 diagonals are stored in the returned array.
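
One way to get nine diagonals is a 3-D array of shape (3, 3, 9). This shape and its values are assumptions made for illustration. By default np.trace uses the first two axes, so it returns one sum per position along the last axis.

```python
import numpy as np

# Hypothetical 3-D array: nine 3 x 3 sub-arrays stacked along the last axis.
cube = np.arange(81).reshape(3, 3, 9)

# Sums cube[0, 0, k] + cube[1, 1, k] + cube[2, 2, k] for each k.
sums = np.trace(cube)
print(sums.shape)  # (9,)
print(sums)        # [108 111 114 117 120 123 126 129 132]
```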

Example 3 (Error):

The offset needs to be an integer, not a float. It can be negative or positive, which corresponds to diagonals below and above the main diagonal respectively.
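
A sketch of the error and of valid offsets, using a small illustrative matrix. The float offset of 1.5 is an assumption about what triggers the error.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)  # [[0 1 2], [3 4 5], [6 7 8]]

error_raised = False
try:
    np.trace(m, offset=1.5)  # a float offset is rejected
except TypeError as err:
    error_raised = True
    print("trace failed:", err)

main = np.trace(m)              # 0 + 4 + 8
above = np.trace(m, offset=1)   # 1 + 5, the diagonal above the main one
below = np.trace(m, offset=-1)  # 3 + 7, the diagonal below the main one
print(main, above, below)  # 12 6 10
```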

Summary :

This function shows that NumPy is a great tool which has cool functions for almost anything!

Conclusion

This was a good look into the NumPy documentation, which is vast and almost impossible to know completely. We found very useful functions that can do a particular job quickly, efficiently and with less code. That’s the power of the NumPy library. We saw functions for organizing data and for calculations, with real-life as well as theoretical examples. We also saw what to avoid when using these functions. A lot more practice is still needed to get comfortable with this amazing and powerful library. That comes next, along with learning about Pandas, another library I’m really excited about.

If you liked this, please do check out my article about Kickstarter projects, where I do an in-depth analysis of the success rate of Kickstarter projects using Python, NumPy, Pandas, Matplotlib, Seaborn and a dataset containing 300,000+ Kickstarter projects.
