Common placeholders in numpy arrays

Arpit Omprakash
Published in Byte-Sized-Code
Apr 5, 2020

The core data structure in numpy is the n-dimensional array. Numpy is an indispensable library for anyone performing data manipulation or calculations, but some of its features may not be that straightforward for beginners. Often, the elements of an array are not known in advance (especially in data science), and we need some kind of placeholder for the values until the real ones arrive. Here we take a look at the different placeholder functions that numpy offers and analyze them in depth.

Creating an array in numpy

Before all else, let’s create an array in numpy that can be manipulated. Numpy offers many ways to create an array, including converting existing lists and tuples to arrays. Although not required, it is generally good practice to state the data type that you want in the ‘dtype’ argument. If you leave it out, numpy infers the data type from the given data and upcasts where required. That is, if you provide a mix of integers and floats, it converts the integers into floats too. The following is the simplest way to make a new numpy array from scratch:

>>> import numpy as np
>>> np.array([1, 2, 3], dtype=float)
array([1., 2., 3.])
# Upcasting
# If one value is a float and the others are ints, everything is converted to float
>>> np.array([1, 2, 3.0])
array([1., 2., 3.])
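To see the upcasting rule at work, one can inspect the dtype attribute of each array. The following is a small sketch; the variable names are only illustrative, and the exact integer width (32 or 64 bits) depends on the platform, so the example checks only the kind of the integer dtype.

```python
import numpy as np

# dtype is inferred from the data: all ints give an integer dtype
ints = np.array([1, 2, 3])
# a single float in the input upcasts every element to float64
mixed = np.array([1, 2, 3.0])
# an explicit dtype overrides the inference
forced = np.array([1, 2, 3], dtype=float)

print(ints.dtype.kind)  # 'i' (signed integer; exact width is platform dependent)
print(mixed.dtype)      # float64
print(forced.dtype)     # float64
```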

The shape of a numpy array defines the length of each of its dimensions: the number of rows, columns, and so on. (It is sometimes loosely referred to as the size, but in numpy the size attribute is the total number of elements.) As shown in the figure below, a 1-dimensional array has only one axis, and its shape is given by (n,), where n is the number of elements along that axis. A 2-dimensional array has two axes, and its shape has the form (m, n), where m denotes the number of rows and n denotes the number of columns. The extra dimension in a 3-dimensional array defines its depth, and so on.

The dimensions of a numpy array
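The same idea can be checked in code. This is a minimal sketch with made-up arrays: ndim is the number of axes, shape is a tuple with the length of each axis, and size is the total number of elements.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.zeros((2, 3, 4))

for arr in (one_d, two_d, three_d):
    # number of axes, length of each axis, total element count
    print(arr.ndim, arr.shape, arr.size)
# 1 (3,) 3
# 2 (2, 3) 6
# 3 (2, 3, 4) 24
```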

Now that we know how to create a simple array, we move on to placeholders.

Placeholders

Except when making tutorials (or giving examples), we seldom know in advance what an array will contain. Arrays are often initialized with some numbers that act as placeholders for the real, still unknown values. In my experience, this comes into play most often in data science, where the weights or biases in a numpy array are first set to random values (or sometimes to zeros or ones) and later modified by the algorithm during training. Numpy offers about six types of placeholders that one can use while creating a new array: six main functions plus a few companion functions, all discussed below.

np.zeros

As the name suggests, the np.zeros function fills the whole array with zeros. A simple example is given below:

>>> np.zeros(5)
array([0., 0., 0., 0., 0.])
>>> np.zeros((2, 3), dtype=int)
array([[0, 0, 0],
       [0, 0, 0]])

The zeros function accepts the shape (an integer or a tuple of integers) as the first parameter and dtype as an optional parameter. The dtype is float by default, and thus in the first example, all the zeros are floats.
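A typical use of a zero-filled array is as an accumulator that is updated later. The sketch below counts how often each value occurs; the rolls list is hypothetical data invented for the illustration.

```python
import numpy as np

rolls = [2, 5, 2, 0, 5, 5]        # hypothetical observations, each in 0..5
counts = np.zeros(6, dtype=int)   # one counter per possible value, all starting at 0

for r in rolls:
    counts[r] += 1                # placeholders are replaced by real counts

print(counts.tolist())  # [1, 0, 2, 0, 0, 3]
```

Starting from zeros matters here because each entry is incremented rather than overwritten.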

np.ones

The np.ones function fills the whole array with ones. Here too, the default data type is float. As with the previous function, we provide the shape of the output array as the first argument.

>>> np.ones((5,), dtype=int)
array([1, 1, 1, 1, 1])
>>> np.ones((2, 1))
array([[1.],
       [1.]])
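One place a column of ones shows up in practice is as a bias (intercept) column placed in front of a feature matrix. The following is only a sketch under that assumption; the features array and the variable names are made up for the example.

```python
import numpy as np

# hypothetical feature matrix: 3 samples, 2 features
features = np.array([[2.0, 3.0],
                     [4.0, 5.0],
                     [6.0, 7.0]])

bias = np.ones((features.shape[0], 1))   # one row per sample, a single column of ones
design = np.hstack([bias, features])     # bias column followed by the features

print(design.shape)           # (3, 3)
print(design[:, 0].tolist())  # [1.0, 1.0, 1.0]
```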

np.empty

Now that I think of it, all the functions are perfectly named in Python. (My brain to me: that’s what happens in any language.) So, the empty function produces an array that is ‘empty’ in the sense that its entries are uninitialized. As opposed to np.zeros, here we do not get an array of zeros: numpy reserves the memory but writes nothing into it. As the values cannot literally be blank, the array shows whatever bytes happened to be in that memory, which often look like tiny or otherwise meaningless numbers. Thus, one has to assign every value manually before reading from the array. Because the initialization step is skipped, np.empty is marginally faster than the functions that fill the array. The first argument, as always, is the shape of the output array.

>>> np.empty([2, 2])
array([[-9.74499359e+001,  6.69583040e-309],
       [ 2.13182611e-314,  3.06959433e-309]])  # uninitialized; your values will differ
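Since the contents of an np.empty array are arbitrary, the safe pattern is to write every entry before reading any of them. Here is a minimal sketch of that pattern; the names n and squares are only illustrative.

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=float)   # contents are arbitrary at this point

for i in range(n):
    squares[i] = i ** 2              # every slot is overwritten before it is read

print(squares.tolist())  # [0.0, 1.0, 4.0, 9.0, 16.0]
```

If some entries might never be assigned, np.zeros or np.full is usually the safer choice.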

np.full

The np.full function is structured a bit differently from the ones so far. Along with the shape and data type, it also takes another argument called ‘fill_value’ and returns an array of the given shape filled with that value. The fill_value is usually a scalar (a single number), although an array-like value that broadcasts to the requested shape also works. If dtype is omitted, it is inferred from the fill_value.

>>> np.full((2, 2), fill_value=np.inf)
array([[inf, inf],
       [inf, inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])
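The data type of the result follows the fill_value unless a dtype is given, which is why the second example above contains integers. The sketch below illustrates this, along with an array-like fill_value that is broadcast across the rows; the variable names are only for the example.

```python
import numpy as np

ints = np.full((2, 2), 10)                  # integer fill_value gives an integer array
floats = np.full((2, 2), 10, dtype=float)   # an explicit dtype overrides the inference
rows = np.full((2, 3), [1, 2, 3])           # array-like fill_value is broadcast to each row

print(ints.dtype.kind, floats.dtype)  # i float64
print(rows.tolist())                  # [[1, 2, 3], [1, 2, 3]]
```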

The like functions


Instead of providing the shape of the output array, one can pass an existing array to the respective ‘like’ function and obtain a new array with the same shape and data type as the one given. The above four functions have corresponding ‘like’ functions named np.zeros_like, np.ones_like, np.empty_like, and np.full_like. The following example makes things clearer.

>>> x = np.array([[1, 2, 3], [12, 3, 4]], dtype=float)
>>> x.shape
(2, 3)
>>> np.zeros_like(x)
array([[0., 0., 0.],
       [0., 0., 0.]])
>>> np.ones_like(x)
array([[1., 1., 1.],
       [1., 1., 1.]])
>>> np.empty_like(x)  # uninitialized: the values shown are leftovers and will vary
array([[1., 1., 1.],
       [1., 1., 1.]])
>>> np.full_like(x, fill_value=5)
array([[5., 5., 5.],
       [5., 5., 5.]])
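One detail worth knowing is that the ‘like’ functions copy the data type of the given array as well as its shape. With an integer array, a fractional fill_value is therefore cast to an integer unless a dtype is passed explicitly. A short sketch, with illustrative variable names:

```python
import numpy as np

x = np.array([[1, 2, 3], [12, 3, 4]])            # integer array

zeros_int = np.zeros_like(x)                     # same shape and dtype as x
truncated = np.full_like(x, 2.7)                 # 2.7 is cast to the integer dtype of x
kept = np.full_like(x, 2.7, dtype=float)         # explicit dtype keeps the fraction

print(zeros_int.dtype == x.dtype)  # True
print(truncated.tolist())          # [[2, 2, 2], [2, 2, 2]]
print(kept.tolist())               # [[2.7, 2.7, 2.7], [2.7, 2.7, 2.7]]
```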

np.eye

Okay, the terminology is not that good at times, I guess. The np.eye function returns a 2-D array with 1’s on a diagonal and 0’s everywhere else; with the default arguments, this is the identity matrix. Unlike the previous functions that take a tuple or list for shape, the np.eye function takes two individual parameters for rows and columns, and if the number of columns is left out, it equals the number of rows. It also takes an argument called ‘k’ which selects the diagonal that should be filled with ones. A value of 0 (the default) fills the main diagonal, a positive value fills a diagonal above it, and a negative value fills a diagonal below it. An example will make things clearer.

>>> np.eye(2, dtype=int)
array([[1, 0],
       [0, 1]])
>>> np.eye(3, k=1)
array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])
>>> np.eye(2, 3, k=0)
array([[1., 0., 0.],
       [0., 1., 0.]])
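The examples above do not show a negative k. The following sketch fills the diagonal just below the main one, and also checks that the square, default-k case matches np.identity.

```python
import numpy as np

# with default arguments, np.eye(n) is the n x n identity matrix
same = np.array_equal(np.eye(3), np.identity(3))
print(same)  # True

# k=-1 puts the ones on the first diagonal below the main one
lower = np.eye(3, k=-1, dtype=int)
print(lower.tolist())  # [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
```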

np.random.random

As mentioned earlier, there are times when we want some randomness in the values initialized in an array, and the np.random.random function serves that purpose. It returns random values from a continuous uniform distribution over the half-open interval [0.0, 1.0), i.e., the values range from 0.0 up to, but not including, 1.0, and every value in that range is equally likely. Called without arguments, it returns a single number; given a shape, it returns an array of that shape.

>>> np.random.random()
0.8617819947043581
>>> np.random.random((3,2))
array([[0.85452623, 0.31364848],
       [0.16947569, 0.27342105],
       [0.49678692, 0.44941224]])
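For reproducible results, numpy also provides the newer Generator interface, which its documentation recommends for new code. The sketch below is one possible way to draw the same kind of uniform values with a fixed seed and rescale them to another interval; the seed value, the interval [-0.5, 0.5), and the variable names are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)    # seed chosen arbitrarily; same seed, same values
weights = rng.random((3, 2))            # uniform samples in [0.0, 1.0)

low, high = -0.5, 0.5                   # hypothetical target interval
scaled = low + (high - low) * weights   # uniform samples in [low, high)

print(weights.shape)                                    # (3, 2)
print(bool(((weights >= 0.0) & (weights < 1.0)).all())) # True
```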

Conclusion

We have seen a lot of ways to introduce placeholders into arrays in numpy. The sheer number of different functions tells us that using placeholders in numpy is a very common and important task. I prefer using np.empty as it is just a bit faster, and then I can assign values accordingly; just remember that its contents are meaningless until you do. Another thing to note is that the np.eye function always returns a two-dimensional array. Which one did you like the most?

I'm a Programming and Statistics enthusiast studying Biology. To find out if we have common interests, have a look at my thoughts: https://aceking007.github.io/