Manipulating Multidimensional Numpy Points

Published in Deep Learning Turkey · 5 min read · Jan 25, 2019

When we work with time series, text, or images, the data may not arrive in the shape we need, so we have to reshape it before applying further steps. This is often the case with scikit-learn, whose estimators generally expect the input data as a 2D array.
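
As a minimal sketch of the scikit-learn case, a flat array of samples can be turned into a two-dimensional layout with one row per sample and one column per feature. The `feature` array and its values are made up for illustration.

```python
import numpy as np

# Hypothetical single-feature data: five samples stored as a flat 1D array.
feature = np.array([0.5, 1.2, 3.3, 4.1, 5.0])

# One row per sample, one column per feature.
# -1 lets NumPy infer the number of rows from the size of the array.
feature_2d = feature.reshape(-1, 1)

print(feature.shape)     # (5,)
print(feature_2d.shape)  # (5, 1)
```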

In this article, I have collected a few Numpy techniques for manipulating the shape of data; some of them overlap and some serve different needs. The goal is to give you a high-level idea of how to transform data into a specific shape using Numpy.

In brief, data stored in multidimensional Numpy arrays are also called tensors. A single number (a scalar) is a zero-dimensional tensor, or scalar tensor. An array of numbers is a vector (1D tensor), an array of vectors is a matrix (2D tensor), an array of matrices is a 3D tensor, and so on.
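
The hierarchy above can be checked with the `ndim` and `shape` attributes of an array. This is a small sketch; the variable names and values are chosen only for illustration.

```python
import numpy as np

scalar = np.array(7)                                   # 0D tensor
vector = np.array([1, 2, 3])                           # 1D tensor
matrix = np.array([[1, 2], [3, 4]])                    # 2D tensor
cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # 3D tensor

# Print the number of dimensions and the shape of each tensor.
for t in (scalar, vector, matrix, cube):
    print(t.ndim, t.shape)
# 0 ()
# 1 (3,)
# 2 (2, 2)
# 3 (2, 2, 2)
```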

Now that we have defined what a tensor is, let’s start with Numpy's reshape method. First, we create a three-dimensional array to use in the examples below.

> import numpy as np
> x = np.array([[[1,2,3,4,5], [4,5,6,7,8]],
                [[2,4,6,8,12],[3,6,9,12,15]],
                [[3,8,8,9,8], [9,9,10,11,19]]])
> x.shape
(3, 2, 5)

Suppose we need to change the shape of x into the following dimensions with given shapes:

2D array, (3,10) and (2,15)
1D array, (30,)
4D array, (1,3,2,5) and (1,3,5,2)

Since x is a numpy.ndarray instance, we can call the reshape method directly on it. reshape returns an array with the same data in a new shape; the equivalent function is np.reshape. To convert the (3x2x5) array into (3x10):

> x.reshape(3, 10) # the size must be equal to the size of x, i.e. 30
array([[ 1,  2,  3,  4,  5,  4,  5,  6,  7,  8],
       [ 2,  4,  6,  8, 12,  3,  6,  9, 12, 15],
       [ 3,  8,  8,  9,  8,  9,  9, 10, 11, 19]])

# or with parentheses
> x.reshape((3, 10))

Similarly, for (2x15) we pass two axis lengths whose product equals the size of x:

> x.reshape(2, 15)
array([[ 1,  2,  3,  4,  5,  4,  5,  6,  7,  8,  2,  4,  6,  8, 12],
       [ 3,  6,  9, 12, 15,  3,  8,  8,  9,  8,  9,  9, 10, 11, 19]])
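
The size rule can be sketched as follows: one axis may be given as -1 so that NumPy infers its length, and a shape whose product differs from the size of x raises a `ValueError`. The variable names are only for illustration.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

# Give one axis explicitly and let NumPy infer the other from x.size (30).
rows_fixed = x.reshape(3, -1)
cols_fixed = x.reshape(-1, 15)
print(rows_fixed.shape)  # (3, 10)
print(cols_fixed.shape)  # (2, 15)

# A shape whose product is not 30 is rejected.
failed = False
try:
    x.reshape(4, 10)  # 4 * 10 = 40, but x holds 30 elements
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```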

There are a couple of ways to convert n-dimensional data into a flattened one.

> x.reshape(-1) # or x.reshape(x.size)
array([ 1,  2,  3,  4,  5,  4,  5,  6,  7,  8,  2,  4,  6,  8, 12,  3,  6, 9, 12, 15,  3,  8,  8,  9,  8,  9,  9, 10, 11, 19])

In the preceding expression, we pass -1, which lets Numpy infer the length of that axis from the size of the array, so the 3D array is reshaped into a 1D vector. Besides reshape, we can use two other methods to flatten an array:

> x.flatten()
array([ 1,  2,  3,  4,  5,  4,  5,  6,  7,  8,  2,  4,  6,  8, 12,  3,  6, 9, 12, 15,  3,  8,  8,  9,  8,  9,  9, 10, 11, 19])

> x.ravel()
array([ 1,  2,  3,  4,  5,  4,  5,  6,  7,  8,  2,  4,  6,  8, 12,  3,  6, 9, 12, 15,  3,  8,  8,  9,  8,  9,  9, 10, 11, 19])
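
Although the two methods print the same values, they differ in one respect that matters when the result is modified: flatten always returns a copy, while ravel returns a view of the original data whenever the memory layout allows it. A short sketch of that difference, with illustrative variable names:

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

flat_copy = x.flatten()  # always a new, independent array
flat_view = x.ravel()    # a view here, because x is stored contiguously

print(np.shares_memory(x, flat_copy))  # False
print(np.shares_memory(x, flat_view))  # True

flat_view[0] = 100   # writes through to x
print(x[0, 0, 0])    # 100

flat_copy[1] = -1    # leaves x untouched
print(x[0, 0, 1])    # 2
```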

In addition, reshape lets us increase the number of dimensions of an array, as long as the total size stays the same:

# the dimensions we specify already multiply to 30, the size of x, so reshape only adds an axis of length 1.
> x.reshape(1,3,2,5) # or x.reshape((1,) + x.shape)
array([[[[ 1,  2,  3,  4,  5],
         [ 4,  5,  6,  7,  8]],

        [[ 2,  4,  6,  8, 12],
         [ 3,  6,  9, 12, 15]],

        [[ 3,  8,  8,  9,  8],
         [ 9,  9, 10, 11, 19]]]])

# to get the number 15
> x.reshape(1,3,2,5)[0,1,1,4]
15

# a 4D array
> x.reshape(1,3,5,2)
array([[[[ 1,  2],
         [ 3,  4],
         [ 5,  4],
         [ 5,  6],
         [ 7,  8]],

        [[ 2,  4],
         [ 6,  8],
         [12,  3],
         [ 6,  9],
         [12, 15]],

        [[ 3,  8],
         [ 8,  9],
         [ 8,  9],
         [ 9, 10],
         [11, 19]]]])
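
Note that reshaping to (1,3,5,2) keeps the elements in their original order and only regroups them in pairs; it does not swap the last two axes. If a swap is what you need, np.swapaxes is one option. The sketch below compares the two on the first block of x (the leading axis of length 1 is left out to keep the output short).

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

regrouped = x.reshape(3, 5, 2)   # same element order, cut into rows of two
swapped = np.swapaxes(x, 1, 2)   # also shape (3, 5, 2), but rows become columns

print(regrouped[0].tolist())  # [[1, 2], [3, 4], [5, 4], [5, 6], [7, 8]]
print(swapped[0].tolist())    # [[1, 4], [2, 5], [3, 6], [4, 7], [5, 8]]
print(np.array_equal(regrouped, swapped))  # False
```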

The next tool is np.newaxis, which inserts a new axis of length 1 at a specific position. Of the tasks above, it can therefore solve only part of the last one: it produces the 4D shape (1,3,2,5) but not (1,3,5,2). Inserting a new axis at the zeroth position looks like this:

> x[np.newaxis].shape  # or x[np.newaxis,:,:,:].shape
(1, 3, 2, 5)

> x[np.newaxis]
array([[[[ 1,  2,  3,  4,  5],
         [ 4,  5,  6,  7,  8]],

        [[ 2,  4,  6,  8, 12],
         [ 3,  6,  9, 12, 15]],

        [[ 3,  8,  8,  9,  8],
         [ 9,  9, 10, 11, 19]]]])

With np.newaxis, we can only add axes; the existing axes keep their order and length.
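
The new axis does not have to go first. As a brief sketch, placing np.newaxis at other positions in the index gives the following shapes (the variable names are illustrative):

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

second = x[:, np.newaxis]    # new axis at position 1
third = x[:, :, np.newaxis]  # new axis at position 2
last = x[..., np.newaxis]    # new axis at the end

print(second.shape)  # (3, 1, 2, 5)
print(third.shape)   # (3, 2, 1, 5)
print(last.shape)    # (3, 2, 5, 1)
```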

Another option, although it was not designed for this purpose, is None. None is not a Numpy-specific term: in Python it denotes the absence of a value. Nonetheless, we can use None in place of np.newaxis when indexing Numpy arrays:

> x[None].shape == x[None, :].shape
True

> x[None, :, :, :].shape
(1, 3, 2, 5)

> x[None, :, :, :]
array([[[[ 1,  2,  3,  4,  5],
         [ 4,  5,  6,  7,  8]],

        [[ 2,  4,  6,  8, 12],
         [ 3,  6,  9, 12, 15]],

        [[ 3,  8,  8,  9,  8],
         [ 9,  9, 10, 11, 19]]]])
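
The reason None works here is that np.newaxis is simply an alias for None, which the following short check confirms:

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

# np.newaxis and None are the same object.
same_object = np.newaxis is None
print(same_object)  # True

# Indexing with either one therefore gives identical results.
same_result = np.array_equal(x[None], x[np.newaxis])
print(same_result)  # True
```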

The last tool in this article is the np.expand_dims function. The following is equivalent to x[np.newaxis], x[np.newaxis, :], x[np.newaxis, :, :], and x[None, :, :, :]:

> np.expand_dims(x, axis=0).shape
(1, 3, 2, 5)

> np.expand_dims(x, axis=0)
array([[[[ 1,  2,  3,  4,  5],
         [ 4,  5,  6,  7,  8]],

        [[ 2,  4,  6,  8, 12],
         [ 3,  6,  9, 12, 15]],

        [[ 3,  8,  8,  9,  8],
         [ 9,  9, 10, 11, 19]]]])
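
The axis argument controls where the new axis of length 1 is placed. A small sketch with a few other positions, including a negative index that counts from the end (variable names are illustrative):

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
              [[2, 4, 6, 8, 12], [3, 6, 9, 12, 15]],
              [[3, 8, 8, 9, 8], [9, 9, 10, 11, 19]]])

front = np.expand_dims(x, axis=0)   # same as x[np.newaxis]
middle = np.expand_dims(x, axis=1)  # same as x[:, np.newaxis]
back = np.expand_dims(x, axis=-1)   # same as x[..., np.newaxis]

print(front.shape)   # (1, 3, 2, 5)
print(middle.shape)  # (3, 1, 2, 5)
print(back.shape)    # (3, 2, 5, 1)
```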

It is important to note that np.expand_dims(), None, and np.newaxis can only add dimensions, whereas reshape can also regroup or reduce them. All in all, the techniques above let us manipulate the shape of data, so they are worth keeping in our arsenal.


Written by Hakan Özler