Stacking and Splitting NumPy arrays like a Pro: Part 1

Published in

Nerd For Tech

3 min read · Mar 21, 2023

Understand how to stack NumPy arrays in the first part of this two-part series.

NumPy is one of the most important libraries for data science, and it provides most of the functions needed to work with numerical data, so it pays to master its ins and outs. This is part 1 of 2, and in this article we are going to see how to stack NumPy arrays. Stacking lets you join two or more NumPy arrays along an axis that you specify.

This article assumes basic knowledge of working with NumPy. If you don't have that yet, read the following article to get familiar with the basics of NumPy.

A comprehensive guide to get you started with NumPy

Learn the basics of NumPy through this guide.

devsheth09.medium.com

What is numpy.stack?

NumPy's stack function joins a sequence of NumPy arrays along a new axis and returns a single array with one more dimension than the inputs. The main requirement to keep in mind is that all the arrays must have the same shape, and therefore the same number of dimensions.
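
Both points, the extra dimension and the shape requirement, can be checked directly. This is a minimal sketch; the array names are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Each input is 1D with shape (3,); the stacked result gains one new axis.
stacked = np.stack([a, b])
print(a.shape, stacked.shape)  # (3,) (2, 3)

# Arrays with different shapes are rejected with a ValueError.
c = np.array([7, 8])
try:
    np.stack([a, c])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```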

The parameters of np.stack are:

arrays (mandatory) — The arrays that we want to stack, they must be of the same shape.
axis (optional) — The axis along which we want to stack the arrays.
out (optional) — The destination array where we want our result to be. It must be of the same shape that we expect the output array to be.
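
The out parameter is the least familiar of the three, so here is a small sketch of how it might be used. The name result is just an illustrative choice, and its dtype is taken from the input so that it matches on any platform.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Preallocate the destination with the shape np.stack will produce:
# two arrays of shape (3,) stacked on axis 0 give shape (2, 3).
result = np.empty((2, 3), dtype=a.dtype)

# The stacked values are written into the preallocated array.
np.stack([a, b], axis=0, out=result)
print(result)
# [[1 2 3]
#  [4 5 6]]
```

This can be handy when the same destination buffer is reused many times, for example inside a loop.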

Stacking row-wise vs column-wise

The axis parameter is the position of the new axis in the result. When axis is 0, which is also the default value, 1D arrays are stacked on top of each other, i.e. each array becomes a row. When axis is 1, 1D arrays are stacked side by side, i.e. each array becomes a column.

Let’s understand this better with some examples.

>>> import numpy as np
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])

# First we will stack on top of each other, which is the default behavior
>>> np.stack([a,b])
array([[1, 2, 3],
       [4, 5, 6]])

# Now we will use axis=0, which should also give us the same output as above.
# Even if we want to use default behavior, it is always better 
# to mention the value we want to use.
>>> np.stack([a,b], axis=0)
array([[1, 2, 3],
       [4, 5, 6]])

# Now we will use axis=1, which will stack them side-by-side.
>>> np.stack([a,b], axis=1)
array([[1, 4],
       [2, 5],
       [3, 6]])

The above examples use 1D arrays; let’s also see a couple of examples with 2D arrays. Note that stacking 2D arrays always produces a 3D array.

>>> a = np.array([[1,2,3], [4,5,6]])
>>> b = np.array([[7,8,9],[10,11,12]])

# With axis=0, a and b become result[0] and result[1].
>>> np.stack([a,b], axis = 0)
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]])

# With axis=1, the rows are paired up: result[:, 0] is a and result[:, 1] is b.
>>> np.stack([a,b], axis = 1)
array([[[ 1,  2,  3],
        [ 7,  8,  9]],
       [[ 4,  5,  6],
        [10, 11, 12]]])
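
Since the two outputs above look similar, printing the result shape for each axis value may make the difference easier to see. A small sketch reusing the same a and b; the shapes dictionary is only there for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])       # shape (2, 3)
b = np.array([[7, 8, 9], [10, 11, 12]])    # shape (2, 3)

# The new axis, of length 2 (one entry per input array),
# is inserted at the position given by `axis`.
shapes = {axis: np.stack([a, b], axis=axis).shape for axis in (0, 1, 2, -1)}
for axis, shape in shapes.items():
    print(axis, shape)
# 0 (2, 2, 3)
# 1 (2, 2, 3)
# 2 (2, 3, 2)
# -1 (2, 3, 2)
```

A negative axis counts from the end, so axis=-1 places the new axis last.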

Numpy hstack and vstack

NumPy's hstack function takes a sequence of arrays and joins them horizontally. For 2D arrays, they must have the same number of rows. The number of columns need not be the same; it can differ between arrays and they will stack without any issue.

>>> a = np.array([[1,1], [1,1]])
>>> b = np.array([[2,2,2,2], [2,2,2,2]])
>>> np.hstack([a,b])
array([[1, 1, 2, 2, 2, 2],
       [1, 1, 2, 2, 2, 2]])

Here, hstack took array b and joined it horizontally to array a.
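
To see what happens when the row counts do not match, here is a short sketch. It also checks that, for 2D inputs, hstack gives the same result as np.concatenate along axis 1. The array c and the flag names are made up for this illustration.

```python
import numpy as np

a = np.array([[1, 1], [1, 1]])               # 2 rows, 2 columns
b = np.array([[2, 2, 2, 2], [2, 2, 2, 2]])   # 2 rows, 4 columns

# For 2D inputs, hstack matches concatenate along axis 1 (columns).
same = np.array_equal(np.hstack([a, b]), np.concatenate([a, b], axis=1))
print(same)  # True

# A different number of rows cannot be joined horizontally.
c = np.array([[3, 3], [3, 3], [3, 3]])       # 3 rows
try:
    np.hstack([a, c])
    rows_mismatch_raised = False
except ValueError:
    rows_mismatch_raised = True
print(rows_mismatch_raised)  # True
```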

NumPy's vstack function takes a sequence of arrays with the same number of columns and joins them vertically. Similar to hstack, the number of rows can differ here and the arrays will stack just as well.

>>> a = np.array([[1,1,1], [1,1,1]])
>>> b = np.array([[2,2,2], [2,2,2], [2,2,2], [2,2,2]])
>>> np.vstack([a,b])
array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]])

Here, vstack took array b and stacked it vertically below array a.
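
vstack is not limited to two arrays, and it treats a 1D array as a single row. A minimal sketch, where row and combined are hypothetical names used only for this example:

```python
import numpy as np

a = np.array([[1, 1, 1], [1, 1, 1]])   # shape (2, 3)
row = np.array([9, 9, 9])              # shape (3,), treated as one row
b = np.array([[2, 2, 2]])              # shape (1, 3)

# All three inputs have 3 columns, so they can be stacked vertically.
combined = np.vstack([a, row, b])
print(combined.shape)  # (4, 3)
print(combined)
# [[1 1 1]
#  [1 1 1]
#  [9 9 9]
#  [2 2 2]]
```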

Both functions are easy to understand from their names: hstack joins arrays horizontally, or side by side, whereas vstack joins arrays vertically, or on top of each other.
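
One detail worth knowing is how these functions behave on 1D arrays, where they differ from np.stack. The sketch below compares the three on the same pair of 1D arrays; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# hstack joins 1D arrays end to end, so the result stays 1D.
h = np.hstack([a, b])
print(h)  # [1 2 3 4 5 6]

# vstack turns each 1D array into a row of a 2D result.
v = np.vstack([a, b])
print(v.shape)  # (2, 3)

# np.stack with axis=1 turns each 1D array into a column instead.
s = np.stack([a, b], axis=1)
print(s.shape)  # (3, 2)
```

So for 1D inputs, vstack matches np.stack with axis=0, while hstack does not match np.stack with axis=1.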

That’s it for this part. In the next part, I will explain how to split NumPy arrays effectively.
