# NumPy for Data Science: Part 2

**Array Indexing and Slicing**

Hello! Welcome to the second tutorial in the NumPy series: **Array Indexing and Slicing**. In this tutorial, I discuss the following topics with examples.

## Topics discussed

- **Definitions of Indexing and Slicing**
- **1D, 2D and 3D Array Indexing and Slicing**
- **Boolean-Valued Indexing**
- **The Mutability of an Array**
- **Copies and References of a NumPy Slice**
- **The IndexError**

## Dependencies

I assume that you have already read **NumPy for Data Science: Part 1**. If you haven’t, please read it first. You should also be familiar with the basics of the Python programming language and its object-oriented programming (OOP) concepts.

# Indexing and Slicing: Introduction

The elements of an ndarray object (simply called an **array**) can be accessed and modified by indexing or slicing. The **index** refers to the location of an element in an array. Array indexing uses the standard square bracket **[ ]** notation. Within the square brackets, a variety of index formats are used for different types of element selection. Ndarray objects follow **zero-based indexing**, meaning that the 1st element has the index 0, the 2nd element has the index 1, and so on.

**Slicing** allows you to extract portions of an array to generate new arrays. Slices are specified using the **:** notation. They are used to select ranges and sequences of elements.

First, I discuss indexing and slicing for 1-dimensional arrays, then 2-dimensional and 3-dimensional arrays.

# Indexing and Slicing: 1D-Arrays

Along a single dimension (axis):

- Integers are used to select single elements.
- A list of integers is used to select non-consecutive, multiple elements.
- Slices are used to select ranges and sequences of elements.
- Positive integers are used to index elements from the beginning of the array (index starts at 0).
- Negative integers are used to index elements from the end of the array, where the last element is indexed with –1, the second to last element with –2, and so on.
- A range of elements can be selected using the expression **m:n**, which selects elements starting with index **m** and ending with index **n − 1** (the element at index **n** is not included).
- The expression **m:n:p** selects every **p**th element between **m** and **n**. If **p** is negative, elements are returned in reversed order.

Now, I discuss examples of indexing and slicing of 1D arrays. In these examples, **a** refers to the following ndarray object (array).
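As a stand-in for the array used in the 1D examples, here is a minimal sketch that assumes ten integers built with `np.arange()`. The values are an assumption chosen only to make the selections easy to follow.

```python
import numpy as np

# Hypothetical stand-in for the tutorial's 1D array (values are an assumption)
a = np.arange(10, 20)

print(a)        # [10 11 12 13 14 15 16 17 18 19]
print(a.ndim)   # 1
print(a.shape)  # (10,)
```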

**a[m]** selects the element at index **m**, where **m** is an integer starting from 0.

**a[-m]** selects the **m**th element from the end of the array, where the last element is indexed with –1, the second to last element with –2, and so on.

**a[[w, x, y, …]]** selects multiple elements at indexes w, x, y, and so on. If the indexes are consecutive, we can use the **a[m:n]** notation instead to select the elements more easily (which is discussed next).
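The three selections above can be sketched as follows. The array `a` is a hypothetical example, not necessarily the one used originally.

```python
import numpy as np

a = np.arange(10, 20)  # hypothetical example array: 10, 11, ..., 19

first = a[0]           # element at index 0
last = a[-1]           # last element
second_last = a[-2]    # second to last element
picked = a[[1, 4, 7]]  # a list of integers selects non-consecutive elements

print(first, last, second_last)  # 10 19 18
print(picked)                    # [11 14 17]
```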

**a[m:n]** selects a range of consecutive elements starting with index **m** and ending with index **n − 1** (the element at index **n** is not included).

**a[:]** selects all the elements in the given axis.

**a[:n]** selects elements starting with index 0 and going up to index n − 1 (the element at index **n** is not included).

**a[m:]** selects elements starting with index **m** (integer) and going up to the last element in the array.
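A short sketch of these four slice forms, again on a hypothetical array `a` (the values are an assumption):

```python
import numpy as np

a = np.arange(10, 20)  # hypothetical example array: 10, 11, ..., 19

middle = a[2:5]    # indexes 2, 3 and 4 (index 5 is excluded)
everything = a[:]  # all elements along the axis
head = a[:3]       # indexes 0, 1 and 2
tail = a[7:]       # index 7 up to the last element

print(middle)  # [12 13 14]
print(head)    # [10 11 12]
print(tail)    # [17 18 19]
```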

**a[m:n:p]** selects elements with index **m** through **n** (exclusive), with increment **p**.

**Case 1:** When p=1, **a[m:n:p]** is exactly the same as **a[m:n]**.

**Case 2:** When p=3, **a[m:n:p]** selects every third element, starting at index **m** and stopping before index **n**.

**Case 3:** When p=-1, **a[::-1]** selects all the elements in reverse order.
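The three cases can be sketched like this. The array and the chosen bounds (1 and 8) are assumptions for illustration.

```python
import numpy as np

a = np.arange(10, 20)  # hypothetical example array: 10, 11, ..., 19

same = a[1:8:1]         # Case 1: a step of 1 behaves like a[1:8]
every_third = a[1:8:3]  # Case 2: indexes 1, 4 and 7
reversed_a = a[::-1]    # Case 3: all elements in reverse order

print(every_third)  # [11 14 17]
print(reversed_a)   # [19 18 17 16 15 14 13 12 11 10]
```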

# Indexing and Slicing: 2D-Arrays

**With multidimensional arrays, element selections like those previously introduced can be applied on each axis (dimension).** The result is a reduced array where each element matches the given selection rules. In the following examples, **a** refers to the following ndarray object. You can create it by using the NumPy *arange()* function and the *reshape()* method of the ndarray class. This time, the array **a** is two-dimensional. I use this array to illustrate the indexing and slicing of 2D arrays.

- Element selections like those previously introduced can be applied on each axis (dimension).
- We need to use two indexes, separated by a comma, to select elements in a 2D-array. The format is **a[…, …]**.
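As a minimal sketch, a 2D array can be built with `np.arange()` and `reshape()` as described above. The 3 × 4 shape and the values are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2D example array: 12 integers arranged as 3 rows and 4 columns
a = np.arange(12).reshape(3, 4)

print(a.shape)  # (3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Two indexes separated by a comma: row index first, then column index
element = a[1, 2]
print(element)  # 6
```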

Now, consider the following examples of indexing and slicing of 2d arrays.

**Example 1:** Selecting the first row of the array **a**

To select the first row, we need to use two indexes because there are two dimensions. Elements in dimension 0 (this time, rows) can be selected by specifying appropriate notations *before* the comma. To select the first row in axis 0, index 0 is used. Elements in dimension 1 (this time, columns) can be selected by specifying appropriate notations *after* the comma. To select all the elements (columns) in axis 1, the **:** notation is used.
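A sketch of this selection on a hypothetical 3 × 4 array (the array is an assumption):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)  # hypothetical 2D example array

first_row = a[0, :]  # index 0 on axis 0, all columns on axis 1

print(first_row)  # [0 1 2 3]
```

For a 2D array, `a[0]` gives the same row, because a missing trailing index is treated as `:`.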

**Example 2:** Selecting the last column of the array **a**
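One way to sketch this, assuming the same hypothetical 3 × 4 array, is to take all rows with `:` and the last column with the negative index `-1`:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)  # hypothetical 2D example array

last_column = a[:, -1]  # all rows on axis 0, last column on axis 1

print(last_column)  # [ 3  7 11]
```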

**Example 3:**

**Example 4:**

# Indexing and Slicing: 3D-Arrays

Before discussing indexing and slicing of 3D-arrays, let me show how elements are arranged in a 3D-array.
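As a rough illustration, a 3D array can be thought of as a stack of 2D matrices. The sketch below assumes an array of shape (2, 3, 4), that is, 2 matrices with 3 rows and 4 columns each. The values are an assumption.

```python
import numpy as np

# Hypothetical 3D example array: 2 matrices, each with 3 rows and 4 columns
a = np.arange(1, 25).reshape(2, 3, 4)

print(a.ndim)   # 3
print(a.shape)  # (2, 3, 4)
print(a)
# [[[ 1  2  3  4]
#   [ 5  6  7  8]
#   [ 9 10 11 12]]
#
#  [[13 14 15 16]
#   [17 18 19 20]
#   [21 22 23 24]]]
```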

- Element selections like those previously introduced can be applied on each axis (dimension).
- We need to use three indexes, separated by commas, to select elements in a 3D-array. The format is **a[…, …, …]**.

Now, consider the following examples of indexing and slicing of 3d arrays.

**Example 1:** Selecting the 2nd matrix of the array **a**
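A sketch of this selection, assuming a hypothetical array of shape (2, 3, 4):

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)  # hypothetical 3D example array

second_matrix = a[1, :, :]  # index 1 on axis 0, everything on axes 1 and 2

print(second_matrix)
# [[13 14 15 16]
#  [17 18 19 20]
#  [21 22 23 24]]
```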

**Example 2:** Selecting the 1st and last columns of each matrix.
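Because the two columns are not consecutive, one possible approach is a list of integers on the last axis. The array is a hypothetical example.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)  # hypothetical 3D example array

# All matrices, all rows, and columns 0 and -1 (first and last)
edge_columns = a[:, :, [0, -1]]

print(edge_columns.shape)  # (2, 3, 2)
print(edge_columns[0])
# [[ 1  4]
#  [ 5  8]
#  [ 9 12]]
```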

**Example 3:** Selecting the element which has the value **16**
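Where the value 16 sits depends on the array. In the hypothetical (2, 3, 4) array assumed here, it is in the 2nd matrix, 1st row, last column:

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)  # hypothetical 3D example array

value = a[1, 0, 3]  # matrix index 1, row index 0, column index 3

print(value)  # 16
```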

**Example 4:** Selecting the elements which have the values **15** and **16**
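In the hypothetical (2, 3, 4) array assumed here, the values 15 and 16 are neighbours in the same row, so a slice on the last axis is enough:

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)  # hypothetical 3D example array

# Matrix index 1, row index 0, columns 2 and 3
values = a[1, 0, 2:4]

print(values)  # [15 16]
```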

# Indexing and Slicing: Boolean-Valued Indexing

An alternative way to select the elements in an array is to use conditions and Boolean operators. We do the indexing with a Boolean-valued array. In this case, each element (with the value **True** or **False**) indicates whether or not to select the element with the corresponding index from the array. That is, if element **n** in the indexing array of Boolean values is **True**, then element **n** is selected from the indexed array. If the value is **False**, then element **n** is not selected.

Let me show you some examples. The array **a** refers to the following 2D-array.

Now, imagine that you want to select the elements which are greater than 10. The first step is to create the corresponding Boolean-valued array.
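A minimal sketch of this first step, assuming a hypothetical 3 × 4 array holding the integers 1 to 12:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)  # hypothetical 2D example array

# Comparing an array with a number is applied element by element
mask = a > 10

print(mask.dtype)     # bool
print(mask.shape)     # (3, 4), the same shape as a
print(mask.tolist())  # True only where the element is greater than 10
```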

Now, we can use this Boolean-valued array to select the elements which are greater than 10.
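The selection step can be sketched as follows, on the same hypothetical array. Note that the result is a 1D array of the matching elements.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)  # hypothetical 2D example array
mask = a > 10

selected = a[mask]  # keeps only the elements where mask is True

print(selected)       # [11 12]
print(selected.ndim)  # 1
```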

In this example, I have used only one condition. How about combining two conditions? How can you select elements which are greater than 5 and less than 10? To answer this, we can use the NumPy **logical_and()** function which computes the truth value of x1 **AND** x2 element-wise where x1, x2 are boolean-valued arrays.
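A sketch of combining the two conditions with `np.logical_and()`, assuming the same hypothetical array of the integers 1 to 12:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)  # hypothetical 2D example array

# True only where both conditions hold for the same element
both = np.logical_and(a > 5, a < 10)

print(both.tolist())
# [[False, False, False, False], [False, True, True, True], [True, False, False, False]]
```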

Now, we can use this Boolean-valued array to select the elements which are greater than 5 and less than 10.
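The combined selection might look like this on the hypothetical array. The `&` operator on two Boolean arrays gives the same mask, as long as each condition is wrapped in parentheses.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)  # hypothetical 2D example array

both = np.logical_and(a > 5, a < 10)
between = a[both]

# Equivalent spelling with the & operator (the parentheses are required)
between_alt = a[(a > 5) & (a < 10)]

print(between)  # [6 7 8 9]
```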

# NumPy arrays are mutable

NumPy arrays are **mutable**, which means that you can change the value of an element in the array after it has been initialized. The following example shows the value of the first element of an array being changed to 100.
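A minimal sketch of this, using a small hypothetical array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])  # hypothetical example array

a[0] = 100  # assign a new value to the first element in place

print(a.tolist())  # [100, 2, 3, 4, 5]
```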

# NumPy Slice is a reference or view

The result of a NumPy slice is a **reference** (or *view*) and not a copy of the original array. When you modify a slice, you actually modify the underlying array.

If you want to create a slice as a copy of the original array, you can pass the slice into the NumPy **copy()** function or **array()** function.
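Both behaviours can be sketched on a hypothetical array (the values are an assumption). The slice shares memory with the original, while the results of `np.copy()` and `np.array()` do not.

```python
import numpy as np

a = np.arange(10)  # hypothetical example array: 0, 1, ..., 9

view = a[2:5]  # a slice is a view onto the same data
view[0] = 99   # this also changes a[2]
print(a[2])    # 99

copy_1 = np.copy(a[2:5])   # independent copy of the slice
copy_2 = np.array(a[2:5])  # np.array() also copies by default
copy_1[1] = -1
copy_2[2] = -2
print(a[3], a[4])  # 3 4 (the original is unchanged)
```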

# Boolean-valued indexing returns copies

Unlike arrays created by using slices, the arrays returned by Boolean-valued indexing are not references but new **independent arrays** (*copies*).
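A small sketch of this behaviour, assuming a hypothetical array of the integers 1 to 12: changing the selected elements leaves the original array untouched.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)  # hypothetical 2D example array

selected = a[a > 10]  # Boolean-valued indexing returns a new array
selected[:] = 0       # overwrite every element of the result

print(selected)  # [0 0]
print(a[2])      # [ 9 10 11 12] (the original is unchanged)
```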

# The IndexError

Notice that the following ndarray is actually two-dimensional (rank 2), not one-dimensional (rank 1). Notice the double square brackets **[[ ]]**.
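As an illustration, here is a hypothetical array written with double square brackets (the values are an assumption). It has one row and three columns.

```python
import numpy as np

a = np.array([[1, 2, 3]])  # hypothetical array with double square brackets

print(a.ndim)   # 2
print(a.shape)  # (1, 3)
```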

If we use only one index, such as **a[1]**, an IndexError occurs, because axis 0 contains only one row (index 0).
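The error can be reproduced and caught as in the sketch below, which assumes a hypothetical array with a single row:

```python
import numpy as np

a = np.array([[1, 2, 3]])  # hypothetical array of shape (1, 3)

try:
    a[1]  # axis 0 has size 1, so the only valid index is 0 (or -1)
    message = None
except IndexError as err:
    message = str(err)

print(message)  # index 1 is out of bounds for axis 0 with size 1
```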

However, **a[0]** returns a 1D-array (the entire row of the above array).

To access a single element, we need to use two indexes because there are two dimensions. The returned element is a NumPy scalar (for example, `numpy.int64`, depending on the array's dtype), not an array and not a plain Python int or float.
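Both accesses can be sketched as follows, assuming a hypothetical array with a single row. The exact scalar type printed depends on the platform and on the array's dtype.

```python
import numpy as np

a = np.array([[1, 2, 3]])  # hypothetical array of shape (1, 3)

row = a[0]         # one index returns the whole row as a 1D array
element = a[0, 1]  # two indexes return a single element

print(row)       # [1 2 3]
print(row.ndim)  # 1
print(element)   # 2
print(isinstance(element, np.generic))  # True: a NumPy scalar
print(isinstance(element, np.ndarray))  # False: not an array
```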

This tutorial was designed and created by *Rukshan Pramoditha*, the author of the Data Science 365 Blog.

# Technologies used in this tutorial

- Python
- NumPy
- Jupyter Notebook

2020–05–09