Photo by JJ Ying on Unsplash

Getting Started with NumPy Arrays

Nishitha Kalathil

--

Welcome to the first part of “The Numpy Nerd: A Guide to Mastering Arrays and Awkward Social Interactions” tutorial series! In this part, I’ll introduce you to NumPy and cover the basics of using the library for scientific computing in Python.

NumPy, short for “Numerical Python,” is a popular library for scientific computing in Python. It provides powerful data structures, such as arrays and matrices, and a wide range of operations for manipulating them. NumPy is particularly useful for numerical computations, data analysis, and machine learning applications. Topics covered in this story are the following:

Installing and importing NumPy

What are arrays

Difference between a Python list and a NumPy array

Creating NumPy arrays

Array Dimensions

Attributes of an array

Indexing and Slicing arrays

Adding and removing elements

So let’s begin.

Installing and importing NumPy

To get started with NumPy, you’ll first need to install the library. NumPy can be installed using pip, the Python package manager, or with Anaconda. Open your terminal or command prompt and type the following command:

pip install numpy

If you are using Jupyter Notebook or Google Colab, add a “!” at the beginning:

!pip install numpy

If you are using Anaconda, use the following command:

conda install numpy

Once NumPy is installed, you can import it into your Python script or interactive session using the import statement:

import numpy as np

In the above statement, we’ve imported NumPy and given it the alias “np”. This is a common convention used in the Python community when working with NumPy.
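A quick way to confirm that the installation and import worked is to print the library version. This is a minimal sketch; the variable name installed_version is only illustrative, and the version string you see depends on your environment.

```python
import numpy as np

# np.__version__ is a plain string; its value depends on your installation
installed_version = np.__version__
print("NumPy version:", installed_version)
```

If this runs without an ImportError, NumPy is ready to use.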

Now that we have NumPy installed and imported, we can start creating and manipulating arrays.

What are arrays?

An array is a collection of elements, all of the same type, arranged in a specific order. Arrays are like a set of boxes that you can put things in. And NumPy arrays are like a set of boxes that you can put things in, but you can also do really cool math stuff with them, like adding or multiplying all the things in the boxes together. It’s like having a math-powered toy chest!

Need a more interesting explanation?

Arrays are like a bunch of friends who you can count on to help you with something. And NumPy arrays are like a bunch of super-smart friends who not only help you, but also do all the math for you, so you can focus on the fun stuff, like building a giant tower of friends (or data)!

Difference between a Python list and a NumPy array

  1. Data type: In Python, a list can contain elements of different data types, such as integers, strings, and floats. In contrast, a NumPy array can only contain elements of the same data type, which is typically a numerical data type such as int or float.
  2. Memory usage: NumPy arrays are more memory-efficient than Python lists because they store their elements in one contiguous block of memory with a single fixed-size data type, and NumPy's core routines are implemented in C. This layout also lets the CPU process the data quickly, which makes arrays well suited to scientific computing and data analysis.
  3. Mathematical operations: NumPy arrays are designed for numerical operations and come with a built-in library of mathematical functions that can operate on entire arrays at once, which is known as vectorization. In contrast, Python lists require looping over each element to perform mathematical operations, which can be slow for large datasets.
  4. Size and dimensions: Python lists can grow or shrink at any time and can contain nested lists of varying sizes. A NumPy array, on the other hand, has a fixed size set when it is created and a regular shape, so every row has the same length. Arrays are typically used for large datasets with a fixed number of dimensions.

Overall, NumPy arrays are more efficient and easier to work with for numerical operations, while Python lists are more flexible and can contain elements of different data types and sizes.
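The differences above can be seen in a few lines. This is a small sketch with made-up variable names (py_list, np_arr, mixed); it shows that the * operator repeats a list but multiplies an array element by element, and that mixing integers and floats in an array converts everything to one data type.

```python
import numpy as np

py_list = [1, 2, 3, 4]
np_arr = np.array(py_list)

# On a list, * repeats the list
print(py_list * 2)    # [1, 2, 3, 4, 1, 2, 3, 4]

# Doubling each list element needs an explicit loop or comprehension
doubled_list = [x * 2 for x in py_list]
print(doubled_list)   # [2, 4, 6, 8]

# On an array, * is applied to every element at once (vectorization)
doubled_arr = np_arr * 2
print(doubled_arr)    # [2 4 6 8]

# Mixing ints and floats gives an array with a single common data type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64
```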

Creating NumPy arrays

Welcome to the magical world of NumPy arrays! It’s like playing with a set of Legos, but instead of building a spaceship, you’re constructing mathematical models that would make Einstein proud. So let’s put on our thinking caps and get ready to create arrays that are so powerful, they could solve a Rubik’s Cube in under 30 seconds!

Creating NumPy arrays is easy and can be done using a variety of methods, including manually specifying the values of the array, using NumPy functions to create arrays with specific properties, and reading in data from external sources. Once you have created a NumPy array, you can use NumPy’s extensive suite of array operations to manipulate, analyze, and visualize your data in a variety of ways.

np.array()

In NumPy, arrays are created using the np.array() function. Here's an example:

import numpy as np

# Create a one-dimensional array
>>> arr1d = np.array([1, 2, 3, 4, 5])
>>> print(arr1d)
[1 2 3 4 5]

# Create a two-dimensional array
>>> arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> print(arr2d)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

In the above code, we’ve created a one-dimensional array arr1d containing the numbers 1 to 5, and a two-dimensional array arr2d containing the numbers 1 to 9 arranged in a 3x3 grid. We can access individual elements of an array using indexing and slicing, just like we would with a list in Python.

np.zeros()

Creating arrays with zeros

# Creating a 1D array of zeros
>>> a = np.zeros(3)
>>> print(a)
[0. 0. 0.]

# Creating a 2D array of zeros
>>> b = np.zeros((3, 3))
>>> print(b)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

np.ones()

Creating arrays with ones

# Creating a 1D array of ones
>>> a = np.ones(2)
>>> print(a)
[1. 1.]

# Creating a 2D array of ones
>>> b = np.ones((2, 4))
>>> print(b)
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
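Both np.zeros() and np.ones() accept a dtype argument, and np.full() follows the same pattern for any other constant. The sketch below uses illustrative names (int_zeros, sevens) that are not part of the original notes.

```python
import numpy as np

# Ask for integers instead of the default floats
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros)
# [[0 0 0]
#  [0 0 0]]

# np.full(shape, fill_value) fills an array with any constant
sevens = np.full((2, 2), 7)
print(sevens)
# [[7 7]
#  [7 7]]
```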

np.eye()

np.eye(n, m) creates a 2D array with n rows and m columns. The elements where i=j (row index and column index are equal) are 1 and the rest are 0; when m is omitted, the result is the n x n identity matrix:

>>> np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
>>> np.eye(3, 5)
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])
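np.eye() also takes an optional k argument that shifts the diagonal of ones, and a dtype argument. A brief sketch, with variable names chosen only for illustration:

```python
import numpy as np

# k=1 moves the ones to the diagonal just above the main one
upper = np.eye(3, k=1)
print(upper)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 0.]]

# dtype=int gives an integer identity matrix
int_identity = np.eye(2, dtype=int)
print(int_identity)
# [[1 0]
#  [0 1]]
```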

np.arange()

numpy.arange creates arrays with regularly incrementing values.

>>> np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.arange(2, 10, dtype=float)
array([2., 3., 4., 5., 6., 7., 8., 9.])
>>> np.arange(2, 3, 0.1)
array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
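As the outputs above suggest, np.arange(start, stop, step) includes start but leaves out stop. The sketch below (illustrative names only) shows an integer step and a negative step.

```python
import numpy as np

# Step of 2: the stop value 10 is not included
evens = np.arange(0, 10, 2)
print(evens)      # [0 2 4 6 8]

# A negative step counts downwards; the stop value 0 is again excluded
countdown = np.arange(5, 0, -1)
print(countdown)  # [5 4 3 2 1]
```

With non-integer steps, floating-point rounding can make the number of elements hard to predict, which is one reason to prefer numpy.linspace when you need an exact count.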

np.linspace()

numpy.linspace creates arrays with a specified number of elements, spaced equally between the specified beginning and end values. Both end values are included by default.

>>> np.linspace(1., 4., 6)
array([1. , 1.6, 2.2, 2.8, 3.4, 4. ])
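Two optional arguments of np.linspace() are worth knowing: endpoint=False leaves out the end value, and retstep=True also returns the spacing between samples. A minimal sketch with illustrative variable names:

```python
import numpy as np

# Five samples from 0 to 1, end value included
with_end = np.linspace(0, 1, 5)
print(with_end.tolist())     # [0.0, 0.25, 0.5, 0.75, 1.0]

# Four samples from 0 up to, but not including, 1
without_end = np.linspace(0, 1, 4, endpoint=False)
print(without_end.tolist())  # [0.0, 0.25, 0.5, 0.75]

# retstep=True returns a (samples, step) pair
samples, step = np.linspace(0, 1, 5, retstep=True)
print(step)                  # 0.25
```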

Array Dimensions

Why did the NumPy array feel depressed? Because it had too many dimensions to deal with.

You might occasionally hear an array referred to as an “ndarray,” which is shorthand for “N-dimensional array.” An N-dimensional array is simply an array with any number of dimensions. You might also hear 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. In NumPy, arrays can have any number of dimensions, from 0 (a scalar) to N (a multi-dimensional array).

Vectors : One-dimensional arrays

Matrices : Two-dimensional arrays

Tensors : 3-D or higher dimensional arrays
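To make these names concrete, here is a short sketch that builds one array of each kind and checks its number of dimensions. The variable names (scalar, vector, matrix, tensor) are just labels for this example.

```python
import numpy as np

scalar = np.array(42)                # 0-D: a single value
vector = np.array([1, 2, 3])         # 1-D
matrix = np.array([[1, 2], [3, 4]])  # 2-D
tensor = np.zeros((2, 3, 4))         # 3-D

print(scalar.ndim)  # 0
print(vector.ndim)  # 1
print(matrix.ndim)  # 2
print(tensor.ndim)  # 3
```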

Axis

In NumPy, the term “axis” refers to one dimension of an ndarray. The number of axes of an ndarray is given by its “ndim” attribute. For example, a 1D array has one axis, a 2D array has two axes, and so on.

The axes of an ndarray can be visualized as follows:

  • For a 1D array, the single axis is typically represented by a horizontal line.
  • For a 2D array, the first axis is the vertical axis (rows) and the second axis is the horizontal axis (columns).
  • For a 3D array, the first axis is the depth axis, the second axis is the vertical axis (rows), and the third axis is the horizontal axis (columns).

When performing operations on ndarrays, it’s important to understand which axis or axes are being operated on. Many NumPy functions and methods have an “axis” parameter that allows you to specify the axis or axes to operate on.

For example, consider the following 2D array. If we want to calculate the sum of each row, we would set the axis parameter to 1:

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
row_sums = arr.sum(axis=1)
print(row_sums) # Output: [ 6 15 24]

If we want to calculate the sum of each column, we would set the axis parameter to 0:

col_sums = arr.sum(axis=0)
print(col_sums) # Output: [12 15 18]
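A helpful way to remember this: the axis you pass is the one that disappears from the result. The sketch below checks that on a 3D array of ones; the name cube is only illustrative.

```python
import numpy as np

cube = np.ones((2, 3, 4))  # depth 2, 3 rows, 4 columns

# The summed axis is removed from the shape of the result
print(cube.sum(axis=0).shape)  # (3, 4)
print(cube.sum(axis=1).shape)  # (2, 4)
print(cube.sum(axis=2).shape)  # (2, 3)

# A tuple of axes sums over several axes at once
per_layer = cube.sum(axis=(1, 2))
print(per_layer)               # [12. 12.]

# Without an axis argument, every element is summed
print(cube.sum())              # 24.0
```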

Attributes of an array

In NumPy, an array has several attributes that provide information about its properties. Here are some of the most commonly used attributes:

  • ndarray.ndim will tell you the number of axes, or dimensions, of the array.
arr = np.array([1, 2, 3])
print(arr.ndim) # Output: 1
arr = np.array([[1, 2, 3],[4,5,6]])
print(arr.ndim) # Output: 2
  • ndarray.size will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr.size) # Output: 9
  • ndarray.shape will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr.shape) # Output: (3, 3)
  • ndarray.dtype This attribute gives the data type of the elements in the array.
arr = np.array([1, 2, 3])
print(arr.dtype) # Output: int64 (the default integer type; int32 on some platforms)

Specifying your data type

  • For functions such as np.ones() and np.zeros(), the default data type is floating point (np.float64), but you can explicitly specify which data type you want using the “dtype” keyword.
>>> x = np.ones(5)
>>> print(type(x[1]))
<class 'numpy.float64'>

>>> x = np.ones(5, dtype=np.int32)
>>> print(type(x[1]))
<class 'numpy.int32'>
  • ndarray.itemsize : This attribute gives the size of each element in the array in bytes.
arr = np.array([1, 2, 3])
print(arr.itemsize) # Output: 8 (bytes per int64 element)
  • ndarray.data : This attribute gives the buffer containing the actual elements of the array.
arr = np.array([1, 2, 3])
print(arr.data) # Output: <memory at 0x7fcb7380bc70>
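The attributes above are related to one another, which the following sketch tries to show on a single array. The dtype is set explicitly to np.float32 so the numbers are the same on every platform; nbytes is one more attribute, not covered above, that gives the total size in bytes.

```python
import numpy as np

arr = np.zeros((2, 3), dtype=np.float32)

print(arr.ndim)      # 2
print(arr.shape)     # (2, 3)
print(arr.size)      # 6  (2 * 3, the product of the shape)
print(arr.dtype)     # float32
print(arr.itemsize)  # 4  (a float32 takes 4 bytes)
print(arr.nbytes)    # 24 (size * itemsize)
```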

Indexing and Slicing arrays

One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.

Indexing allows you to select individual elements from an array by specifying their position within the array. In NumPy, indexing works similarly to indexing in Python lists. You can use square brackets to access elements of an array, with the index starting from 0.

arr = np.array([1, 2, 3, 4, 5])

# Accessing the first element of the array
print(arr[0]) # Output: 1

# Accessing the third element of the array
print(arr[2]) # Output: 3

Indexing with 2D arrays

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Accessing the element at row 1, column 2
print(arr[1, 2]) # Output: 6

Indexing with 3D arrays

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Accessing the element at depth 1, row 0, column 1
print(arr[1, 0, 1]) # Output: 6
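Two more indexing patterns come up often: negative indices, which count from the end, and giving fewer indices than the array has dimensions, which returns a whole sub-array. A short sketch with illustrative variable names:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A single index on a 2D array returns a whole row
first_row = arr[0]
print(first_row)    # [1 2 3]

# Negative indices count from the end
last_row = arr[-1]
print(last_row)     # [7 8 9]
print(arr[-1, -1])  # 9

# arr[1, 2] and arr[1][2] select the same element
print(arr[1, 2] == arr[1][2])  # True
```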

Slicing

Slicing allows you to access a subset of an array by specifying a range of indices. In NumPy, you can use the colon (:) operator to specify a range of indices.

arr = np.array([1, 2, 3, 4, 5])

# Slicing 1D array to get the first three elements
print(arr[:3]) # Output: [1 2 3]

# Slicing the array to get the last two elements
print(arr[-2:]) # Output: [4 5]

# Slicing a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr_2d[:2, 1:]) # Output: [[2 3] [5 6]]

# Slicing a 3D array
arr_3d = np.array([[[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10], [11, 12]]])
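print(arr_3d[:, :2, 1:]) # Output: [[[2] [4]] [[8] [10]]]

# Indexing and slicing a small 1D array
data = np.array([1, 2, 3])
print(data[1]) # Output: 2
print(data[0]) # Output: 1
print(data[0:2]) # Output: [1 2]
print(data[1:]) # Output: [2 3]
print(data[-2:]) # Output: [2 3]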
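One difference from Python lists is worth a quick sketch: a basic slice of a NumPy array is a view onto the same data, not a copy, so changing the slice changes the original. The names view and safe_copy below are only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# A slice is a view: writing to it also changes arr
view = arr[1:4]
view[0] = 99
print(arr)          # [ 1 99  3  4  5]

# Use .copy() when you need an independent array
safe_copy = arr[1:4].copy()
safe_copy[0] = -1
print(arr[1])       # 99 (unchanged by the edit to safe_copy)

# A third value in the slice is the step
every_other = arr[::2]
print(every_other)  # [1 3 5]
```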

Adding and removing elements

In NumPy, you can add and remove elements from arrays in several ways. Here are some of the most common methods:

Adding Elements:

  • numpy.append(arr, values, axis=None) - adds one or more elements to the end of the array. The arr parameter specifies the array to which the elements are added, values is the array or scalar to be appended, and axis specifies the axis along which the values should be appended. If axis is not specified, the array is flattened before the append operation.
  • numpy.insert(arr, obj, values, axis=None) - inserts one or more elements into the array at a specified position. The arr parameter specifies the array into which the elements are inserted, obj is the index or indices at which the values should be inserted, values is the array or scalar to be inserted, and axis specifies the axis along which the values should be inserted. If axis is not specified, the array is flattened before the insert operation.
import numpy as np

a = np.array([1, 2, 3])

b = np.append(a, [4, 5, 6])
# Output: array([1, 2, 3, 4, 5, 6])

c = np.insert(a, 1, [4, 5, 6])
# Output: array([1, 4, 5, 6, 2, 3])
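The axis argument matters most with 2D arrays. The sketch below (variable names are illustrative) appends a row, appends a column, and shows the flattening that happens when axis is left out. Note that np.append returns a new array and leaves the original untouched.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

# axis=0 appends a new row; the values must have matching columns
with_row = np.append(m, [[5, 6]], axis=0)
print(with_row)
# [[1 2]
#  [3 4]
#  [5 6]]

# axis=1 appends a new column; the values must have matching rows
with_col = np.append(m, [[5], [6]], axis=1)
print(with_col)
# [[1 2 5]
#  [3 4 6]]

# Without axis, everything is flattened to 1D first
flat = np.append(m, [5, 6])
print(flat)     # [1 2 3 4 5 6]

# The original array is unchanged
print(m.shape)  # (2, 2)
```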

Removing Elements:

  • numpy.delete(arr, obj, axis=None) - removes one or more elements from the array. The arr parameter specifies the array from which the elements are removed, obj is the index or indices of the values to be removed, and axis specifies the axis along which the values should be removed. If axis is not specified, the array is flattened before the delete operation.
a = np.array([1, 2, 3, 4, 5])

b = np.delete(a, [2, 3])
# Output: array([1, 2, 5])
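np.delete works along an axis in the same way. This sketch, again with illustrative names, removes a row and then two columns from a 2D array; like np.append, it returns a new array rather than changing the original.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Remove the row at index 1
no_middle_row = np.delete(m, 1, axis=0)
print(no_middle_row)
# [[1 2 3]
#  [7 8 9]]

# Remove the columns at indices 0 and 2
middle_col_only = np.delete(m, [0, 2], axis=1)
print(middle_col_only)
# [[2]
#  [5]
#  [8]]

# The original array still has all of its elements
print(m.shape)  # (3, 3)
```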

In the next part, [Maximizing the Power of NumPy: Advanced Array Operations for Data Science], we’ll dive deeper into shape manipulation, iterating through an array, stacking, splitting, changing dimensions, and sorting. Stay tuned!

Social Interaction Tip: If you’re struggling to start a conversation with someone, try asking them if they’re a Python or a Java person. Then you can talk about programming languages and maybe even collaborate on a project.

These notes were prepared from the official NumPy website.
