The idea is simply to extend the dimensionality. There is a way to subtract a shape (n, 3) array X from itself so that each row is subtracted from the whole array without explicitly using a loop.
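A minimal sketch of that idea, assuming a small illustrative array `X` (the values and the name `diff` are made up for the example). Adding a new axis lets broadcasting pair every row with the whole array:

```python
import numpy as np

# Hypothetical (n, 3) array with n = 4; the values are illustrative only
X = np.arange(12, dtype=float).reshape(4, 3)

# X[None, :, :] has shape (1, n, 3) and X[:, None, :] has shape (n, 1, 3).
# Broadcasting expands both to (n, n, 3), so no explicit loop is needed.
diff = X[None, :, :] - X[:, None, :]

print(diff.shape)  # (4, 4, 3)
# diff[i] is the whole array with row i subtracted: X - X[i]
```

Each `diff[i]` can be checked against `X - X[i]` computed one row at a time.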

A young Mcnamara Chiwaye is photographed diminishing a suit, Harare, Zimbabwe winter 2017.

The last axis would be the rows.

(5, 2, 3)

Let's say you have a large dataset with numerical values and you want to remove all 0 values. Look at another example, array([[0., 0., 0.]]). The code below actually achieves that.
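One reading of "remove all 0 values" is dropping rows that are entirely zero, such as array([[0., 0., 0.]]). A minimal sketch under that assumption, with made-up data and a boolean mask:

```python
import numpy as np

# Hypothetical data containing one all-zero row
a = np.array([[1., 2., 3.],
              [0., 0., 0.],
              [4., 0., 6.]])

# Keep only rows where at least one entry is nonzero
nonzero_rows = a[np.any(a != 0, axis=1)]

print(nonzero_rows.shape)  # (2, 3)
```

A row with a single zero, like the last one here, is kept because it still carries nonzero values.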

One of the cool things about NumPy is reshaping.

Reshaping a NumPy array in a Jupyter notebook instance running IPython.
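As a small sketch of reshaping, assuming a six-element array chosen only for illustration:

```python
import numpy as np

a = np.arange(6)      # one-dimensional: [0 1 2 3 4 5]
b = a.reshape(3, 2)   # 3 rows, 2 columns
c = a.reshape(2, -1)  # -1 lets NumPy infer the remaining dimension (3 here)

print(b.shape, c.shape)  # (3, 2) (2, 3)
```

The total number of elements has to stay the same, so `a.reshape(4, 2)` would raise a ValueError.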

You can find the minimum value within each column by specifying axis=0. With a three-column array, you will get three values as your result.
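For instance, with a made-up array of four rows and three columns, a sketch of the column-wise minimum looks like this:

```python
import numpy as np

# Hypothetical 4 x 3 array
a = np.array([[3, 7, 1],
              [5, 2, 9],
              [4, 8, 0],
              [6, 1, 2]])

col_min = a.min(axis=0)  # one value per column
row_min = a.min(axis=1)  # one value per row

print(col_min)  # [3 1 0]
print(row_min)  # [1 2 0 1]
```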

Comparison operator syntax in NumPy is similar to that of R, VBA, DAX, etc.
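A short sketch of element-wise comparisons, using an illustrative array. Each comparison returns a boolean array that can be used as a mask:

```python
import numpy as np

a = np.array([1, 5, 10, 15])

mask = a > 5                # element-wise: [False False  True  True]
both = (a > 1) & (a < 15)   # combine with & and |; the parentheses are required

print(a[mask])  # [10 15]
print(a[both])  # [ 5 10]
```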

Conditional Statements and Logic

We remember from reshaping that indexing can be used to transpose arrays.

Business Case

Imagine you work for a financial institution and you have been asked to find unique customers using Python. I am excited that you like your new role.

Your first attempt looks like the code below.
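A minimal sketch of what such a first attempt might look like. The customer IDs are simulated and hypothetical, and the seed value is arbitrary; `np.unique` does the actual work:

```python
import numpy as np

# Fixing the seed makes the simulated data the same on every run
rng = np.random.default_rng(42)

# Hypothetical customer IDs: 20 transactions drawn from IDs 1000 to 1009
customer_ids = rng.integers(1000, 1010, size=20)

# Sorted unique IDs and how often each one appears
unique_ids, counts = np.unique(customer_ids, return_counts=True)

print(unique_ids)
print(counts)
```

Real data would come from the institution's own records rather than a random generator.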

Practically, you have done two things. The first is that your work is reproducible. This is good. It means your work can be rerun on multiple systems with the same result.

NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is convenient to use. NumPy uses much less memory to store data, and it provides a mechanism for specifying the data types. This allows the code to be optimized even further.
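To illustrate how the data type affects memory, here is a small sketch comparing two arrays of the same length (the length 1000 is arbitrary):

```python
import numpy as np

n = 1000
as_int64 = np.zeros(n, dtype=np.int64)  # 8 bytes per element
as_int8 = np.zeros(n, dtype=np.int8)    # 1 byte per element

print(as_int64.nbytes)  # 8000
print(as_int8.nbytes)   # 1000
```

A smaller dtype only helps when the values actually fit in it; int8 holds integers from -128 to 127.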

NumPy in the Python Ecosystem

NumPy stacks and reshaping
Iterating a dictionary

Love Coding, Keep coding.

Use isin to build the conditions for numpy.select
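A sketch of how the two can be combined, assuming made-up account codes and segment labels (all names here are hypothetical). `np.isin` builds the boolean conditions and `np.select` picks a label for the first condition that matches:

```python
import numpy as np

# Hypothetical account codes
codes = np.array(['A', 'B', 'C', 'D', 'A'])

# One boolean condition per label, built with np.isin
conditions = [np.isin(codes, ['A', 'B']),
              np.isin(codes, ['C'])]
choices = ['retail', 'corporate']

# Elements that match no condition get the default label
segment = np.select(conditions, choices, default='other')

print(segment)  # ['retail' 'retail' 'corporate' 'other' 'retail']
```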

Live Coding

import pandas as pd
import numpy as np

# Build 20 date strings for January 2016 and matching integer values
dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
print(df['Value'].values)
# [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

# Time series indexed by the parsed dates
ts = pd.Series(df['Value'].values, index=df['Date'])
# A similar series built directly on a generated daily index
# (note that this index starts on 2016-01-02)
pd.Series(list(range(20)), pd.date_range('2016-01-02', periods=20, freq='D'))
Happy Coding!
