Python Loop Replacement: Handling Conditional Logic (PyTorch & NumPy)
Not your SQL Select clause — Using Where and Select to vectorize conditional logic
In a previous article, “Accelerate Numerical Calculations in NumPy With Intel oneAPI Math Kernel Library”, I addressed why NumPy constructs, powered by Intel oneAPI, can achieve outsized performance along with code readability and maintainability advantages over a typical “roll your own” Python code segment.
In this article I want to demonstrate how you can vectorize a loop even though it contains tricky conditional logic.
Summarized: here are the steps you should take.
1. Find a loop with a large iteration count
2. If it contains conditional logic, consider torch.where, np.where or np.select
3. Otherwise, try to find a NumPy or PyTorch replacement using ufuncs, aggregations, etc.
NumPy where & PyTorch where
One thing that can prevent us from getting vector performance when converting a loop to a vectorized approach is the presence of if/then/else statements in the original loop, known as conditional logic.
NumPy where and PyTorch where let us tackle conditional loops in a fast, vectorized way.
They apply conditional logic to an array to create a new column or update the contents of an existing column.
Syntax:
- numpy.where(condition, [x, y, ]/)
- Return elements chosen from x or y depending on condition.
To understand what NumPy where does, look at the simple example below, which adds 50 to all elements currently greater than 5.
import numpy as np
a = np.arange(10)
np.where(a > 5, a + 50, a )
# if a > 5 then return a + 50
# else return a
output:
array([ 0, 1, 2, 3, 4, 5, 56, 57, 58, 59])
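As a hedged aside, the x and y arguments need not be full arrays: under NumPy's broadcasting rules either one can be a scalar that is stretched to match the condition. The small sketch below uses made-up temperature readings (the names temps, capped and flags are illustrative and not from the original notebook) to show both cases.

```python
import numpy as np

# Hypothetical readings, invented for this illustration
temps = np.array([12.0, 25.5, 31.2, 18.4])

# Scalar x, array y: cap anything above 30 at 30, keep the rest unchanged
capped = np.where(temps > 30, 30.0, temps)

# Scalar x and scalar y: turn the condition into 1/0 flags
flags = np.where(temps > 20, 1, 0)

print(capped)
print(flags)
```

The condition always decides, element by element, whether the value comes from x or from y.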
This could come in handy for many AI applications, but let’s choose labeling data.
There may be better ways to binarize data, but here is a simple example of converting continuous data into categorical values.
arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])
Let’s say all values 10 and above represent a medical parameter threshold that indicates further testing, while values below 10 indicate the normal range.
We might like to print the values as words such as [‘More Testing’, ‘Normal’, ‘More Testing’, ‘More Testing’, …].
arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])
np.where(arr < 10, 'Normal', 'More Testing')
output:
array(['More Testing', 'Normal', 'More Testing', 'More Testing',
'More Testing', 'Normal', 'Normal', 'More Testing'], dtype='<U12')
Or we could binarize data for use in a classifier:
# Simple NumPy binarizer / discretizer
# convert continuous data to discrete integer bins
arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])
print(np.where(arr < 6, 0, np.where(arr < 12, 1, 2)))
output:
[1 0 2 2 2 1 0 2]
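Nesting np.where calls gets harder to read as the number of bins grows. One possible alternative, sketched here as an assumption rather than as the approach used in the notebook, is np.digitize, which assigns each value the index of the bin it falls into. With bin edges 6 and 12 it should reproduce the nested call above.

```python
import numpy as np

arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])

# Nested np.where, as in the binarizer example
nested = np.where(arr < 6, 0, np.where(arr < 12, 1, 2))

# np.digitize returns i such that bins[i-1] <= x < bins[i] (default right=False),
# so values below 6 map to 0, values in [6, 12) to 1, and the rest to 2
binned = np.digitize(arr, bins=[6, 12])

print(nested)
print(binned)
```

Adding a bin then means adding one more edge to the list instead of one more level of nesting.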
NumPy where can be used to create index masks so that you can select or update masked items from an array.
In this example, I want to find all (row, col) entries that are multiples of 12 or multiples of 9 in a 10x10 multiplication table and set all other values to 0, while preserving the first row and first column as readable indexes for the table, as follows:
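## one solution - preserves the indexing edges for easy checking
# assumed construction of the 10x10 table (not shown in the original listing)
MultiplicationTable = np.outer(np.arange(1, 11), np.arange(1, 11))
res = np.where((MultiplicationTable%12 == 0) | (MultiplicationTable%9 == 0), MultiplicationTable, 0)
res[0,:] = MultiplicationTable[0,:]
res[:,0] = MultiplicationTable[:,0]
res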
output:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 2, 0, 0, 0, 0, 12, 0, 0, 18, 0],
[ 3, 0, 9, 12, 0, 18, 0, 24, 27, 0],
[ 4, 0, 12, 0, 0, 24, 0, 0, 36, 0],
[ 5, 0, 0, 0, 0, 0, 0, 0, 45, 0],
[ 6, 12, 18, 24, 0, 36, 0, 48, 54, 60],
[ 7, 0, 0, 0, 0, 0, 0, 0, 63, 0],
[ 8, 0, 24, 0, 0, 48, 0, 0, 72, 0],
[ 9, 18, 27, 36, 45, 54, 63, 72, 81, 90],
[10, 0, 0, 0, 0, 60, 0, 0, 90, 0]])
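When np.where is called with only the condition, it returns the indices where the condition holds (equivalent to calling np.nonzero on the condition), which is one way to get at the (row, col) positions directly. A minimal sketch, assuming the table is the outer product of 1..10 with itself; the variable names are illustrative:

```python
import numpy as np

table = np.outer(np.arange(1, 11), np.arange(1, 11))
mask = (table % 12 == 0) | (table % 9 == 0)

# One-argument form: a tuple of index arrays, one per dimension
rows, cols = np.where(mask)

# Fancy indexing with those indices pulls out the matching values
hits = table[rows, cols]

# The same indices can be used to update entries in place
marked = table.copy()
marked[rows, cols] = -1

print(rows[:5].tolist(), cols[:5].tolist())
print(hits[:5].tolist())
```

For a pure select-or-update, indexing with the Boolean mask itself (table[mask]) does the same job; the index arrays are mainly useful when the positions themselves are of interest.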
Numpy Where applied to California Housing data
In an AI context, this could mean applying a categorical classifier to otherwise continuous values. For example, in the California housing dataset the target price variable is continuous. In the following fictitious scenario, a new stimulus package is being considered whereby new house buyers will be given a coupon worth 50,000 toward the purchase of houses in California whose price (prior to the coupon) is between 250,000 and 350,000. Other prices will be unaffected. Generate an array with the adjusted targets.
Let’s compare the naive Python loop with the NumPy where clause and examine them for readability, maintainability, speed, etc.
# Fictitious scenario:
import time
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
california_housing = fetch_california_housing(as_frame=True)
X = california_housing.data.to_numpy()
# the targets are expressed in units of $100,000
buyerPriceRangeLo = 250_000/100_000
buyerPriceRangeHi = 350_000/100_000
T = california_housing.target.to_numpy()
timing = {}
t1 = time.time()
New = np.empty_like(T)
for i in range(len(T)):
    if (T[i] < buyerPriceRangeHi) and (T[i] >= buyerPriceRangeLo):
        New[i] = T[i] - 50_000/100_000
    else:
        New[i] = T[i]
t2 = time.time()
plt.title("California Housing Dataset - Conditional Logic Applied")
plt.scatter(T, New, color = 'b')
plt.grid()
print("time elapsed: ", t2-t1)
timing['Loop'] = t2-t1
Next, examine the NumPy Where approach:
t1 = time.time()
#############################################################################
### Exercise: modify the code below to compute the same results as the loop above
New = np.where((T < buyerPriceRangeHi) & (T >= buyerPriceRangeLo), T - 50_000/100_000, T )
##############################################################################
t2 = time.time()
plt.scatter(T, New, color = 'r')
plt.grid()
print("time elapsed: ", t2-t1)
timing['np.where'] = t2-t1
print("Speedup: {:4.1f}X".format( timing['Loop']/timing['np.where']))
The code is more readable, more maintainable, and about 13X faster.
As you can see, we generated the same data with NumPy where as we did with the original loop, but we did so about 13X faster (the speedup may vary from run to run).
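To check the "same data" claim without downloading the dataset, here is a minimal sketch on synthetic prices. The random data, the function names adjust_loop and adjust_where, and the use of time.perf_counter are my assumptions for illustration and are not part of the original notebook; the measured speedup will differ by machine and array size.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the housing targets, in units of $100,000 (assumption)
prices = rng.uniform(0.5, 5.0, size=200_000)
lo, hi, coupon = 2.5, 3.5, 0.5

def adjust_loop(p):
    """Element-by-element version of the conditional discount."""
    out = np.empty_like(p)
    for i in range(len(p)):
        if lo <= p[i] < hi:
            out[i] = p[i] - coupon
        else:
            out[i] = p[i]
    return out

def adjust_where(p):
    """Vectorized version using np.where."""
    return np.where((p >= lo) & (p < hi), p - coupon, p)

t0 = time.perf_counter()
loop_result = adjust_loop(prices)
t1 = time.perf_counter()
where_result = adjust_where(prices)
t2 = time.perf_counter()

print("results identical:", np.array_equal(loop_result, where_result))
print("loop: {:.4f}s  np.where: {:.4f}s".format(t1 - t0, t2 - t1))
```

Comparing the two arrays with np.array_equal is a cheap safeguard whenever a loop is swapped for a vectorized expression.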
NumPy Select statement:
The select statement is available in NumPy, but not yet in PyTorch, although I will provide a code snippet which can emulate the select clause using PyTorch. The select statement handles much more complex logic conditions than the where statement.
Apply conditional logic to an array to create a new column or update the contents of an existing column. This method handles more complex conditional scenarios than NumPy where.
Syntax:
- numpy.select(condlist, choicelist, default=0)
- Return an array drawn from elements in choicelist, depending on conditions.
This is a very useful function for handling conditionals that would otherwise slow down a map or apply, or else add complexity when reading the code.
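Before the larger example, a tiny sketch of how the three arguments fit together may help. The array and thresholds are made up for illustration. The point to notice is that conditions are evaluated in order and the first one that is True for an element decides which choice is used.

```python
import numpy as np

x = np.arange(10)

# Conditions are checked in order; x = 0, 1, 2 satisfy both, but the first wins
conditions = [x < 3, x < 6]
choices = [x, x * 10]

# Elements matching no condition get the default
result = np.select(conditions, choices, default=-1)

print(result)  # values: 0, 1, 2, 30, 40, 50, -1, -1, -1, -1
```

Because of the first-match rule, the order of the condition list plays the same role as the order of the if/elif branches in a loop.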
First we will create some new data
import numpy as np
import time
BIG = 10_000_000
np.random.seed(2022)
A = np.random.randint(0, 11, size=(BIG, 6))
Now I will apply some crazy logic to updating various columns of the array
timing = {}
t1 = time.time()
for i in range(BIG):
    if A[i,4] == 10:
        A[i,5] = A[i,2] * A[i,3]
    elif (A[i,4] < 10) and (A[i,4] >= 5):
        A[i,5] = A[i,2] + A[i,3]
    elif A[i,4] < 5:
        A[i,5] = A[i,0] + A[i,1]
t2 = time.time()
baseTime = t2 - t1
print(A[:5,:])
print("time: ", baseTime)
timing['Naive Loop'] = t2 - t1
output:
[[ 0 1 1 0 7 1]
[ 2 8 0 5 9 5]
[ 3 8 0 3 6 3]
[ 0 10 10 1 2 10]
[ 5 7 5 1 7 6]]
time: 5.937685012817383
Try Vectorizing with masks
Just remove the references to i, remove the loop, and create a mask for each condition.
# Try Vectorizing simply
t1 = time.time()
mask1 = A[:,4] == 10
A[mask1,5] = A[mask1,2] * A[mask1,3]
mask2 = (A[:,4] < 10) & (A[:,4] >= 5)
A[mask2,5] = A[mask2,2] + A[mask2,3]
mask3 = A[:,4] < 5
A[mask3,5] = A[mask3,0] + A[mask3,1]
t2 = time.time()
print(A[:5,:])
print("time :", t2-t1)
fastest_time = t2-t1
Speedup = baseTime / fastest_time
print("Speed up: {:4.0f} X".format(Speedup))
timing['Vector Masks'] = t2 - t1
output:
[[ 0 1 1 0 7 1]
[ 2 8 0 5 9 5]
[ 3 8 0 3 6 3]
[ 0 10 10 1 2 10]
[ 5 7 5 1 7 6]]
time : 0.23482632637023926
Speed up: 25 X
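One thing to watch when building masks this way: each mask has to stay a full Boolean array with one entry per row. Python's `and` does not work elementwise on arrays, and reducing a column with .any() collapses it to a single True/False, so conditions are combined with & and | instead. A small sketch on a made-up three-row array (the values and the name A_small are invented for illustration):

```python
import numpy as np

# Tiny made-up array with the same six-column layout as A
A_small = np.array([[1, 2, 3, 4, 10, 0],
                    [1, 2, 3, 4,  7, 0],
                    [1, 2, 3, 4,  2, 0]])
col = A_small[:, 4]

# Elementwise AND keeps one True/False per row
mask_mid = (col < 10) & (col >= 5)

# Python's `and` asks for the truth value of a whole array and raises
try:
    (col < 10) and (col >= 5)
    and_raised = False
except ValueError:
    and_raised = True

# .any() reduces the column to one Boolean before the comparison happens
collapsed = col.any() < 5

# Only the row selected by the per-row mask is updated
A_small[mask_mid, 5] = A_small[mask_mid, 2] + A_small[mask_mid, 3]

print(mask_mid, and_raised, np.ndim(collapsed))
print(A_small)
```

A quick sanity check is that a row mask should have the same length as the number of rows in the array it indexes.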
Next, try NumPy.select — Much cleaner logic
Put the conditions in a list, put the corresponding choices in a second list, and choose a default action.
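t1 = time.time()
condition = [(A[:,4] < 10) & (A[:,4] >= 5),
             (A[:,4] < 5)]
choice = [(A[:,2] + A[:,3]),
          (A[:,0] + A[:,1])]
default = A[:,2] * A[:,3]  # remaining case: A[:,4] == 10
A[:,5] = np.select(condition, choice, default=default)
t2 = time.time()
print(A[:5,:])
print("time :", t2-t1)
print("Speed up: {:4.0f} X".format(baseTime / (t2-t1)))
timing['np.select'] = t2 - t1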
output:
[[ 0 1 1 0 7 1]
[ 2 8 0 5 9 5]
[ 3 8 0 3 6 3]
[ 0 10 10 1 2 10]
[ 5 7 5 1 7 6]]
time : 0.4723508358001709
Speed up: 13 X
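The text mentions emulating select in PyTorch. One plausible way to do that, sketched here in NumPy so that it can be checked against np.select, is to fold the conditions into a chain of where calls: start from the default and apply the conditions in reverse, so that the first condition has the final say. The helper name select_via_where is hypothetical, and the idea that swapping np.where for torch.where on tensors behaves the same way is an assumption that is not verified here.

```python
import numpy as np

def select_via_where(condlist, choicelist, default=0):
    """Hypothetical helper: emulate np.select with a chain of np.where calls."""
    result = default
    # Walk the conditions in reverse so that earlier conditions are applied
    # last and therefore take priority, matching np.select's first-match rule
    for cond, choice in zip(reversed(condlist), reversed(choicelist)):
        result = np.where(cond, choice, result)
    return result

rng = np.random.default_rng(2022)
A_demo = rng.integers(0, 11, size=(1000, 6))  # small stand-in for A

conds = [(A_demo[:, 4] < 10) & (A_demo[:, 4] >= 5), A_demo[:, 4] < 5]
choices = [A_demo[:, 2] + A_demo[:, 3], A_demo[:, 0] + A_demo[:, 1]]
default = A_demo[:, 2] * A_demo[:, 3]

expected = np.select(conds, choices, default=default)
emulated = select_via_where(conds, choices, default)

print("matches np.select:", np.array_equal(expected, emulated))
```

Each where call makes a full pass over the data, so this fold is best suited to a handful of conditions.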
plt.figure(figsize=(10,6))
plt.title("Time taken to process {:,} records in seconds".format(BIG),fontsize=12)
plt.ylabel("Time in seconds",fontsize=12)
plt.xlabel("Various types of operations",fontsize=14)
plt.grid(True)
plt.xticks(rotation=-60)
plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))
Note: NumPy select gives a 13X speedup over the naive Python loop in this simple example.
In the next article, I will use NumPy select to drastically speed up a Pandas Apply statement that is plagued with lots of conditional logic.
Play with these concepts on the Intel Developer Cloud:
Take the opportunity to play with replacing loop bound aggregations in your own code with NumPy aggregation functions instead.
For a sandbox to play in, register for a free account on the Intel Developer Cloud (cloud.intel.com), sign in, and click the icon in the lower left.
Then launch JupyterLab on the shared access node using the icon on the right.
Code
The code for this article and the rest of the series is located on GitHub. For this article, experiment with the file: 08_05_NumPy_Where_Select.ipynb
Related Articles:
Article 1:
Accelerate Numerical Calculations in NumPy With Intel oneAPI Math Kernel Library. Explore the reasons why replacing inefficient Python loops with NumPy or PyTorch constructs is a great idea.
Article 2:
Python Loop Replacement: NumPy Optimizations Simple Stuff — ND array creation using NumPy, PyTorch, DPCTL. Explore simple ways of creating , converting and transforming Lists into NumPy NDarrays — a very basic getting started.
Article 3:
Article 4:
Replacing Python loops: Aggregations and Reductions. How to replace slow python loops by strategic function equivalents for aggregating data.
Article 5:
Replacing Python loops: Fancy Slicing and Broadcasting. Here I address fancy slicing and broadcasting to take advantage of key optimizations for loop replacement.
Article 6: (current article)
Article 7:
Loop Replacement Strategies: Applications to Pandas Apply. Examine how to accelerate Pandas Apply statement containing conditional logic.
Article 8:
NumPy Functions Composed. Compare Fast Inverse Square Root Method to NumPy ufuncs, Numba JIT, and Cython — Which One Wins?
Intel Developer Cloud System Configuration as tested:
x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 224
On-line CPU(s) list: 0–223
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8480+
CPU family: 6
Model: 143
Thread(s) per core: 2
Core(s) per socket: 56
Socket(s): 2
Stepping: 8
CPU max MHz: 3800.0000
CPU min MHz: 800.0000