Swifter — automatically efficient pandas apply operations

Jason Carpenter
Apr 17, 2018 · 5 min read

What do you do?


$ pip install -U pandas
$ pip install swifter

import pandas as pd
import swifter

# Importing swifter registers the .swifter accessor on pandas objects
myDF['outCol'] = myDF['inCol'].swifter.apply(anyfunction)


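def bikes_proportion(x, max_x):
    return x * 1.0 / max_x

# max_x is assumed here to be the column maximum
data['bike_prop'] = data['bikes_available'].swifter.apply(
    bikes_proportion, max_x=data['bikes_available'].max())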

def convert_to_human(datetime):
    # Timestamp.weekday_name was removed in pandas 1.0; day_name() replaces it
    return (datetime.day_name() + ', the ' + str(datetime.day) + 'th day of '
            + datetime.strftime("%B") + ', ' + str(datetime.year))

data['humanreadable_date'] = data['date'].swifter.apply(convert_to_human)

# Parallel processing b/c if-else statement makes it non-vectorized
def gt_5_bikes(x):
    if x > 5:
        return True
    return False

# computes in 13.8s
data['gt_5_bikes'] = data['bikes_available'].swifter.apply(gt_5_bikes)

# Vectorized version
import numpy as np

def gt_5_bikes_vectorized(x):
    return np.where(x > 5, True, False)

# computes in 231ms
data['gt_5_bikes_vec'] = data['bikes_available'].swifter.apply(gt_5_bikes_vectorized)


Swifter vectorizes when possible for ≥100x speed increase
df['date'].apply(pd.to_datetime)          # very slow
pd.to_datetime(df['date'])                # vectorized - very fast
df['date'].swifter.apply(pd.to_datetime)  # also vectorized - very fast
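
To make "vectorizes when possible" concrete, here is a minimal sketch of one way such a check could work: run the function on a small sample as a whole array, run it element by element on the same sample, and only trust the vectorized path if both agree. The helper name `apply_vectorized_if_possible`, the `sample_size` default, and the toy `bikes` data are all made up for illustration; this is not swifter's actual code, which may differ in its details.

```python
import numpy as np
import pandas as pd


def apply_vectorized_if_possible(series, func, sample_size=1000):
    """Return (result, path), where path is "vectorized" or "apply".

    Hypothetical helper for illustration only, not part of swifter.
    """
    sample = series.iloc[:sample_size]
    try:
        # Call the function once on the whole sample...
        whole = np.asarray(func(sample))
        # ...and once per element, then compare the two results.
        elementwise = sample.apply(func).to_numpy()
        can_vectorize = np.array_equal(whole, elementwise)
    except Exception:
        # e.g. an if-statement on a whole Series raises ValueError
        can_vectorize = False

    if can_vectorize:
        result = pd.Series(np.asarray(func(series)), index=series.index)
        return result, "vectorized"
    return series.apply(func), "apply"


def scale(x):
    # Plain arithmetic works on a scalar and on a whole Series alike
    return x * 2.5


def gt_5(x):
    # The if-statement only works on one value at a time
    if x > 5:
        return True
    return False


bikes = pd.Series([0, 3, 6, 9, 12])  # toy data
scaled, scaled_path = apply_vectorized_if_possible(bikes, scale)
flags, flags_path = apply_vectorized_if_possible(bikes, gt_5)
```

With this sketch, `scale` takes the vectorized path while `gt_5` falls back to element-wise apply, which mirrors the difference between the two `gt_5_bikes` versions earlier in the post.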
X is 1, 10, 100, 1000, …
Swifter converges to pandas apply on small datasets and dask parallel processing on large ones
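
The choice between plain apply and parallel processing can be pictured as a simple time estimate: time the function on a sample, extrapolate to the full column, and only pay the parallelization overhead when the estimate is large. The sketch below assumes a linear extrapolation and a one-second cutoff; `estimate_apply_seconds`, `choose_strategy`, and `threshold_seconds` are hypothetical names, and swifter's real heuristic may work differently.

```python
import time

import pandas as pd


def estimate_apply_seconds(series, func, sample_size=1000):
    """Time apply on a sample and extrapolate linearly to the full Series."""
    sample = series.iloc[:sample_size]
    if len(sample) == 0:
        return 0.0
    start = time.perf_counter()
    sample.apply(func)
    elapsed = time.perf_counter() - start
    return elapsed * len(series) / len(sample)


def choose_strategy(series, func, threshold_seconds=1.0):
    """Pick plain apply for quick jobs and a parallel backend for slow ones.

    The threshold is an assumed value for illustration only.
    """
    estimate = estimate_apply_seconds(series, func)
    return "pandas_apply" if estimate <= threshold_seconds else "parallel"


small = pd.Series(range(10))  # toy data
always_apply = choose_strategy(small, str, threshold_seconds=float("inf"))
always_parallel = choose_strategy(small, str, threshold_seconds=-1.0)
```

The two calls at the end pin the threshold to its extremes so the outcome does not depend on how fast the machine is; with a realistic threshold, small inputs would land on `"pandas_apply"` and very large ones on `"parallel"`.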

