Arithmetic Operations Inside a String with the Python eval() Function

Ketan Sahu
Plumbers Of Data Science
3 min read · May 10, 2021

Plenty of datasets arrive with messed-up data types. Columns that hold integers or floats are often stored as object or string. I ran into a similar problem: while transforming data, I had to add two columns and write the result to a new column. The problem was not as easy as it sounds. Neither column was numeric; the values were alphanumeric, so I could not simply convert the strings to integers.

Image created by Author

In this article, I will walk through that use case and show how to transform alphanumeric data and perform arithmetic operations on it with the help of the Python eval() function.

The Python eval() function evaluates a string as a Python expression, which lets us perform arithmetic operations written inside a string.
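
As a quick illustration of the idea, here is a minimal sketch. The expression string is made up for this example; it stands for 1 hour and 40 minutes expressed in minutes.

```python
# eval() takes a string and evaluates it as a Python expression.
expression = "1*60 + 40*1"  # illustrative value: 1 hour and 40 minutes, in minutes
minutes = eval(expression)
print(minutes)  # 100
```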

I have created a sample dataset to describe the use case. In it, the columns books_reading_time and tv_watching_time hold alphanumeric values such as '1H 40M'. See both columns per user below.

import pandas as pd

# Create a new example data frame
dataframe = pd.DataFrame({
    'user_name': ['Adam', 'Thomas', 'Ron', 'Luuk', 'Harman', 'Jonita', 'Melinia', 'Gracy', 'Johan'],
    'books_reading_time': ['1H 40M', '3M', '0', '2H', '1H', '3H', '3H 30M', '0', '2H'],
    'tv_watching_time': ['2H 20M', '45M', '10M', '33M', '2H', '0', '27M', '1H 23M', '2H 21M'],
})
dataframe
Example Data frame

You might think that we could convert the columns to a datetime data type, but that does not work either, because the format (%H %M) is not consistent throughout: some values have only hours, some only minutes, and some are just 0.
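
To make the inconsistency concrete, here is a rough sketch that uses the standard library's datetime.strptime as a stand-in for a column-wide conversion. The pattern '%HH %MM' is my assumption of what a matching format would look like; only values that contain both parts fit it.

```python
from datetime import datetime

samples = ['1H 40M', '3M', '0', '2H']  # values taken from the example columns
fmt = '%HH %MM'  # assumed pattern: hours, 'H', space, minutes, 'M'

parsed = {}
for value in samples:
    try:
        parsed[value] = datetime.strptime(value, fmt).time()
    except ValueError:
        parsed[value] = None  # the value does not match the pattern

print(parsed)
```

Only '1H 40M' parses; the other three values raise ValueError and end up as None.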

Hence, to add books_reading_time and tv_watching_time, I first use the replace() function to replace H with *60, M with *1, and the space ' ' with +. This turns each alphanumeric value into a string that holds an arithmetic expression in minutes; for example, '1H 40M' becomes '1*60+40*1'. This is how I do it.

# Replace 'H' with '*60'. Considering 1 hour = 60 min
# Replace 'M' with '*1'. Considering 1 minute = 1 min
# Replace space ' ' with '+'.
# regex=False treats each pattern as a literal string.
dataframe['books_reading_time'] = dataframe['books_reading_time'].str.replace('H', '*60', regex=False).str.replace(' ', '+', regex=False).str.replace('M', '*1', regex=False)
dataframe['tv_watching_time'] = dataframe['tv_watching_time'].str.replace('H', '*60', regex=False).str.replace(' ', '+', regex=False).str.replace('M', '*1', regex=False)

The replacements produce the data frame shown below. I display it here only for viewing purposes; you can apply the eval() function directly to the result of the replacements.

Replace Transformation
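
To see what the three replacements do to individual values, here is a small pure-Python sketch. The helper name to_expression is hypothetical and only used for illustration; it applies the same replacements in the same order to a single string.

```python
def to_expression(value):
    """Rewrite a duration such as '1H 40M' as an arithmetic expression in minutes."""
    # Hypothetical helper: same replacements, same order as in the article.
    return value.replace('H', '*60').replace(' ', '+').replace('M', '*1')

for raw in ['1H 40M', '3M', '0', '2H']:
    print(raw, '->', to_expression(raw))
```

A value like '0' contains none of the three characters, so it passes through unchanged and still evaluates to 0 later.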

Now, apply the eval() function to both columns and add the two columns together.

# Apply eval() to the 'books_reading_time' column.
dataframe['books_reading_time'] = dataframe['books_reading_time'].apply(eval)
# Apply eval() to the 'tv_watching_time' column.
dataframe['tv_watching_time'] = dataframe['tv_watching_time'].apply(eval)
# Add the books_reading_time and tv_watching_time columns.
dataframe['total_time'] = dataframe['books_reading_time'] + dataframe['tv_watching_time']
dataframe

The final result:

Total time

As you can see, eval() is a powerful function that lets us perform calculations even when the data is stored as strings.
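
Because eval() runs whatever expression it is handed, it can be worth checking that a string contains only the characters we expect before evaluating it. The sketch below is one possible way to do that, not part of the original workflow; the names ALLOWED and to_minutes are hypothetical.

```python
import re

# After the replacements, only digits, '*' and '+' are expected.
ALLOWED = re.compile(r'[0-9*+]+')

def to_minutes(value):
    """Convert a duration such as '1H 40M' to minutes, rejecting unexpected input."""
    expression = value.replace('H', '*60').replace(' ', '+').replace('M', '*1')
    if not ALLOWED.fullmatch(expression):
        raise ValueError(f"unexpected characters in {value!r}")
    return eval(expression)

# First row of the example data: Adam's reading time plus TV time.
total = to_minutes('1H 40M') + to_minutes('2H 20M')
print(total)  # 240
```

Under the same assumptions, a helper like this could be passed to apply() on the original string columns in place of the separate replace and eval steps.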
