Multiplication Using Machine Learning
Machine learning is being adopted in more and more industries every day. We hear about ML and AI everywhere, and I was intrigued by the field as well, so I started learning and experimenting with ML algorithms about a year ago.
One day I got curious about how these models would fare on a task like multiplication, so I created a dataset to test the performance of different models on it.
Data Preparation
import pandas as pd

# creating a dataframe with the numbers 1 to 1,000
df = pd.DataFrame()
df['first_digit'] = [x for x in range(1, 1001)]
df['pk'] = 1
# cross join the dataframe with itself on the constant key
df2 = df.merge(df, on='pk', how='outer').drop('pk', axis=1)
# renaming columns
df2.rename(columns={'first_digit_x': 'first_digit', 'first_digit_y': 'second_digit'}, inplace=True)
# creating target column (first * second)
df2['product'] = df2["first_digit"] * df2["second_digit"]
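The cross join of the 1,000-row frame with itself yields 1,000,000 (first_digit, second_digit) pairs. A quick sanity check:

print(df2.shape)  # expected: (1000000, 3)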

Train-Test Split of the Dataset
from sklearn.model_selection import train_test_split

X = df2.drop('product', axis=1)
y = df2['product']
# 80-20 split of the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
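With test_size=0.2 on 1,000,000 rows, this should leave 800,000 examples for training and 200,000 for testing:

print(X_train.shape, X_test.shape)  # expected: (800000, 2) (200000, 2)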
Training Models
I trained Linear Regression, Random Forest, LightGBM, and XGBoost on the dataset with default parameters, using 800,000 data points for training and 200,000 data points for testing.
%%time
import time
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from lightgbm import LGBMRegressor
from xgboost import XGBRegressor

# linear regression
lr_time_start = time.time()
lr = LinearRegression()
lr.fit(X_train, y_train)
lr_time_stop = time.time()
print(f"LR Model Trained: {lr_time_stop - lr_time_start}")

# random forest
rf_time_start = time.time()
rf = RandomForestRegressor()
rf.fit(X_train, y_train)
rf_time_stop = time.time()
print(f"RF Model Trained: {rf_time_stop - rf_time_start}")

# lgbm
lgbm_time_start = time.time()
lgr = LGBMRegressor()
lgr.fit(X_train, y_train)
lgbm_time_stop = time.time()
print(f"LGBM Model Trained: {lgbm_time_stop - lgbm_time_start}")

# xgboost
xgb_time_start = time.time()
xgb = XGBRegressor()
xgb.fit(X_train, y_train)
xgb_time_stop = time.time()
print(f"XGB Model Trained: {xgb_time_stop - xgb_time_start}")
Results
Now let's measure the performance of these models using RMSE (root mean squared error) and MAPE (mean absolute percentage error).
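Here is a minimal sketch of how both metrics can be computed for the four fitted models with scikit-learn, using the lr, rf, lgr, and xgb objects trained above:

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

# evaluate each fitted model on the held-out test set
models = {'lr': lr, 'rf': rf, 'lgbm': lgr, 'xgb': xgb}
for name, model in models.items():
    preds = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, preds))
    mape = mean_absolute_percentage_error(y_test, preds)
    print(f"{name}: RMSE={rmse:.2f}, MAPE={mape:.4f}")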

Random Forest gave the best performance among the four models. Let's have a look at the output for 10 records; a sketch of how such a comparison can be built follows below.
(lr- Linear Regression, rf- Random Forest, lgbm- Light Gradient Boosting Machine, xgb- XGBoost)
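One way to produce this kind of side-by-side view: sample 10 test records and collect each model's prediction next to the true product (the comparison dataframe and its column names here are illustrative):

# sample 10 test records and compare predictions with the true product
sample = X_test.sample(10, random_state=42)
comparison = sample.copy()
comparison['product'] = y_test.loc[sample.index]
comparison['lr'] = lr.predict(sample).round(2)
comparison['rf'] = rf.predict(sample).round(2)
comparison['lgbm'] = lgr.predict(sample).round(2)
comparison['xgb'] = xgb.predict(sample).round(2)
print(comparison)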

As we can clearly see, Random Forest was able to produce outputs very close to the true products. LGBM and XGBoost are a little off, while Linear Regression was not able to capture the multiplication logic at all. This is expected: a linear model can only fit a function of the form a*first + b*second + c, and the product first*second cannot be expressed that way, whereas tree-based models can approximate it piecewise over the training range.
You can view the notebook here.