Speed vs Accuracy in predictive modeling

Dealing with Categorical Variables in Machine Learning

Real-world data issues and machine learning model

Sarit Maitra
The Startup
Published in
11 min readJun 10, 2020

--

Image by author

https://sarit-maitra.medium.com/membership

A critical step in predictive modeling is the choice of specific learning algorithm. Well, this could be an iterative process which by means of trying different algorithms against a naive/base accuracy score; but an information on right algorithms always helps in business to use on a specific use case. Once preliminary testing is judged to be satisfactory, the classifier is available for routine use. The classifier’s evaluation is often based on prediction accuracy.

Here is a real world issue; real world data often involve discrete variables (e.g., categorical variables). From an analytical perspective, these variables determine the definition of the objective and constraint functions, as well as the number and type of parameters that characterize the problem. Moreover, the inherent discrete and potentially non-numerical nature of these variables posses several challenges when applying machine learning algorithm. Although categorical and dimensional variables often lack a conceptual numerical representation, it is common practice to assign a real or integer value to every considered…

--

--