Elements of a Machine Learning Model

From data to model

Parijat Bhatt

Published in Analytics Vidhya

3 min read · Dec 25, 2018

1. Data

Since Machine Learning is about making predictions from data, the size (number of instances) of the data and its source can be decisive in building a model that achieves the objective. Very noisy data can be difficult to learn from even for the best algorithms and may lead to over-fitting, as the model may fit the noise rather than the underlying pattern. If the data set is too small, the model may fit the training set well but generalize poorly to unseen data. Another important factor is the number of dimensions. Often one needs techniques such as PCA or a non-linear compressed representation to reduce the number of dimensions in the data.
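To make the idea of dimensionality reduction concrete, here is a minimal sketch of PCA on two-dimensional points using only the Python standard library. The function names and the toy data are assumptions made for illustration; in practice one would normally rely on a library implementation.

```python
import math

def pca_first_component(points):
    """Return the unit direction of greatest variance for 2-D points, plus the mean."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta)), (mean_x, mean_y)

def project(points, direction, mean):
    """Compress each 2-D point to one coordinate along `direction`."""
    return [
        (p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
        for p in points
    ]

# Hypothetical toy data lying roughly along the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
direction, mean = pca_first_component(points)
compressed = project(points, direction, mean)  # one number per point instead of two
```

Because the toy points lie close to a line, a single coordinate along the first principal direction keeps most of the information in the original two features.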

2. Objective or Problem you are trying to solve

It is important to know what exactly we are trying to achieve by building our Machine Learning model. Do we have a supervised learning problem, are we simply trying to find a pattern in data (unsupervised learning), or do we have a robot whose actions we are trying to optimize (reinforcement learning)? We also need to know whether we have a classification or a regression task at hand. Knowing the problem is essential to selecting a suitable model.

3. The Algorithm required to solve the problem

The algorithm required to solve the problem depends on a) the structure in the data and b) whether it's a classification task or a regression task. For linearly separable data, a linear SVM could be used; for data that is not linearly separable, one could instead use an SVM with a non-linear kernel, CART (Decision Tree) for classification, or a neural network generating scores for each class. A lot of algorithms that are used for classification may also be used for regression and vice-versa. Many of the tasks that we come across are classification related.
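To see why the structure of the data matters when picking an algorithm, the sketch below trains a perceptron on two tiny data sets. The perceptron is used here only as a simple stand-in for a linear classifier (it is not the SVM named above), and the function names and data are assumptions for illustration.

```python
def train_perceptron(samples, epochs=50):
    """Train a perceptron on ((x1, x2), label) pairs with labels 0 or 1."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            predicted = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
            error = label - predicted  # 0 when correct, +1 or -1 otherwise
            w1 += error * x1
            w2 += error * x2
            b += error
    return w1, w2, b

def accuracy(samples, params):
    """Fraction of samples the linear rule classifies correctly."""
    w1, w2, b = params
    correct = sum(
        (1 if w1 * x1 + w2 * x2 + b > 0 else 0) == label
        for (x1, x2), label in samples
    )
    return correct / len(samples)

# AND is linearly separable; XOR is not.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
and_acc = accuracy(and_data, train_perceptron(and_data))
xor_acc = accuracy(xor_data, train_perceptron(xor_data))
```

The linear rule classifies every AND example correctly, but no straight line can separate the XOR labels, so its accuracy there stays below 100%. That is the situation in which a non-linear model such as a decision tree or a neural network becomes the better choice.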

4. Loss function and Optimization method

Finally, once we know which model would be a good fit for the data, we need to choose a loss function that reflects our objective and an optimization algorithm that can minimize it within our time and hardware constraints. The optimization algorithm could be simple (batch) Gradient Descent, Stochastic Gradient Descent, mini-batch Gradient Descent, RMSprop, Adam, etc. We need to choose one that is simple enough and at the same time provides good results.
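As a rough illustration of how a loss function and an optimizer fit together, the following sketch minimizes a mean squared error loss with mini-batch Gradient Descent for a one-feature linear model. The learning rate, batch size, number of epochs and toy data are all assumed values chosen for this example.

```python
import random

def minibatch_gd(xs, ys, lr=0.05, batch_size=4, epochs=1000, seed=0):
    """Fit y = w * x + b by minimizing mean squared error with mini-batch GD."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the data in a new order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the mean squared error over the current batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [3.0 * x + 1.0 for x in xs]  # noise-free toy data with true w = 3, b = 1
w, b = minibatch_gd(xs, ys)
```

Setting `batch_size=1` turns this into Stochastic Gradient Descent, and setting it to `len(xs)` gives plain (batch) Gradient Descent, so the same loop covers the three simplest options listed above.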

5. Suitable Regularization

Many times an additional term is introduced in the vanilla loss function to get better results. This additional term is a regularization term. According to Wikipedia,

In mathematics, statistics, and computer science, particularly in machine learning and inverse problems, regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting.

Depending on the objective and the model, one needs to find a suitable regularizer, as different regularizers have different impacts on the optimization. The regularizer can be the L1 norm, the L2 norm, the Frobenius norm (for matrices), a gradient-norm penalty (as used in some GANs), KL-divergence, etc.
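A small sketch can show what adding a regularization term does in the simplest case: an L2 penalty on a model with a single weight and no intercept. The function names, the penalty strength `lam` and the data are assumptions made for illustration.

```python
def l2_regularized_loss(w, xs, ys, lam):
    """Sum of squared errors plus an L2 penalty lam * w**2 on the single weight."""
    data_loss = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    return data_loss + lam * w ** 2

def best_weight(xs, ys, lam):
    """Closed-form minimizer of the loss above for the model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
w_plain = best_weight(xs, ys, lam=0.0)  # no regularization
w_ridge = best_weight(xs, ys, lam=5.0)  # L2 penalty pulls the weight toward zero
```

The penalty appears in the denominator of the closed-form solution, so a larger `lam` always shrinks the weight toward zero. An L1 penalty behaves differently (it can set weights exactly to zero), which is one reason the choice of regularizer matters.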

6. Hyper-parameters used

Choosing the best set of hyper-parameters may be the most important aspect of training. Hyper-parameters are settings that are not learned from the data during training but are fixed beforehand, for example the learning rate or the regularization strength. Their values are selected by training the model with different candidate values and comparing the results on a part of the dataset held out for this purpose (Holdout Validation) or by using cross-validation. Different hyper-parameters produce different effects when increased or decreased, and it's always good to have an idea of how a hyper-parameter would affect performance on the training set as well as the test set.
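The selection procedure can be sketched with cross-validation on a toy problem. Here the hyper-parameter is the number of neighbours `k` in a k-nearest-neighbour classifier; the classifier, the candidate values, the fold scheme and the data are all assumptions chosen to keep the example short.

```python
def knn_predict(train, x, k):
    """Majority label among the k training points closest to x (1-D feature)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def cross_val_accuracy(data, k, folds=4):
    """Average validation accuracy of k-NN over `folds` interleaved folds."""
    scores = []
    for f in range(folds):
        val = data[f::folds]  # every folds-th point is held out
        train = [pair for i, pair in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, x, k) == label for x, label in val)
        scores.append(correct / len(val))
    return sum(scores) / folds

# Toy data: label tends to be 1 for larger x, with a few noisy labels mixed in.
data = [(0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0), (2.5, 0), (3.0, 0),
        (3.5, 1), (4.0, 1), (4.5, 0), (5.0, 1), (5.5, 1), (6.0, 1)]
candidates = [1, 3, 5]  # odd values avoid tied votes with two classes
cv_scores = {k: cross_val_accuracy(data, k) for k in candidates}
best_k = max(candidates, key=lambda k: cv_scores[k])
```

Each candidate value is scored only on points the model did not see while fitting, which is what makes the comparison between hyper-parameter values meaningful.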

7. Evaluation of a model

Different hyper-parameters and different algorithms result in different models. Thus, we need to choose the one that best serves our purpose. How do we do that? By evaluating how well each model performs on a part of the data set that was set aside specifically for this purpose and not used for training.
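As an illustration of comparing models on held-out data, the sketch below computes accuracy together with precision and recall, two common metrics that are not named above but are often reported alongside accuracy. The labels and the predictions of the two models are made up for this example.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels and predictions for a held-out test set.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 0, 1, 0]
model_b = [1, 1, 1, 1, 0, 1, 1, 0]
report_a = evaluate(y_test, model_a)
report_b = evaluate(y_test, model_b)
```

In this made-up case model A has the higher accuracy while model B finds every positive example, so which one "best serves our purpose" depends on which kind of error matters more for the objective.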

Contact

For any suggestions or typos, please send an email to bhattpa@oregonstate.edu. Connect with me on LinkedIn or follow me on Medium for more of my upcoming posts.
