5 Tactics to teach a novice about Machine Learning!

Rajendra Mishra
7 min read · Nov 18, 2016

How do you teach a novice the most complex thing in the world, “Machine Learning”? If you too are afraid of Machine Learning, do not worry. Much of the complexity is created by the people who pioneered the field, so as to raise the barrier to entry. The people, or more specifically the organizations, who have recognized this are competing with those pioneers.

So, I am here to answer the most basic but also the toughest question of mankind: how to teach a novice about Machine Learning. I taught for almost two years, and I handled all these scenarios while I was active as a tutor. This post may be slow-paced and pretty basic, as my target audience is the novice. I am going to use all my teaching tactics and experience in this post to make the most complex thing on earth the simplest one.

Tactic 1 :

Let’s flush everything that we know. Start fresh. Develop insights.

Tactic 2 :

Use examples. Pick the best possible example for each and every case that you want to present.

Tactic 3 :

“Technical jargon”. Avoid it initially. These are the terms that I referred to above as the complexity created by the inventors. :D

Tactic 4 :

Create an environment where everyone, including the tutor, is on the same page. No one knows anything.

Tactic 5:

Now, while driving your lecture, make your audience think. Let them visualize. Help them in developing perception.

The tactics mentioned above are the ones I have used successfully to engage with my students, create a friendly environment and help them understand things better. If you have understood them, you will recognize them in the rest of this post.

OK! So let’s start. Let’s imagine that there is a kid who doesn’t have any idea about fruits, but he knows of a farm where there are many apple and orange trees. You are a lazy lad, so you ask that kid to go to the farm and bring apples. Given that the kid does not know what an apple looks like, what possibilities come to your mind for getting the apples from the farm?

Guess??

Ok let’s point out the possibilities:

1) Instead of sending the kid, you go to the farm and bring the apples yourself.

2) Give the boy a checklist of features so that he can identify what an apple looks like and bring the apples.

Scenario 1 : Not possible! I told you, you are lazy, right?

Scenario 2 : Yes! Can be done. Feasible as well.

So let’s consider scenario 2. You tell the kid that there are two kinds of fruit on the farm, apples and oranges, and you give him a checklist listing some features that a fruit may have. The checklist is:

[Table: A simple representation of features]

As shown in the table above, you hand the checklist to the kid and ask him to go to the farm and bring apples. Now what do you think the kid will do? He will go to the farm and check the fruits there against the checklist that you provided. He will try to find the fruit whose attributes best match the description given in the checklist.

OK, now if you have followed the example correctly, can you list the challenges that the kid may face?
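The matching step that the kid performs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: apart from the Dark Red color of the apple, the attribute values below are made up for the example, and `count_matches` and `best_match` are hypothetical helper names.

```python
# Hypothetical checklist: only "dark red" for the apple comes from the post;
# every other value is an assumption made for illustration.
CHECKLIST = {
    "apple": {"color": "dark red", "grows_on": "tree", "texture": "smooth"},
    "orange": {"color": "orange", "grows_on": "tree", "texture": "bumpy"},
}


def count_matches(observed, expected):
    """Count how many checklist attributes agree with what the kid observes."""
    return sum(1 for name, value in expected.items() if observed.get(name) == value)


def best_match(observed, checklist=CHECKLIST):
    """Return the fruit whose checklist entry agrees with the most attributes."""
    return max(checklist, key=lambda fruit: count_matches(observed, checklist[fruit]))


# A fruit the kid finds on the farm: lighter than "dark red", but smooth.
found = {"color": "red", "grows_on": "tree", "texture": "smooth"}
print(best_match(found))
```

Notice that "grows_on" matches both fruits, so it does not help the kid decide, and the color fails to match at all because the apple he found is not dark.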

Guess?

OK, here are a few:

1) The checklist describes the color of an apple as Dark Red. That is the usual color, but the apples on the farm may not be dark in complexion. There may even be rare cases where an apple has almost the same color as an orange.

2) Both grow on trees! What? A lot of confusion, right?

3) Seeds: oh, that’s great. They identify the fruit pretty well. But what if the fruits on the farm do not have seeds? Again, a rare case.

4) Texture: oh, that’s a good attribute.

5) Both fruits may have the same shape and size.

6) And even the identification of all these attributes depends on the kid’s understanding.

Oh, guess what? With this example we have covered Machine Learning and its challenges. A Machine Learning model is like that kid. We provide the model with the kind of data that we see in the table above. The model then tries to find the best possible match between the attributes it identifies in a fruit and the checklist that you have provided.

The case described above represents Supervised Machine Learning.

So let’s relate this to Machine Learning. The kid corresponds to a specific model in Machine Learning. The attributes that we defined in the table above to create the checklist are represented as a vector, generally referred to as an attribute vector or feature vector. A specific task may need any number of attributes. So if we have some ‘d’ features or attributes to represent our knowledge about a fruit, as in the example above, then we are referring to a vector in d-dimensional space. The same holds for every fruit that we may want to identify, be it an apple, orange, grape or pear: each one is represented as a vector in d-dimensional space. The space that these vectors belong to is called the feature space.

OK, so enough technicality. Let me ask you a question. Suppose you are too lazy even to prepare the checklist of attributes, so you do not want to give the kid a checklist either. How are you going to handle the situation?
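To make the idea of a feature vector concrete, here is a small sketch that turns a fruit description into a list of numbers. The feature names, the numeric encoding and the helper `to_feature_vector` are all assumptions chosen for illustration, and the measurements are made up; a real task would pick its own features.

```python
# Hypothetical features; d is simply how many of them we decide to use.
FEATURE_NAMES = ["redness", "diameter_cm", "skin_roughness"]


def to_feature_vector(fruit):
    """Turn a fruit description (a dict) into a d-dimensional list of numbers."""
    return [float(fruit[name]) for name in FEATURE_NAMES]


# Made-up measurements, not real data.
apple = {"redness": 0.9, "diameter_cm": 7.5, "skin_roughness": 0.1}
orange = {"redness": 0.3, "diameter_cm": 8.0, "skin_roughness": 0.7}

apple_vector = to_feature_vector(apple)
orange_vector = to_feature_vector(orange)

d = len(FEATURE_NAMES)  # the dimension of the feature space
print(d, apple_vector, orange_vector)
```

Every fruit ends up as a list of the same length d, which is what lets us treat them all as points in one shared feature space.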

Think!

Ok, let me give you a possible scenario. You have some sample fruits at home. Instead of preparing the checklist, hand over the sample fruits to the kid and tell him to identify the attributes himself. Leave the kid with the sample fruits for five to six hours. Once the kid has spent enough time with the samples, tell him which of them are apples and which are oranges. Now the kid will have some understanding of the attributes that he should look for to identify an apple or an orange.

Can you identify the possible challenges the kid may face?

1) Whether the kid brings apples or oranges depends entirely on the understanding that he developed from the sample fruits given to him.

2) It is possible that the sample fruits given to him were not the best representatives of the general fruit class.

3) The kid may have biases towards some of the fruits, which may hamper his choice of attributes.

Oh wait! You know what? We have now covered two important methods of Machine Learning: Supervised and Unsupervised machine learning. Let’s relate them to the example. Providing the kid with only the sample fruits, so that he identifies the attributes or features of the fruit classes himself, is unsupervised machine learning: you are given just the sample data, and from that data you need to identify the generalized attributes. So Unsupervised Learning is commonly used to learn a feature representation; it also has many applications in Clustering and Topic Modeling. The next phase, where after five to six hours you tell the kid which of the sample fruits are actually apples and which are oranges, is Supervised Machine Learning. So in this case we used a hybrid approach: the kid learned the features on his own (Unsupervised), and then we provided him with the respective class labels, i.e. apple or orange (Supervised).

What do people generally do once they have the feature vectors, whether specified by hand as in the first example or built from attributes that the model identified through Unsupervised learning as in the second? Let’s start here: if I say that the feature vectors we have are points in d-dimensional space, do you agree with that? Are you able to visualize it? Ok, let me help you with that.

Let’s say that for some task we have just 1 feature. In that case we will have a 1-dimensional vector. The feature space will be a 1-dimensional space, like the number line, so a 1-dimensional feature vector is a point on that line. In other words, the position of the point in the feature space is given relative to just one axis. Similarly, in a 2-dimensional feature space a 2-dimensional vector is a point whose position in the plane depends on two axes, X and Y. In the same way, in d-dimensional space a d-dimensional feature vector is a point whose position depends on d axes.
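The hybrid approach can be sketched roughly as follows. The post does not name an algorithm, so I am assuming a tiny k-means-style clustering with two groups for the unsupervised part; the sample numbers (redness, skin roughness) and the helper names are made up for illustration.

```python
import math


def distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Average point of a non-empty group."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]


def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))


def two_means(points, iterations=10):
    """Unsupervised step: split the samples into two groups, no labels used."""
    centers = [points[0], points[-1]]  # simple, deterministic starting guess
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Made-up sample fruits: [redness, skin_roughness].
samples = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
           [0.3, 0.7], [0.2, 0.8], [0.25, 0.75]]
centers = two_means(samples)

# Supervised step: we tell the kid what one sample from each group is.
labels = {
    nearest_center(samples[0], centers): "apple",
    nearest_center(samples[3], centers): "orange",
}

new_fruit = [0.7, 0.3]
print(labels[nearest_center(new_fruit, centers)])
```

The challenges listed earlier show up here too: if the samples are poor representatives, the two groups will not line up with apples and oranges, and the labels will end up attached to the wrong things.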

Now I will take the example of the Classification task in Machine Learning. In Classification, people usually try to fit a line or curve (more generally, a boundary) in the d-dimensional space so that the data points in that space are separated into their respective classes. If you want a discrete value, either 0 or 1, a linear classifier can simply report which side of the line a point falls on. If you want a continuous value between 0 and 1 and then decide the class based on some threshold, pass the classifier’s score through a squashing function such as the sigmoid (for two classes) or softmax (for more than two). Note that sigmoid and softmax are functions applied to a score rather than classifiers in themselves; the boundary can still be a straight line.
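The two kinds of output described above, a discrete 0 or 1 and a continuous value between 0 and 1 that is then thresholded, can be sketched like this. The weights, the bias and the two features (redness, skin roughness) are assumptions picked by hand for the illustration; in practice they would be learned from labeled samples.

```python
import math

# Hypothetical, hand-picked parameters for two features
# (redness, skin_roughness); a real model would learn these.
WEIGHTS = [4.0, -4.0]
BIAS = 0.0


def score(x):
    """Weighted sum of the features: the line w . x + b in feature space."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def hard_label(x):
    """Discrete output: 1 (apple) on one side of the line, 0 (orange) on the other."""
    return 1 if score(x) > 0 else 0


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_label(x, threshold=0.5):
    """Continuous output between 0 and 1, then a threshold to pick the class."""
    probability = sigmoid(score(x))
    return probability, (1 if probability >= threshold else 0)


fruit = [0.9, 0.1]  # made-up feature vector
print(hard_label(fruit), soft_label(fruit))
```

With a threshold of 0.5 both versions pick the same class, because the sigmoid of a positive score is always above 0.5. The continuous value simply tells you, in addition, how far the point is from the line.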

Disclaimer : This is a personal web blog. The opinions expressed here represent my own and not those of my employer.
