Decision Trees — Day 10 #100DaysOfMLCode

SciJoy
Sep 2, 2018 · 2 min read
“landscape photography of splitted road surrounded with trees” by Oliver Roos on Unsplash

Saving Data

You have a phone with a limited data plan that only sends messages in binary. Every night you want to tell your friend where to meet you, which could be your house, the park, the movies, or your favorite restaurant. You could send 00 for your house, 01 for the park, 10 for the restaurant, or 11 for the movies. Every message then costs two bits, no matter how likely the destination is, which wastes your data plan.

You go to the movies only 10% of the time. There is a 40% chance you will meet at your house, a 20% chance you will go to the park, and a 30% chance of going out to eat. Since you are most likely to meet at your house, this can be sent as a 0. If you aren't staying in, the next decision to make is whether or not you will get food. If yes, then you can send a 10. If not, you need to decide whether you are going to the park. A yes is a 110 and a no is a 111 (to the movies!).
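The code described above can be tried out with a short sketch. The bit strings come from the text; the names `CODES`, `encode`, and `decode` are made up for illustration. It shows that because no code is the start of another code, a run of messages can be read back without any separators.

```python
# Codes from the text: the likelier the place, the shorter the code.
CODES = {"house": "0", "restaurant": "10", "park": "110", "movies": "111"}


def encode(places):
    """Join the codes for a sequence of places into one bit string."""
    return "".join(CODES[place] for place in places)


def decode(bits):
    """Read bits left to right, emitting a place whenever a full code is seen."""
    lookup = {code: place for place, code in CODES.items()}
    places = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:
            places.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ended in the middle of a code")
    return places


week = ["house", "movies", "restaurant", "house"]
bits = encode(week)
print(bits)
print(decode(bits))
```

Reading the bits one at a time is the same as walking the yes/no questions in order: staying in, then food, then the park.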

Your most common message is now a single 0, one bit instead of the original two for 00. Only rarely will you send the three-bit 111 for the movies. On average a message costs 0.4 × 1 + 0.3 × 2 + 0.2 × 3 + 0.1 × 3 = 1.9 bits instead of 2, so over time this will save you data.
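As a rough check on the savings, here is a minimal sketch that computes the average number of bits per message for this code. It uses the probabilities and code lengths from the text, and the variable names are illustrative. It also computes the Shannon entropy of the four probabilities, which is the lower limit on the average bits per message for any code of this kind.

```python
import math

# Probabilities and code lengths taken from the text.
PROBS = {"house": 0.4, "restaurant": 0.3, "park": 0.2, "movies": 0.1}
CODE_LENGTHS = {"house": 1, "restaurant": 2, "park": 3, "movies": 3}

# Average bits per message with the variable-length code.
expected_bits = sum(PROBS[place] * CODE_LENGTHS[place] for place in PROBS)

# Shannon entropy in bits: no code can average fewer bits than this.
entropy = -sum(p * math.log2(p) for p in PROBS.values())

print(f"fixed two-bit code: 2.00 bits per message")
print(f"variable-length code: {expected_bits:.2f} bits per message")
print(f"entropy lower limit: {entropy:.2f} bits per message")
```

The variable-length code lands between the fixed two-bit code and the entropy limit, so it saves data without being able to do much better.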

Making Choices

The points in this graph are not linearly separable, but we can draw some decision lines that will isolate the green points from the red. The goal is to get the most information out of every decision.

We could place the first split at y = 1. This will ensure that everything below 1 is green. We will have to make additional decisions about the points above that line. We could also use y = 6. This split would tell us that everything above y = 6 is red. There are more data points above y = 6, which means this is the better first split. Then we can split at x = 4. If a point has y less than 6 and x less than 4, then it is green. If it does not, then we have another choice to make. Our last split can be at x=1. Now we have a decision tree for splitting red and green data.
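The first two splits can be written as nested questions. This is only a sketch: the thresholds 6 and 4 come from the text, while the function name `classify`, the treatment of points that sit exactly on a split line, and the "undecided" label for the region that still needs the last split are assumptions, since the plotted data is not reproduced here.

```python
def classify(x, y):
    """Walk the first two splits described in the text, top to bottom."""
    # First split: everything above y = 6 is red.
    if y > 6:
        return "red"
    # Second split: below y = 6 and left of x = 4 is green.
    if x < 4:
        return "green"
    # The remaining region needs one more split; its colors depend on the
    # plotted points, so this sketch (an assumption) leaves it unlabeled.
    return "undecided"


print(classify(2, 7))
print(classify(2, 3))
print(classify(5, 3))
```

Each `if` is one node of the tree, and each `return` is a leaf.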

Udacity — Intro to Machine Learning, Sebastian and Katie

Coursera — Machine Learning, Andrew Ng

Machine Learning, Stephen Marsland (2015)

Claude Shannon — Father of the Information Age

[https://www.youtube.com/watch?v=z2Whj_nL-x8&t=203s]

Information entropy | Journey into information theory | Computer Science | Khan Academy
