ai like im 5: models, parameters, and choosing the right model (article 8)

ai like im 5

8 min read · Jan 7, 2024

before i start, a couple things:

out of the loop? — https://medium.com/@ailikeim5/list/ai-like-im-5-in-order-87ef4064afe8
this content is not aimed at 5 year olds, it is just meant to be simple.

prerequisite knowledge:

a. basic understanding of data

b. good understanding of ai and machine learning

c. basic understanding of some data assumptions

d. good understanding of training and validating!

ai is like a secret handshake, you can get easily confused if you are out of the loop! these are some of the hardest concepts alive and i wish i could skip steps, but please read my articles on these topics if you don’t understand!!!

anyways,

today we are going to talk more about models and take a deeper dive into them.

i’ve briefly explained models before as

special algorithms that take our data and provide insights into the data
and that machine learning models are designed to identify patterns and relationships within data and learn from them!

and this could be

predicting the weather based on the weather history
deciding whether the image is a cat or a dog
recognizing the title of a song based on humming

but before we dive into specific models, let’s learn some more!

models

a model is the brain of our artificial intelligence!
it is a mathematical representation of a real world process
we can conceptualize them as functions!

here is an illustration of the function behind supervised learning

this model learns by identifying patterns and relationships in the data and predicting an output
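to make "a model is a function" concrete, here is a minimal sketch in python. the names model, weight, and bias and the temperature numbers are made up for illustration, they are not from any library or from a real weather model:

```python
# a tiny "model": a function that takes an input and returns a prediction.
# weight and bias are hypothetical names chosen for this sketch.
def model(x, weight, bias):
    return weight * x + bias

# hypothetical use: guess tomorrow's temperature from today's temperature
today = 20.0
prediction = model(today, weight=0.9, bias=3.0)
print(prediction)
```

the input goes in, a prediction comes out. what the model predicts depends entirely on the numbers we plug in for weight and bias.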

recall from the last article the process of training and validation:

models will make predictions and learn from them (training)

and our ultimate goal is so that it can perform well in validation!

this is the key to performing well in real life situations!

but how does it adjust itself in the training:

parameters

parameters are the variables of our functions or models
meaning these are the values inside the model that get adjusted during our training process
and they determine the behavior and predictions made by the model in both training and validation
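here is a rough sketch of what "adjusting parameters during training" can look like. it assumes a straight line model and uses gradient descent, which is one common way to do it and not the only one. the data, the function names, and the learning rate are all made up for this example:

```python
# toy data that follows y = 2x + 1 (made up for this sketch)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def predict(x, weight, bias):
    return weight * x + bias

def mean_squared_error(weight, bias):
    # how wrong the model is, on average, with these parameters
    errors = [(predict(x, weight, bias) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

def train(steps=2000, learning_rate=0.05):
    weight, bias = 0.0, 0.0  # start with bad parameters
    n = len(xs)
    for _ in range(steps):
        # gradients: which direction makes the error bigger for each parameter
        grad_w = sum(2 * (predict(x, weight, bias) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(x, weight, bias) - y) for x, y in zip(xs, ys)) / n
        # nudge each parameter a little in the opposite direction
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

weight, bias = train()
print(weight, bias, mean_squared_error(weight, bias))
```

the model (a straight line) never changes here. only the parameters change, and they move from a bad guess toward values close to 2 and 1, which is the pattern hidden in the toy data.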

if machine learning is like making a cake:

data is the ingredients of the cake
models are the recipe

you can think of parameters as the individual steps and execution of that recipe

good steps and execution = good cake
bad steps and execution = bad cake

for ml, even if all other factors are perfect:

good parameters = good model
bad parameters = bad model

so if we can find

quality data
an optimal model
and the optimal parameters of that model

we can solve many real life problems

but there’s a catch:

sometimes finding quality data is hard
sometimes finding the optimal model is hard
and even after we have the optimal data and model, finding the optimal parameters for that model can be hard and time consuming.

picking the optimal model

the best comparison for picking the optimal model is golf

in golf, a player can have many different clubs

each club is designed for a specific type of shot:

a driver is designed for the really long ones
a putter is designed for the really short ones
and we have irons, woods, wedges, and chippers
- these provide greater control and distance for everything in between

here is a visualization of this from one of my favorite driving ranges!

the choice of club depends on the situation, personal preference, and more!

there are many different factors that go into the game, but most people and almost all pro golfers play the same way, meaning they all exhibit a similar playing style!

but there is not an exact science to it:

let me tell you a story about happy gilmore!

happy is a man that dreams of being a pro ice hockey player

but there’s just one problem: he’s a really bad ice skater.

and one day, he discovers he might be good at something else

so happy becomes a professional golfer… but happy’s hockey persona doesn’t exactly fit in:

the way he drives the ball
the way he speaks and acts
he has some anger issues

and the fact that he uses a hockey stick instead of a putter

but happy ends up being a successful pro golfer and wins the golf championship.

the takeaway:

golf is not an exact science: there are many different techniques, environmental factors, and things that go into it.
you can spend your whole life becoming a better golfer, and then a random hockey player who uses a hockey stick as a putter becomes the best golfer in the world
because at the end of the day, there is one thing that matters in the sport of golf: who can hit the ball in the hole the best!

connection to machine learning:

machine learning, data science, and deep learning are not an exact science
there is no universal handbook telling you to use this exact model for this problem
but our ultimate goal is to solve the problem to the best of our abilities!

we can best accomplish this by having context and intuition into

1. the problem

understand the problem or task: what are you trying to accomplish?
find experiences with similar or the same problems
-> the internet is a wonderful place

do not reinvent the wheel!

there’s a reason almost all pro golfers play the same way: it performs well at the highest level and it has been played that way for many years!
-> there’s no need to use a hockey stick as a putter!
but if you have more experience with a hockey stick as a putter and you are winning the masters, go ahead happy!
if you have more experience with that model, and that model provides you better performance than the norm, go ahead!

2. the relationships in the data and the data

just because your problem and data are unique (no one has ever solved it before)
-> does not mean the relationships and patterns in the data are unique
there are models for all different types of relationships, types of data, and more!

3. research! research! research!

if you are working with computer vision (pictures, videos, etc), before you even pick a model, you should have an in depth understanding of vision data, your task, and what computer vision models really do.
there are thousands of computer vision models and it is impossible to pick one without knowing this!
because ai is not an exact science, the field is constantly changing
-> the best computer vision model in 2015… is not the best computer vision model in 2023!

and one more thing… experimentation:

trial and error is the key

if we have a way to measure the performance of a bunch of different models that fit our situation
-> we can determine what the best one is
i will explain more about how to measure this in the next article!
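as a small preview, here is one way that kind of experiment could be sketched. everything in it is an assumption for illustration: the data is made up, the two candidate models are just "always guess the average" and "fit a straight line", and the score is mean squared error on data the models did not train on:

```python
# made-up data that roughly follows a straight line
train_data = [(0, 2.1), (1, 4.9), (2, 8.2), (3, 10.8), (4, 14.1)]
validation_data = [(5, 17.0), (6, 19.9)]

def fit_mean(data):
    # candidate 1: ignore x and always guess the average y
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_line(data):
    # candidate 2: least squares straight line
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

candidates = {"always guess the average": fit_mean, "straight line": fit_line}

# train every candidate on the training data, score it on the validation data
scores = {name: mean_squared_error(fit(train_data), validation_data)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("best:", best)
```

the important part is the shape of the experiment: same data, same measurement, different models, and then we pick the one with the best score on data it has not seen.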

note:

experimentation is not only about finding out what works
it is also about why certain things act or behave a certain way
experimentation should not be random!
there should be a reason and methodology behind everything in ai and data science

there is no universally right or wrong model but there are very good, good, bad and very bad models, depending on the situation!

every model has tradeoffs:

note before you read: these tradeoffs are not the absolute truth

these are general guidelines that help aid model selection and help us understand model behavior
they are true in the vast majority of cases and have deep roots in years of statistics, ai, and machine learning
but there are exceptions sometimes and sometimes a certain dataset, problem or real world application will defy these tradeoffs

1. complexity and interpretability
-> as things get more complex and better, the behavior becomes harder to understand and recreate

we are having a hard time understanding some of the behaviors of the most complex ai’s: they make decisions and actions we don’t get
-> we call these emergent behaviors

2. underfitting and overfitting
-> because machine learning models learn patterns, as they get more complex they can start learning irrelevant ones called noise (overfitting), and if they are too simple they miss the real patterns (underfitting)

note: underfitting and overfitting are not simple, please read the article if you do not understand them! this is still an important and funny visualization though

we are really only interested in the behavior of models to new, unseen data and the real world!

our goal is to find the just right fit!
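here is a small sketch of that idea. it uses a k-nearest-neighbors style predictor, which is my pick for the illustration because one number (k) controls how flexible it is. the data is made up (a straight line pattern plus random noise), and all the names are hypothetical:

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def noisy_point(x):
    # made-up pattern (y = 2x) plus random noise
    return x, 2 * x + random.gauss(0, 2.0)

train_data = [noisy_point(i * 0.25) for i in range(60)]
validation_data = [noisy_point(i * 0.25 + 0.125) for i in range(60)]  # new, unseen points

def knn_predict(x, k):
    # average the y values of the k training points closest to x
    nearest = sorted(train_data, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_squared_error(data, k):
    return sum((knn_predict(x, k) - y) ** 2 for x, y in data) / len(data)

# k = 1 memorizes the training data, noise included (overfitting)
# k = all points ignores x and always guesses the average (underfitting)
for k in (1, 5, len(train_data)):
    print("k =", k,
          "| train error:", round(mean_squared_error(train_data, k), 2),
          "| validation error:", round(mean_squared_error(validation_data, k), 2))
```

with k = 1 the training error is exactly zero, because every training point is its own nearest neighbor, but that says nothing about new data. with k set to every point the model is too simple to see the pattern at all. something in between is usually closer to the just right fit, but the only way to know is to check the validation error.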

3. performance and optimal resources
-> as things become more complex, they require more data and computational resources!

out there, there is the most knowledgeable self-driving ai, but if it can only make 1000 decisions a second on a supercomputer, it is not practical

4. generalization and specialization
-> as models become better at one task, they become worse at doing multiple things

most ai’s focus on 1 thing and doing it well!

there are more tradeoffs but these are the most important in my opinion.

again, these are general assumptions and they help us understand model behavior, they are not always true

there are models in the world right now that are challenging these assumptions and defying what we thought was possible

right now, there is a heavy push towards breaking through the generalization and specialization tradeoff, and this is called artificial general intelligence!

picking the perfect golf club can be hard

picking an amazing machine learning model is hard… it takes time!

so anyways,

in the next article, i will dive into the last stage of finding the optimal model, and how to evaluate the performance of a model! i wanted to cover all of this in one article, but two will do.

i think this is the best article i’ve written so far!!!

this series made me realize i love highly personal content and i think i am going to try and take things that direction, as much as i can.

if that means making random connections like happy gilmore!
telling stories about me sports betting on korean baseball
including inside jokes with me and my friends (not buenas)

and more

i want you to enjoy reading my articles, to laugh and maybe cry sometimes because human emotion is important, emotion will (hopefully) never exist in machines and no large language model could ever make an original connection like that!!!

my human moment of the day has to go to a very special man, lebron james!!!