Whenever I can’t find a sufficiently good tutorial or explanation of a topic online, I feel compelled to offer one. I hope this helps you.

I. Understanding the Hypergeometric Distribution

The hypergeometric distribution describes the probability of events in the following scenario:

Suppose you have a jar containing 10 red marbles and 90 black marbles.
You collect 10 marbles from the jar.
What is the probability you collect k red marbles?

Collecting a single red marble seems intuitively most likely, but if you collected none or a couple, that wouldn’t be too surprising. …
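To check that intuition numerically, here is a minimal sketch of the marble scenario above. It assumes the standard counting formula for drawing without replacement; the function name `hypergeom_pmf` and its parameter names are made up for illustration.

```python
from math import comb

def hypergeom_pmf(k, reds=10, blacks=90, draws=10):
    """Probability of exactly k red marbles when drawing without replacement."""
    # Outcomes that are impossible for this jar get probability zero.
    if k < 0 or k > min(reds, draws) or draws - k > blacks:
        return 0.0
    # (ways to pick k reds) * (ways to pick the remaining marbles from the blacks),
    # divided by (ways to pick any `draws` marbles from the whole jar).
    return comb(reds, k) * comb(blacks, draws - k) / comb(reds + blacks, draws)

# Probability of each possible number of red marbles, k = 0..10.
pmf = [hypergeom_pmf(k) for k in range(11)]

# The value of k with the highest probability.
most_likely = max(range(11), key=lambda k: pmf[k])

for k, p in enumerate(pmf):
    print(f"P(k={k}) = {p:.4f}")
print("most likely k:", most_likely)
```

Running this shows how the probability mass is spread across k, and which count of red marbles is actually the most likely one for this jar.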


Logistic Regression

This article is about different ways of regularizing regressions. In the context of classification, we might use logistic regression, but these ideas apply just as well to any kind of regression or GLM.

With binary logistic regression, the goal is to find a way to separate your two classes. There are a number of ways of visualizing this.

No matter which of these you choose to think of, we can agree logistic regression defines a decision rule

h(x|theta) = sigmoid(x dot theta + b)

and seeks a theta which minimizes some objective function, usually

loss(theta) = −∑ [ y*log(h(x|theta)) + (1−y)*log(1−h(x|theta)) ]

which is obfuscated by a couple clever tricks. It is derived from the intuitive objective…
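As a rough sketch of the decision rule and objective described here, the snippet below computes h(x|theta) and the summed cross-entropy. It uses the negative log-likelihood, the sign convention under which minimizing makes sense. The toy data and the two candidate parameter settings are made up for illustration, not taken from the article.

```python
import math

def sigmoid(z):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def h(x, theta, b):
    """h(x|theta) = sigmoid(x dot theta + b)."""
    return sigmoid(sum(xi * ti for xi, ti in zip(x, theta)) + b)

def loss(theta, b, X, y):
    """Negative log-likelihood (cross-entropy), summed over all examples."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = h(xi, theta, b)
        # Only one of the two terms is nonzero for each example, since yi is 0 or 1.
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

# Hypothetical one-feature dataset: small x belongs to class 0, large x to class 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

loss_flat = loss([0.0], 0.0, X, y)      # theta = 0 predicts 0.5 for every point
loss_better = loss([2.0], -3.0, X, y)   # decision boundary at x = 1.5

print(loss_flat, loss_better)
```

The parameters that place the boundary between the two classes give a lower loss than the uninformative all-zeros setting, which is the behavior the objective is meant to reward.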


“ ‘All models are wrong, but some are useful.’

So proclaimed statistician George Box 30 years ago, and he was right. But what choice did we have? Only models, from cosmological equations to theories of human behavior, seemed to be able to consistently, if imperfectly, explain the world around us. Until now. Today companies like Google, which have grown up in an era of massively abundant data, don’t have to settle for wrong models. Indeed, they don’t have to settle for models at all.”

So proclaimed WIRED editor-in-chief Chris Anderson 7 years ago, opening the July 2008 issue of stories relating to the advent of “The Petabyte Age” with his piece entitled: “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”. …

About

Alex Lenail

conscious mammalian organism, fanatical tea snob.