Visualizing some of the intuition behind a statistical technique used to trace a line through data points

If I asked you to draw a smooth line between a bunch of points, you could probably do a pretty good job. And it’s also something journalists do all the time using computer graphics to illustrate trends in their data. But how do we get from that first example to the second — how does a computer replicate the intuition we exercise when tracing a line?

One such method is called kernel regression (or more specifically, Nadaraya-Watson kernel regression), which estimates a dependent variable y for an input x with the following equation:
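In its standard form, the Nadaraya-Watson estimate at x is a weighted average of the observed y values, where each point's weight comes from a kernel function applied to its distance from x. Here is a minimal sketch of that idea, assuming a Gaussian kernel and a bandwidth parameter `h`. Neither choice is specified above, and the function names and data points are made up for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the weighting function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kernel_regression(x, xs, ys, h=1.0):
    """Nadaraya-Watson estimate of y at x.

    Returns a weighted average of ys in which points whose x_i lies
    close to x receive larger weights. The bandwidth h controls how
    quickly the weights fall off with distance.
    """
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("all kernel weights underflowed; try a larger bandwidth h")
    return sum(w * yi for w, yi in zip(weights, ys)) / total

# Made-up data points, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

# Evaluate the smooth curve at each observed x
smoothed = [kernel_regression(x, xs, ys, h=0.8) for x in xs]
```

A smaller `h` makes the curve hug individual points more tightly, while a larger `h` flattens it toward the overall mean of the data.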


Illustration by Julia Harrison

Measure a neighborhood not by its per-capita income or violent crime rate, but by the width and health of its trees.

I was fortunate enough to grow up in Park Slope, Brooklyn, a neighborhood so conducive to child-rearing that most of the local humor revolves around expensive strollers, organic produce or other trappings of an overachieving parent. In retrospect, excursions to other parts of the borough — along Flatbush Avenue for a Saturday doubleheader, into the downtown business district for yearly math competitions, or way out to old-school Bay Ridge to see my girlfriend — probably had outsized value as recesses from the wealth and whiteness of Park Slope.

Most middle schoolers aren’t familiar with per-capita incomes, violent crime incidence, infectious…


How to obtain and analyze your history of Google searches, using myself as an example.

Google’s search engine is so thoroughly baked into our everyday existence that it feels more like the final stage in a cognitive process than it does an independent piece of software. Modern humans don’t wonder, they wonder-then-Google, with the taps of characters into your address bar as natural and legitimate a step as the original thought.

As a result, your accumulation of Google searches over a period of time acts as a reliable proxy for your state of mind, curiosities, ambitions, and fears included. Luckily (or not, depending on your definition of privacy), Google logs your searches and makes them…


Illustration by Julia Harrison

To scope a disaster, do you need an entire list of names or just a single popular one?

The most prominent monuments in Battery Park, the military fortification turned public space on the southern tip of Manhattan, belong to the East Coast Memorial, which commemorates the U.S. servicemen who died at sea in the Atlantic Ocean during World War II. Like other memorials that attempt to honor massive casualties in an unostentatious manner, it is a simple arrangement of eight stone panels engraved with names.


Every Citi Bike route is favored by a certain demographic, from young men to elderly women.

For most Citi Bike riders, the click made by a securely docked bike is a welcome indication that they are no longer responsible for that bicycle or the heavy fees that come with loss or damage. But for anyone interested in analyzing Citi Bike’s macro trends, that click means the ride is officially over and logged, and the information is being beamed to some central repository.

In total, there were nearly 14 million Citi Bike rides in 2016, a figure larger than the previous year’s 10 million but probably smaller than wherever this year’s tally will end up as the…


Illustration by Jiin Choi

Waiting for a bus is often characterized as a unique probabilistic scenario. Does it hold up when real data is examined?

In the realm of probability theory, there’s a spooky concept called memorylessness that characterizes certain scenarios. Anytime the chances associated with an outcome are unchanged as time or trials go by, your situation can be described as memoryless. In other words, whatever has happened does not affect what will — the past is always “forgotten.”

This idea is a lot easier to understand with an example, and for that we’ll call upon probability’s poster child: flipping coins. If you flip a coin a thousand times, the probability of getting heads on any one flip is 50%, irrespective of what has…
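The coin-flip version of memorylessness can be checked with a few lines of arithmetic. The sketch below assumes independent flips with a fixed probability `p` of heads, and the function names are hypothetical. It compares the chance of waiting more than `t` further flips for the first heads, given `s` tails so far, against the chance of waiting more than `t` flips from a fresh start.

```python
def prob_wait_exceeds(n, p=0.5):
    """Probability that the first n flips are all tails, i.e. the wait for heads exceeds n flips."""
    return (1 - p) ** n

def conditional_wait_exceeds(s, t, p=0.5):
    """P(wait > s + t | wait > s): chance of t more tails given s tails already seen."""
    return prob_wait_exceeds(s + t, p) / prob_wait_exceeds(s, p)

# Having already seen 5 tails does not change the odds for the next 3 flips
fresh_start = prob_wait_exceeds(3)
after_five_tails = conditional_wait_exceeds(5, 3)
```

The two values agree because the factor contributed by the first `s` flips cancels in the ratio, which is the sense in which the past is "forgotten."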


Illustration by Maria Finocchiaro

How a little bit of statistics and game theory can justify making a radical change to your appearance.

In the wee hours of January 1, 2017, I set my beard trimmer to zero millimeters and shaved off all the hair on my head. In the days that followed, when concerned friends asked why I had buzzed my blonde locks, I had a few answers ready: I’d always wanted to try it out, it’d grow back in a month or two, plus it was an appropriate way to physically mirror the freshness of a new year. I kept it to myself that I’d also been a little drunk.

It wasn’t a great look. At a time when Western civilization…


Illustration by Jake Goldwasser

How Trump’s election and policies have challenged statistics’ status as an objective science.

On November 23rd, 2015, in response to Donald Trump’s modest but increasing odds to secure the Republican Party’s nomination, FiveThirtyEight’s founder and prognosticator-in-chief Nate Silver tweeted out a reassurance:

It was one of dozens of dismissive tweets that Silver had to account for in an autopsy of his website’s erroneous predictions after Trump indeed secured the Republican nomination. When Trump pulled off a shocking upset in the general election, Silver again had to explain his misfire, even though FiveThirtyEight’s model was far more accurate than the many high-profile forecasts that all but assured a victory for Hillary Clinton.

In…


Do the usual guiding principles about domain name value apply in the new .nyc universe?

Remember when web addresses were simple? When .com was so prevalent that you could name entire financial booms and busts after it? Now it seems like there’s a new trending suffix every week. Startups favor .io, originally assigned to the British Indian Ocean Territory, since i/o is a common abbreviation in IT for input/output. Some popular and clever blogs use .city in their titles.

Technically these address bar suffixes are what are known as “top-level domains,” or TLDs, and in the Internet’s early days, only the ones you’re most familiar with were available, such as .com or .net. But in 2014, ICANN…


Illustration by Jiin Choi

Running text analysis on hundreds of homeless people’s signs can synthesize their collective message.

New Yorkers will recognize two general forms of supplication employed by the local homeless population. One strategy is to verbalize your plight, either at the corner of an intersection or on a subway car between stops, which can draw both pocket change and ire from surrounding travelers. The other is to hold up a cardboard sign upon which your message is scrawled.

Cardboard signs are advantageous for the same reasons that text usually outperforms spoken word: it’s faster, more scalable, and allows an audience to opt out. Opting out, unfortunately, is what most of us do when passing a homeless…

Walker Harrison
