A Wittgensteinian Approach to Data Science

Planetarium from 1766, Photo by Sage Ross, Creative Commons

The term model gets thrown around a lot. The word is ubiquitous to the point of losing its meaning. The Wikipedia page alone shows how varied its usage is, spanning statistics, astronomy, biology, product design, and art, as well as conceptual models.

The etymology of model is interesting as well, stemming through French and Italian back to the Latin modus, for ‘measure, rhythm, or way’.

Nevertheless, the definition of ‘conceptual model’ captures the broadest interpretation of the word, again from Wikipedia:

A conceptual model is a representation of a system, made of the composition of…

A Comprehensive Guide to Learn HTML/CSS As You Go

Fear of HTML is the Mind-Killer

Apart from modeling, data scientists spend a lot of time writing. They write to communicate insights, they blog, and they tweet. The number of topics in the domain of data science is exceedingly vast. From NLP to facial recognition, from predicting customer churn to detecting rare deep space events, the bounds of focus are without limit. However, most data science experience involves Python and data collection: rarely do data scientists deal with HTML, CSS, and JavaScript, the so-called web dev languages. Ironically, this lack of design experience limits the reach of data scientists who might otherwise display their projects in…

With Visualizations and Simulated Data

Two Apples on a Table by Paul Cézanne, Public Domain

A/B testing is a crucial data science skill. It’s often used to test the effectiveness of Website A vs. Website B or Drug A vs. Drug B, or any two variations on one idea with the same primary motivation, whether it’s sales, drug efficacy, or customer retention. It’s one of those statistical concepts prone to an extra layer of confusion, because hypothesis testing alone requires understanding the Normal distribution, z-scores, p-values and careful framing of the null hypothesis. With A/B testing, we have two samples to deal with. However, A/B testing is still just hypothesis testing at heart! …
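To make the "just hypothesis testing" point concrete, here is a minimal sketch of a pooled two-proportion z-test on simulated data. The conversion rates, sample size, seed, and the function name two_proportion_z_test are all assumptions chosen for illustration, and the full post may well use a different test.

```python
import math
import random


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both groups share one conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard Normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Simulated visitors: hypothetical true conversion rates of 10% and 12%.
rng = random.Random(0)
n = 5000
conv_a = sum(rng.random() < 0.10 for _ in range(n))
conv_b = sum(rng.random() < 0.12 for _ in range(n))

z, p_value = two_proportion_z_test(conv_a, n, conv_b, n)
print(f"A: {conv_a}/{n}, B: {conv_b}/{n}, z = {z:.2f}, p = {p_value:.4f}")
```

The null hypothesis here is that Website A and Website B convert at the same rate; a small p-value is evidence against that, not proof that B is better.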

How Chess Can Improve Your Data Science Skills

Photo by George Hodan, Public Domain

Chess and data science have a lot in common. Some seemingly surface-level parallels include imposter syndrome and a feeling of powerlessness in the face of overwhelming complexity and indecision, all on top of a time crunch.

If we look closer, though, these surface similarities belie much deeper parallels between the two fields. …

And the Importance of Dependency Management

Most serious data science projects should take place in a Docker container or a virtual environment. Whether for testing or dependency management, it’s just good practice, and containerizing gives you greater power for debugging and understanding how things work together in a unified scope. This post is about creating a virtual environment in Python 3, and while the documentation is seemingly straightforward, there are a few major points I’d like to clarify, especially when using Jupyter notebooks and managing kernels and dependencies.

Photo by George Hodan, Public Domain

What is venv?

The Python 3 module ‘venv’ comes with Python 3’s standard library. This means it’s built in…
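Because venv ships with the standard library, it can be used without installing anything, either from the command line with python3 -m venv followed by a directory, or from Python itself. The sketch below takes the second route; the helper name create_env, the environment name demo-env, and the temporary directory are assumptions for illustration only.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir, name="demo-env"):
    """Create a bare virtual environment and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip, which keeps this quick.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Build the environment in a throwaway directory and check its marker file.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(tmp)
    created = (env_dir / "pyvenv.cfg").exists()

print("pyvenv.cfg was written:", created)
```

In a real project the environment would live in the project folder rather than a temporary directory, and would usually be created with pip included.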

And Why it’s Not Always a Correction

A standard deviation seems like a simple enough concept. It’s a measure of dispersion of data, and is the square root of the summed squared differences between the mean and its data points, divided by the number of data points…minus one to correct for bias.

This is, I believe, the most oversimplified and maddening concept for any learner, and the intent of this post is to provide a clear and intuitive explanation for Bessel’s Correction, or n-1.
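As a small sketch of where the n-1 enters, the function below computes a standard deviation from squared differences and divides by either n or n-1. The data values and the names std_dev and ddof are assumptions chosen for illustration; the results are compared against Python's statistics module.

```python
import math
import statistics

# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


def std_dev(values, ddof=0):
    """Standard deviation; ddof=1 applies Bessel's correction (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_diffs / (n - ddof))


population_sd = std_dev(data, ddof=0)  # divides by n
sample_sd = std_dev(data, ddof=1)      # divides by n - 1

print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```

Dividing by the smaller n-1 always gives a slightly larger value than dividing by n, and the gap shrinks as the sample grows.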

Heliometer for Measuring Stellar Parallax, First Achieved by Friedrich Wilhelm Bessel, Public Domain

To start, recall the formula for a population mean:
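In standard notation, assuming a population of N values written x_1 through x_N (symbols chosen here for illustration), the population mean is:

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
```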

From LibreTexts

Sometimes the simplest refreshers are best, and when it comes to statistics, concepts like parameter, statistic, z-score, t-test, Student’s t-distribution, standard deviation, Chebyshev’s rule, and confidence interval can tend to merge into disorienting word salads, much like this sentence itself.

The purpose of this post is to provide a quick refresher on these basic concepts for others (myself) when I inevitably forget how exactly to interpret a confidence interval by remembering to never say there is a 95% chance a specific interval contains the true mean, but to instead say Oh god, when will I remember to just keep my…

A Beacon of Comfort to the Weary Aspiring Data Scientist

Just for starters. Stack Overflow comes later.

Getting started in data science, or any sub-field of tech in general, is enormously frustrating. The barrier to entry is often sufficient as a gatekeeper in itself, and by the barrier to entry, I mean:

  • having access to a computer with sufficient speed and power on which to effectively learn
  • non-existent conceptual awareness of things like terminal, bash, unix, IDE, ipython notebooks, kernels, dependencies, libraries, frameworks, packages, imports, scripts, editors, shells, virtual environments, Docker containers, and that only touches the surface of the arcane vocabulary of the general abstractions of…

This is part two of a series ‘Learning React as a Data Scientist’, where I document my attempt to learn React, as a data scientist. Part One is here.

A Smörgåsbord of Python and JavaScript (source)

My previous post on this subject was a relatively abstract overview of React. In that post, I said I’d look forward to reading it again weeks later to see what I had gotten right, as well as what misconceptions and biases had been lingering from my experience with Python and Jupyter notebooks.

Well, that’s what this post is about.

Most React tutorials begin with an immediate disclaimer that JavaScript is a…

Communication, Terror Management Theory, and Accompanying Thoughts on Media and Integrity

Wow, thanks for clicking. Genuinely. I’ll try to make it worth your time.

A Variety of Hooks (link)

I’m here to talk about Clickbait. More specifically, I’m here to talk about what Clickbait represents, and the potential for its long term mutation.

If you read blogs on Medium regularly, or even take a glance at its front page, it’s everywhere. If you use Facebook or Instagram, it dominates stories and links with the primal momentum of any self-replicating biological, and nihilistic, force. Really, if you use the Internet, you can’t avoid Clickbait.

The Internet is, after all, a collection of networks, of internets. Many of…

Brayton Hall

Data Science | Philosophy
