
Here’s a challenge: define artificial intelligence and list some examples of AI technology. Then ask your coworker, a friend, or a stranger to do the same.

I’ll bet your definitions and examples of AI don’t match. They may be closely related, but it’s pretty typical for people to have different takes on what does and doesn’t count as AI.

Trying to define artificial intelligence

As more AI-based and AI-laced technology is developed every year — now to the point where it’s everywhere you look — people argue over what “counts” as artificial intelligence.

The simplest definitions say that AI is engineering that enables machines to…


“How could this require an article?” you may ask.

“To delete files, you delete them, then they aren’t there anymore!”

Let’s try that — actually, let’s let someone on StackOverflow try that. We pick up with an image, tester/mytestfedora, that has a file in it to be deleted. The file, Riverbed.zip, is 25MB.

$ docker images
REPOSITORY            TAG      IMAGE ID       CREATED         VIRTUAL SIZE
tester/mytestfedora   latest   f122b12e94a3   5 seconds ago   542.5 MB

Ok, cool, let’s get in there, delete this file, then commit that container into a new image.

$ docker run -i -t tester/mytestfedora /bin/bash
[dockeres@fb5ba36692f0 /]$ cd /home/dockeres/downloads/
[dockeres@fb5ba36692f0…

Use BERT for smart string interpolation without deep learning experience

tl;dr Qordoba is open sourcing FitBERT, a library to make it easy for anyone who knows Python to use BERT (or other fancy deep learning NLP models) for string interpolation given a list of options.

If you follow NLP news, even peripherally, you have probably heard of BERT, Google’s very large masked language model that shattered benchmarks when it came out in October 2018. If you’ve used BERT, you probably used it with the fantastic HuggingFace library, Transformers. …


Or “How I Learned Just Enough About Unicode Implementations To Solve a Bug”

Photo by Fausto García on Unsplash

Last week, I was investigating a bug, and in the process learned quite a bit about Unicode. After identifying the source of the bug, I found more instances of it in the wild. I’m writing this piece to pass on what I learned. The bug shows up when you:

  • Have a Python 3 backend…
  • Are processing strings which mix normal text and emoji…
  • Are identifying spans of text based on string indexing (for example, using spaCy)…
  • And passing these indexes to a JavaScript front end
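The mismatch behind the conditions above can be sketched in a few lines. This is a minimal illustration, assuming the front end relies on JavaScript's UTF-16-based string indexing; the helper name `utf16_offset` and the sample string are hypothetical and not taken from the original article.

```python
def utf16_offset(text: str, index: int) -> int:
    """Convert a Python (code point) index into a UTF-16 code unit index.

    Python 3 indexes strings by code point, while JavaScript indexes them
    by UTF-16 code unit. Code points above U+FFFF (most emoji) occupy two
    code units, so the two index schemes drift apart after each emoji.
    """
    # "utf-16-le" emits no byte order mark; every code unit is 2 bytes.
    return len(text[:index].encode("utf-16-le")) // 2


# Hypothetical sample: plain text mixed with one emoji.
text = "I 😀 NLP"

# Index of "NLP" as Python sees it (code points).
start = text.index("NLP")

# Index a JavaScript front end would need for the same position.
js_start = utf16_offset(text, start)

print(start, js_start)
```

Here the two offsets differ by one because the single emoji before "NLP" takes two UTF-16 code units. Converting on the backend before sending indexes is one option; converting on the front end would work just as well.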

To make this tangible, here is an example of this bug I found in the…


[NOTE, added 2020–01–02: This article is written in a jocular tone. It jokingly makes fun of PHP, because of a config change that Valet/Laravel used to make to your DNS. This made some PHP defenders crawl out of the woodwork to be mean to me. I considered taking this article down, because I don’t like conflict, but so many people have found it useful that taking it down doesn’t seem fair.

So instead, a plea: please, PHP fanboys, if PHP jokes hurt your feelings too much, don’t read this article! If you do, keep in mind that I am making a joke. …


Interesting things are happening very quickly in the field of Natural Language Processing. To help me process them, and to try to be of use to the community, I will try to summarize them here. The target audience for this post is machine learning researchers and practitioners with some familiarity with NLP. Most of the work I reference is from 2017–18, so I think “recent” is a fair characterization.

This article will focus on two topics: fine-tunable language models and question answering systems. There is much more going on in NLP, but these are the fields that I am able…


Though my background is in math and physics, I hadn’t used statistics in earnest for at least a decade when I started putting effort into understanding machine learning. One basic statistical concept that I did not have prior experience with was the bias vs variance tradeoff. It was referenced many times before I decided I really needed to not just nod and pretend to get the reference. I am writing this piece in an attempt to further and share my understanding of this topic.
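For orientation, the tradeoff is usually stated through the standard decomposition of expected squared error. The notation below is an assumption of this sketch rather than something taken from the article: data are generated as $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat{f}$ is a model fit on a random training sample, and the noise at the test point is independent of that sample.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}\left[\hat{f}(x)\right] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}\left[\hat{f}(x)\right]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A more flexible model tends to shrink the first term and grow the second, which is why the two are described as a tradeoff.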

In “Homo Heuristicus: Why Biased Minds Make Better Inferences,” the authors discuss heuristics — efficient cognitive…


Changing the dominant computing metaphor

N.B. I wrote this in early 2016. I got writer’s block and couldn’t finish it. I keep wanting to reference it, though, so I am just going to ship it! I hope you get something out of it despite how unrefined it is.

We should have stopped here.

In the 90s, Neal Stephenson wrote an essay [updated version] called “In the Beginning… was the Command Line” which was a commentary on why “the proprietary operating systems business is unlikely to remain profitable in the future because of competition from free software” [wikipedia]. Clearly, that came to pass. However, it also gives a good history of…


Evans’ Procedure for predicting the value of new technology, and its applications to CarLabs

This essay is an attempt to turn Benedict Evans’ recent article, Not even wrong — ways to dismiss technology, into a procedure. It attempts to answer the question posed at the end of the following quote:

“It is unquestionably true that many of the most important technology advances looked like toys at first — the web, mobile phones, PCs, aircraft, cars and even hot and cold running water at one stage looked like faddish toys for the rich or the young... But it’s also unquestionably true that there were always lots of things that looked like toys and never did…


For some reason (🤔), a large portion of my friends are suddenly interested in basic security, such as encryption. This has forced me to reevaluate my own web security hygiene, because I love teaching and want to give my friends the best possible advice.

I will be the first to admit, the sum total of what I know is not even step 1, hence the title. However, I don’t know anyone who isn’t a sysadmin who even follows these basic tips and principles, so here they are. I hope you find them useful.

Assumptions

You want to make it harder for…

Sam Havens

Natural Language Processing, ex- math and physics. Director of Data Science at Writer.
