Programming Taxonomy

Stef Nestor
5 min read · Apr 12, 2022

Categorizing common code languages with their history/future.

Categories

Every couple of hours, someone googles or asks Quora “is HTML a Programming Language?”. Their confusion comes down to jargon: HTML is a coding language, but it is a markup language, not a programming language. Coding languages fall into five general classifications: programming, markup, query, syntax, and serialization. However, the parent category is frequently described loosely as just “programming”, since programming languages make up the vast majority by volume.

[Image: code classification table]

(Note: Most people no longer suffix “serialization” with “language”, letting it stand on its own instead. Pre-processors can sit on top of specific languages, but are not a separate category. I’ve highlighted in blue the languages I believe folks are unable to avoid while working across the tech industry.)

An example of all categories building into a functional site:

  • Medium accepts Markdown pasting, automatically processing it into HTML. Medium restricts users’ CSS options to guarantee readability and appearance.
  • Medium user login (most likely) works by using SQL to look up the stored password hash for a username and comparing it to the hash of the password just entered. To prevent SQL injection attacks, Medium will sanitize the username before using it (probably not with Regex, though it can be done that way).
  • Once a user is successfully authenticated, they can view posts (JSON format returned via GraphQL lookup). When a user clicks to edit their post, the back end Node.js (a JavaScript runtime) will query the database and forward the currently saved post to the user’s browser.
  • All of these actions on Medium will generate logs that Medium Developers can investigate for site debugging/optimization. Usually logs will be parsed with Grok to make investigating more intuitive.
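
The login step above can be sketched in Python. This is only an illustration, not Medium’s actual implementation: the `users` table, the `check_login` and `hash_password` names, the username rule, and the demo user are all assumptions made up for the example. It uses the standard-library `sqlite3`, `hashlib`, `hmac`, and `re` modules, and it relies on a parameterized query (the `?` placeholder) for injection safety, with Regex only as an extra validation step.

```python
import hashlib
import hmac
import re
import sqlite3

# Hypothetical rule: usernames are 3-30 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,30}")


def hash_password(password: str, salt: bytes) -> str:
    """Derive a hex digest from the password with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000).hex()


def check_login(conn: sqlite3.Connection, username: str, password: str) -> bool:
    # Regex validation rejects obviously malformed usernames early.
    if not USERNAME_PATTERN.fullmatch(username):
        return False
    # The "?" placeholder makes the driver treat the username as data,
    # which is what actually prevents SQL injection.
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    # Compare the stored hash to the hash of the password just entered.
    return hmac.compare_digest(hash_password(password, salt), stored_hash)


# Demo with an in-memory database and a made-up user.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, password_hash TEXT)"
)
salt = b"demo-salt"  # a real system would use a random salt per user
conn.execute(
    "INSERT INTO users VALUES (?, ?, ?)",
    ("stef", salt, hash_password("hunter2", salt)),
)

print(check_login(conn, "stef", "hunter2"))              # True
print(check_login(conn, "stef' OR '1'='1", "anything"))  # False
```

The injection-style username fails the Regex check before it ever reaches the database, and even without that check the placeholder would keep it from being interpreted as SQL.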

In my experience: Syntax, serialization, and markup languages are each generally led by one or two languages and serve specific use cases (e.g. HTML is for PDF printing or website content). Query languages start to build some complexity based on the type of data and how complex the querying is, but generally stay close in format to SQL/GraphQL (or have plugins to those two). Programming languages, having the most widespread use cases, have the most variance in formatting, learning ease, speed, and organization.

History

Compared to other human inventions, coding is very young. What started with Fortran in 1954 has already exploded into a myriad of languages. Its first 20 years saw more than 15 new languages, and new programming languages appeared even more frequently over the next 40 years. Most languages created after 1970 are still alive today (see here for original).

Future

Starting in 2000 and taking off in 2010, the world has turned from an “Internet of Things” economy into a Data economy. This increased the value of languages previously used only for (data) science, such as R and Matlab. Python and Julia are built to support easy plug-and-play of science libraries. Mobile has historically run on Java, C++ (or other “C” languages), and Swift, but this appears to be partially shifting toward JavaScript, Kotlin, and Dart.

This doesn’t mean everyone needs to learn every language. Frequently, one’s coding language will be determined by their company or industry. Therefore, it helps to think of languages as variant ways of accomplishing the same pseudocode logic. (There are some pretty cool websites to this end, which document how to do the same task in multiple languages, e.g. web scraping, hello world, fibonacci numbers, and concatenating strings.) Thankfully, coding jargon translates across languages, making picking up a second and third language much easier than the first (per category).
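
As a small illustration of pseudocode logic mapping onto one concrete language, here is the fibonacci task in Python. The pseudocode in the comments is my own wording, and the `fibonacci` function name is just a choice for this sketch; the same steps would carry over to most other programming languages with different syntax.

```python
def fibonacci(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers, starting from 0."""
    # Pseudocode: start with the pair (0, 1)
    numbers = []
    current, following = 0, 1
    # Pseudocode: repeat `count` times
    for _ in range(count):
        # Pseudocode: record the current number
        numbers.append(current)
        # Pseudocode: shift the pair forward by one step
        current, following = following, current + following
    return numbers


print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```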

I infrequently check coding language trends (the most trusted industry sources: RedMonk, StackOverflow, TIOBE, and PYPL). I make sure I know the basics (and I mean basics) of the top five languages, regardless of categorization. (Basically anything on the W3Schools top bar.) GitHub publishes top languages by PR (which shows the fall of Ruby and rise of TypeScript over the last 10 years). (See live by quarter.)

StackOverflow publishes top languages by questions. (Note: This latter data may reflect bandwagon effect.)

They also publish top paying languages. (Note: Pay may be high because supply has dropped off while legacy demand remains; this doesn’t guarantee future value.)

Application

Big-project languages should be chosen via a full cost-benefit comparison. I’m fairly sure that they, and definitely personal projects, are instead chosen for familiarity, community buy-in, and a lack of glaring downsides, and that’s usually okay.

Some random golden nuggets I wish a Senior Dev had told me earlier in my career:

  • JavaScript libraries have a ton of bloat and can bog down your computer.
  • You can learn SQL to the level everyone knows within 3 days. You should still have a Database Administrator optimize Production queries.
  • Data Science is mostly done in Python and R. Julia made a go of it, but I’m not sure it’s catching on. R is built for statistics rather than general-purpose programming, so Production code is usually Python (or R plugged into a general-purpose language).
  • Bash is available across Operating Systems (natively on Mac and Linux, and on Windows via tools such as WSL or Git Bash), so Technical Support teams frequently use it to send code snippets to run on customer computers.
  • Nobody knows Regex by heart and everybody uses online testers and generators.
  • JSON has replaced XML for many use cases, but YAML doesn’t have structure sufficient to replace JSON. Most webpages communicate via a REST API or GraphQL, returning JSON. Many products allow settings to be in YAML or JSON format.
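
To make the serialization nugget concrete, here is a minimal sketch of a settings object going to JSON text and back using Python’s standard-library `json` module. The setting names and values are made up for the example. YAML is left out because Python has no YAML parser in its standard library.

```python
import json

# Hypothetical product settings, held as an ordinary Python dictionary.
settings = {
    "theme": "dark",
    "autosave": True,
    "recent_files": ["notes.md", "todo.md"],
}

# Serialize: turn the in-memory structure into JSON text that can be
# saved to a file or sent over a REST API or GraphQL response.
text = json.dumps(settings, indent=2)
print(text)

# Deserialize: parse the JSON text back into Python objects.
restored = json.loads(text)
print(restored == settings)  # True
```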

Happy coding!

Stef Nestor

Data Security & Architecture, Theoretical & Geo Physics, Bayesian, hiking, hammocks, birdies, dino jokes.