Book 5: Etcetera

From the Diaries of John Henry


I think the essays of Book 5 did a good job of keeping a balance between Automunge and other interests, with treatments spanning machine learning, physics, music, quantum computing, and even a little space sprinkled in.

The book starts off with an introduction to a prominent foundation model for natural language processing, targeting a mainstream audience, and then, just in case any of that mainstream audience decided to come back, quickly follows with a dense academic survey of the Automunge library for numeric transforms to scare them off again.

The October triangle countered three points of interest: physics, politics, and Automunge. I owe a thank you to Wolfram Physics for permission to use some of their illustrations, and their inclusion of the essay in their list of recommended reading was a nice gesture. I try to keep my political commentary sparing, but the pending election was of sufficient importance that I decided to weigh in. The Automunge Inplace essay was a nice balance between current events and software; I think it captured aspects of “building a company out of the garage” that I am still going through to this day.

November was a learning experience on how to get a paper accepted to a conference. I had submitted two formal write-ups to the ICLR conference, and I think my interactions with reviewers went a little differently for each, although both ended with the same result. I owe a thank you to the reviewers for helping me strengthen my work; their feedback was a wake-up call.

December was a big machine learning month. Reading Comprehension with GPT-3 was inspired by a few workshops at NeurIPS 2020. Deep Regularization was just me offering some hypotheses for elements of deep learning theory — if any researcher might be interested in corresponding experiments I would be happy to offer further input.

The January essay QML turned back to quantum computing. I owe a thank you to the TensorFlow Quantum team for permission to use some of their illustrations. I have generally tried to balance my explorations between machine learning and quantum computing. I find it helps not to get too bogged down in a narrow window of focus, which is probably why I have never gone for a PhD. You are of more value to a field when you get exposure to best practices and horizons of different domains and disciplines. If you only know everything there is to know about a widget, you’ll likely only ever improve that widget marginally.

In February I started with Selected Works, sort of a best-of list. I don’t know if this is necessarily the best-of for all audiences; it was probably best for a particular audience. The essay Open Source Blogging with Automunge turned out to be the first paper I had accepted to a mainstream machine learning research conference, although not in the formal proceedings, just a workshop on rethinking ML papers. I don’t think this is anywhere near my best work; I suspect the primary value of this paper was as a list of selected works from the citations. Missing Data Infill was my first attempt to experimentally validate performance benefits associated with aspects of the Automunge library, and went on to serve as a basis for a later, more formal writeup.

March was a little bit of an intermission. The first two essays were basically book reviews, on topics including space and quantum computing, and then an Automunge essay rounded out the month.

April was sort of a typical month. Lots of software updates, and a little bit of Automunge musings. I think the reinforcement learning writeup State of the Art of Reinforcement Learning turned out well as a survey of current research in the domain.

May’s paper Missing Data Infill with Automunge was, I think, my best formal writeup yet, and covered a key component of the library in some depth.

June’s essay The Pitch was basically a pitch. I tried to narrow the messaging for the Automunge value proposition down to the simplest elements, so if you ever want to share some snippet of Automunge material with someone who you think might benefit from the library, this is intended as a resource for exactly that.

In July I tried a few different forms of writing for variety. I have become comfortable with the aphorism, perhaps partly from so much time on Twitter, and Aphorisms or Else was a nice creative outlet. The essay Data Structure was a repackaging of some existing functionality in the library, presented in the form of code comments, which in my experience have a grammatical structure all their own. Yeah, and then just some musings inspired by an IEEE standard.

The concluding chapters were somewhat important to Automunge. A Library of Contributions offered some novel extensions to tabular learning. Custom Transformations with Automunge demonstrated the simplicity of integrating custom defined transformations into the Automunge library. On the Radio was for me.

Book 1: Explorations

August 2016 — July 2017

Book 2: Essays

August 2017 — August 2018

Book 3: Entrepreneurship

August 2018 — June 2019

Book 4: Everything

September 2019 — July 2020

Book 5: Etcetera

August 2020 — August 2021

Book 6: Endurance

September 2021 — July 2022


