Book 4: Everything
From the Diaries of John Henry
Book 4 was the year I started taking incremental software refinements from weekly rollouts to a much more rapid pace, and I think the opening essay’s disorganized structure was a nice counter to the close of Book 3. The embedded tweets were the start of a practice of sharing rollout notes on Twitter that has continued to this day.
The Legend of Bagger Vance reviewed a movie that was special to me.
An intro to Automunge was the first piece to attempt a full walkthrough of various library parameters. It is a little dated relative to the current implementation, but is still a helpful resource for getting acquainted with a few of the options in the library.
October was a great month for me. Write-ups included discussions inspired by a machine learning conference, a Kaggle competition, a New Orleans trip, and closed with a neat discussion around current events in quantum computing.
November was productive. Munging Supremacy, besides introducing a new format for feature transformation demonstrations, also included my first piano recording. A few of these pieces kind of deviated from the essay form, for example Pitch Deck was exactly that. It was during this month that I learned that I would be able to attend the NeurIPS machine learning research conference, my first research conference, and Machine Learning and Climate Change was an effort to collect some thoughts in preparation for the Climate Change and AI workshop.
December was the month that I started to have vision problems. I’m just sharing that for context; thankfully I now have a better grip on it and it is no longer a major difficulty. The experience at NeurIPS was quite memorable. It is hard to describe how intellectually stimulating it was to be surrounded by the sharpest minds advancing the state of the art in machine learning and artificial intelligence, and it started a practice I’ve maintained since of periodically reviewing papers and lectures from machine learning research conferences. Even though most of this material is adjacent to the scope of Automunge, it still helps to learn formal writing practices and research techniques, and I generally keep a log of reviewed papers and takeaways for future reference or citation.
January was a resumption of coding focus, although I did take a week off for research surrounding the essay Probabilistic Programming Possibilities.
February turned out to be a bit of a transition point. The Automunge Explained videos were my first attempt at the video marketing channel. The reception has not been significant, but the shorter version is still a prominent feature of the Automunge home page. And of course this is when the world began to change.
In March my main blogging contribution was a formal writeup surrounding the Automunge family tree primitives, which I think may still be a helpful resource for those trying to understand our conventions for specifying transformation sets. This was also the month that I started recording etudes.
April was an important month. With the lockdowns in place, these two publications were some of my more sophisticated, one from a creative perspective and one from a formal technical perspective. The Perceptions of the Blind was my first formal album of recordings, featuring music composed by Philip Glass, and in addition to these videos the same recordings are also now available on various music streaming platforms. The Automunge paper was my first formal writeup for the library in the conventions of academic publishing, although the content probably had more similarity to a marketing overview than any kind of research report. In academic papers since, I’ve tried to include some form of experiment or validation to better align with research conventions; I think that is what was missing from this paper to make it publishable.
The ICLR conference in May was the first major machine learning research conference to go virtual, and every one that I have attended since has largely followed the format of this event. It was quite different from the experience at NeurIPS; the number of interactions and networking opportunities was greatly diminished, for instance. But in parallel, the pace of paper reviews, poster sessions, and lectures was accelerated owing to the convenience of point and click navigation. So there were tradeoffs. I wouldn’t want every conference to be virtual, but I think there is a place for this format going forward, even independent of social distancing concerns. Oh yeah, and in this month I wrote another academic paper, this one surrounding a new form of automated categoric encodings available through Automunge.
In June, parallel to the published papers, I also continued further refining the categoric encoding paper. The academic papers that I have submitted to conferences are unique in these books in that the iterations and refinements continued after their initial publishing dates, although I have followed the convention of logging edits on Twitter. This practice was motivated by trying to reach the highest standards of content for research. What Democracy Looks Like was pure essay, and Automunge Influence was pure Automunge.
The final month of July ranged from content targeting those new to data science, in Getting Started with Data Science, to content for sophisticated researchers, in Ensembles and Ensembles of Ensembles. This is in line with the Automunge goal of a common resource suitable for novices and advanced practitioners alike, which we’ve tried to balance by offering a push button solution under automation integrated into a platform for custom engineered data pipelines. The closing essay I think was a nice counter to the opening of the book.
August 2016 — July 2017
August 2017 — August 2018
August 2018 — June 2019
Book 4: Everything
September 2019 — July 2020
August 2020 — August 2021
September 2021 — August 2022