A physicist’s holiday to the tech world.
Documenting my internship experience at the FT.
Most probably thanks to my love for anything with an LED screen as a kid, I have always been enthusiastic about computers and tech. Naturally, as I approach the part of life where I need to start thinking hard about career paths, that enthusiasm has turned into an interest in what it would be like to pursue a career in the field. The modern physics degree offers students plenty of opportunity to get some coding experience. However, it often does not stray further than number crunching and plotting some graphs. I sought an opportunity to get some hands-on experience with the kind of programming that a software engineering job entails. This way, I could find out if aspiring to a career in tech because of an interest in consumer gadgetry was a naïve stretch. Fortunately, I was able to get just that by spending a month of my summer break in the Edge Delivery & Observability team at the FT. I came into this intern role knowing some Python, a few computing-related buzzwords and whatever any ten-minute read could give me on the function of DNS. As a result, I had very little idea of what to expect.
On my first day, I was given a printed GitHub thread which laid out the basis of my project. It gave me both the task and the motivations for doing it. Most importantly, it showed me that there were people who wanted the end product of the project. My work was actually going to help people in the FT! My task was to build on an existing DNS pull request (PR) approval bot, which had been developed by the Edge Delivery & Observability team some months prior. Specifically, I needed to program a new rule for the bot to police: scanning PRs and asking users to drop time to live (TTL) values before adding or changing a record in the DNS. A record in the DNS simply refers to one address and its corresponding name, akin to a word in a dictionary and its stored definition.
The contributors in this GitHub thread complained that, in many instances, people would make mistakes when adding or modifying records in the DNS while leaving a high TTL value. The TTL value defines the length of time a DNS resolver caches a record. Any DNS changes made within this time frame cannot take effect in that cache until the TTL has expired. So, let’s say we change a record that has a TTL of one day. Once a DNS resolver cache picks up the new version of the record, it can hold that version for a full day before it checks again. If the change made to the record does not produce the desired effect, then one could be stuck with a faulty record for the duration of the TTL. In reality, there are many DNS resolver caches operating on different clocks, all updating at different times, so only some caches may be stuck with the faulty record for the full TTL duration. But in development, low TTLs are desirable because, if anything goes wrong, every cached copy will expire after a short amount of time.
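To make the TTL behaviour concrete, here is a minimal Python sketch of a toy resolver cache. It is purely illustrative: the class names, the record name and the IP addresses are all made up for the example, time is passed in by hand as a number of seconds, and real resolvers are far more involved than this.

```python
from dataclasses import dataclass


@dataclass
class CachedRecord:
    value: str
    expires_at: float  # time (in seconds) after which the cache must look again


class ToyResolverCache:
    """A toy caching resolver (hypothetical, not how a real resolver works)."""

    def __init__(self, zone):
        # `zone` maps a name to (value, ttl_seconds); it stands in for the DNS itself.
        self.zone = zone
        self.cache = {}

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached is not None and now < cached.expires_at:
            # Still within the TTL: answer from the cache, ignoring any newer change.
            return cached.value
        value, ttl = self.zone[name]
        self.cache[name] = CachedRecord(value, now + ttl)
        return value


# A record with a one-day TTL (86400 seconds).
zone = {"example.ft.test": ("203.0.113.10", 86400)}
resolver = ToyResolverCache(zone)

first = resolver.resolve("example.ft.test", now=0)

# The record is changed an instant later...
zone["example.ft.test"] = ("203.0.113.20", 300)

# ...but an hour on, this cache still serves the old value.
during_ttl = resolver.resolve("example.ft.test", now=3600)

# Only once the original one-day TTL has run out does the change show up.
after_ttl = resolver.resolve("example.ft.test", now=86400)
```

Had the original record been given a TTL of a few minutes instead, the old value would have disappeared from this cache within minutes of the change.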
The script I programmed functioned as part of the existing DNS approval bot. The methods for scoping PRs, posting comments and approving requests were all very conveniently laid out for me upon arrival. This meant that my task was in fact more about logic than Python skill. I had the building blocks; it was just a case of finding out where I should put each one to end up with the finished product.
When I first began to look through the scripts, I realised that I was going to be working with a very different style of programming than I was used to. There were all these terms being thrown at me: serverless, CI pipeline, source control, the list goes on. The first four days were spent reading around the brief, reading the scripts and googling anything I didn’t know. I also spent one morning learning the fundamentals of source control with a bit of git, after I had waltzed into the FT with a GitHub account but no real knowledge of what on earth to use it for. I opted to take handwritten notes on anything I thought was important, and to map out the logic of the problem at hand. Apparently, using pen and pad to work on programming problems is not the most common approach these days (perhaps that’s the physicist in me coming out a tad). However, exhausting the logic of the problem and getting it all down was a massive help when it came to finally writing code. I knew exactly which bits of logic I had to write myself, and which bits were simply a bit of copy and paste from other sections of the program. While the programming style was unfamiliar and a bit daunting at first, it actually made my job ten times easier, as the existing code was very easy to build upon.
The DNS approver bot works by executing actions, which are chosen based on rules. The program executes each modular rule script, which determines whether the PR meets any actionable criteria. For example, the TTL Approval rule scans to see if the only change made to a record is its TTL value. It also scans to see if the value chosen is unsuitable. The action is to approve the PR if there are no issues, or leave a comment if there are. The first image shows my handwritten notes, identifying which rules triggered which actions. There is also a breakdown of the simplified example code which illustrates how the bot’s logic functions. The second image shows my list of all events that could come up. With these events written out, I could write some pseudocode. When it was time to finally start writing some real code, I upgraded from the notepad to a new GitHub repository. This became the dumping ground for all the new things I learned, and any developing code blocks.
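As a rough illustration of the rule-then-action idea, here is a short Python sketch of what a check like the TTL Approval rule might look like. This is not the bot’s real code: the function name, the dictionary layout of a record and the bounds for a "suitable" TTL are all assumptions I have made up for the example.

```python
from typing import Optional

# Hypothetical bounds for a "suitable" TTL, in seconds (assumed, not the bot's values).
MIN_TTL_SECONDS = 60
MAX_TTL_SECONDS = 86400


def ttl_approval_rule(old: dict, new: dict) -> Optional[str]:
    """Pick an action for a changed record, or None if this rule does not apply."""
    # Work out which fields differ between the old and new versions of the record.
    changed = {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}
    if changed != {"ttl"}:
        return None  # something other than the TTL changed: leave it to another rule
    ttl = new.get("ttl")
    if ttl is not None and MIN_TTL_SECONDS <= ttl <= MAX_TTL_SECONDS:
        return "approve"
    return "comment"


old_record = {"name": "www.example.test", "type": "A", "value": "203.0.113.10", "ttl": 86400}

only_ttl_changed = ttl_approval_rule(old_record, {**old_record, "ttl": 300})
unsuitable_ttl = ttl_approval_rule(old_record, {**old_record, "ttl": 5})
value_changed = ttl_approval_rule(old_record, {**old_record, "value": "203.0.113.20"})
```

Returning the name of an action, rather than performing it, keeps the rule itself easy to test without touching GitHub at all.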
My initial approach was to merge two of the existing rule scripts and build my rule on top. The Auto-Approval rule had the logic to cycle through added records of interest and give them the thumbs up. The TTL-Approval rule had the logic to do a TTL check, so to get to my end goal all I needed was a couple of “if the TTL check returns more than the desired threshold, then post a comment” lines in the correct places. Once I got to testing what I had put together, it became apparent that I might be fighting an uphill battle. The pre-written tests were not expecting to have all these decisions coming from one script. So instead, after receiving some key advice, I created a new rule and wrote new tests specifically for the new logic. Getting the new method coded, then in and out of testing, went significantly faster than expected. Small details in the existing program made my life a whole lot easier. The “handled” attribute marked whether one of the rule scripts had taken a look at a record change (or addition) and decided what to do with it. The effect of this is that multiple actions do not occur on a PR if it gets flagged by the rule scripts in more than one area. As a result, as long as I chose the correct order to execute the rules in, using “handled” meant that my rule and the existing ones never got in each other’s way.
One of the hardest parts of the project was deciding on how to word the comment. We decided that the comment needed to be informative but also to the point. However, it proved difficult to explain concisely when a high or a low TTL was appropriate. I had to rely heavily on the wisdom of my teammates. Initially, the plan was that my script would prevent FT devs from adding new records with a TTL above 10 minutes. The idea was that devs would add a record, then wait the (short) duration of the TTL to make sure nothing was going wrong, then bump the TTL back up to a suitable level. However, I was advised that this could be counterproductive. Some users may add a record but forget to bump the TTL back up, leaving the DNS littered with very low TTLs, which are more expensive to maintain. Instead, it was decided that the addition of records with high TTLs would be permitted, but the comment would try to encourage good practice. Below is an image of my new comment feature in use for changing a record.
Working in a tech team has been a fabulous experience. There’s a real sense of community spirit all over the third floor, which has made my time working at the FT all the more enjoyable. Daily huddles to see what everyone is getting up to, fortnightly socials to have a bit of time away from work on a Friday afternoon (while keeping the conversation topic strictly work related, of course), and more department-wide discussions such as “It’s good to talk” gave me a fantastic opportunity to get to know many faces around the office. It made the whole induction experience considerably less daunting. Everything done in the Edge Delivery & Observability team feels like a real team effort, whereas I had expected it to be more of a collection of individuals all working on their own stuff. It was nice to know that I could lean on anyone in the team if I needed help with my project, or even with anything outside of programming.
Overall, the internship has offered me much more than I anticipated. I thought that I would leave my month at the FT with some slightly more brushed-up Python skills and perhaps a better understanding of DNS. Now that I have been fully introduced to the world of software engineering, I am much more confident that I would find enjoyment in a tech career.