Four Tips for Documenting a Legacy Codebase

Over the past few weeks, I’ve had the privilege of working with Microsoft’s Visual FoxPro. My task was to take a fairly large codebase and document its behavior so my team could rewrite the software in a modern language.

FoxPro is one of the most archaic languages I have ever worked with, and decoding the complexities of this particular codebase was quite the challenge.

Through this experience, I feel like I’ve developed a fairly good strategy for documenting the business logic and data flows in a legacy system. Here are a few tips that I found helpful when trying to tackle a legacy codebase in an unfamiliar language.

1. Find a Good Syntax Highlighter

Many legacy systems are written in languages that have good syntax highlighting, but some languages (like FoxPro) don’t have wide support in modern text editors. In that case, a syntax highlighter for a similar language will often be good enough.

In my case, I was using Atom to explore this FoxPro codebase. There wasn’t a FoxPro syntax highlighter built into Atom, and I couldn’t find any packages that would help me out. However, FoxPro resembles SQL since it has `SELECT` statements and `WHERE` clauses. With Atom’s built-in SQL syntax highlighting, the FoxPro commands and other key elements, such as strings and numbers, were properly highlighted, which made the codebase much easier to read.

2. Skim the Code for the Big Picture

For me, it’s tough to go line by line through a large, complicated code block written in an unfamiliar language. On my first pass, I usually skim the code before diving into the details.

While skimming, I try to look for keywords and patterns. For instance, I’ll look for words like “products” and “stores” and for commands such as “INSERT.” If I find these words in close proximity to each other, I can assume that this code might be dealing with inventory of a particular storefront or something like that. It might be a faulty assumption, but it gives me just a little bit of context before I dive into the details. I’m much less likely to get lost in the code when I have a general idea of what it should do.
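If the codebase is too large to skim by eye, a small script can do a rough version of the same thing. The following is only a sketch of that idea, not the process I actually used: the function name `skim_for_keywords`, the fixed-size chunking, and the FoxPro-style sample are all made up for illustration. It reports chunks of lines where at least two different keywords show up near each other.

```python
import re

# Hypothetical FoxPro-style snippet, for illustration only.
SAMPLE = """\
* hypothetical FoxPro-style snippet, for illustration only
SELECT stores
SCAN
    INSERT INTO products (store_id, qty) VALUES (stores.id, 0)
ENDSCAN
"""


def skim_for_keywords(source, keywords, chunk_size=10):
    """Return (first_line_number, keywords_found) for every chunk of
    `chunk_size` lines that contains at least two distinct keywords.

    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    lines = source.splitlines()
    report = []
    for start in range(0, len(lines), chunk_size):
        chunk = lines[start:start + chunk_size]
        found = {
            match.group(1).lower()
            for line in chunk
            for match in pattern.finditer(line)
        }
        # Two or more different keywords close together are worth a look.
        if len(found) >= 2:
            report.append((start + 1, sorted(found)))
    return report


if __name__ == "__main__":
    for line_number, found in skim_for_keywords(
        SAMPLE, ["products", "stores", "INSERT"]
    ):
        print(f"around line {line_number}: {', '.join(found)}")
```

The output is only a list of places to start reading. Just like skimming by hand, a cluster of keywords is a guess about what the code does, not a conclusion.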

3. Research Unfamiliar Commands Early

When documenting the behavior of a legacy system, I don’t want to go through a tutorial for a language that I hope I never have to write a line of code in. I want to know just enough to be effective and understand what I’m reading.

The downside is that I’ll come across unfamiliar commands while reading the code and, if I guess at them, make wrong assumptions about what the software does. Trying to figure out what a command does from its context alone ends up wasting a non-trivial amount of time.

I’ve learned that consulting documentation whenever I come across a command I don’t know speeds up the process of reading and understanding legacy code. By only looking up stuff I run across in the codebase, I’m only filling in the knowledge gaps I need at the moment. This makes learning an unfamiliar language much more manageable.
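One way to decide what to look up first is to count which commands appear most often. Here is a minimal sketch of that, under some loud assumptions: the function name `unfamiliar_commands` is hypothetical, the first word on a line is treated as the command, and lines starting with `*` are treated as comments. That heuristic will miss plenty of real-world cases (inline comments, continuation lines, assignments), so treat the result as a rough reading list.

```python
from collections import Counter

# Hypothetical FoxPro-style snippet, for illustration only.
SAMPLE = """\
* hypothetical snippet
SELECT stores
SCAN
    REPLACE qty WITH 0
    REPLACE price WITH 1
ENDSCAN
"""


def unfamiliar_commands(source, known):
    """Count the leading word of each line, skipping words already known.

    Assumption: the first word on a line is the command, and a line
    starting with `*` is a comment. Returns (command, count) pairs,
    most frequent first.
    """
    known_upper = {word.upper() for word in known}
    counts = Counter()
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("*"):
            continue  # blank line or whole-line comment
        first_word = stripped.split()[0].upper()
        # Ignore leading tokens that are not plain words (e.g. "x=1").
        if first_word.isalpha() and first_word not in known_upper:
            counts[first_word] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Pretend SELECT is the only command I already understand.
    for command, count in unfamiliar_commands(SAMPLE, known=["select"]):
        print(f"{command}: {count}")
```

Looking up the commands at the top of the list first means each trip to the documentation pays off across the most lines of code.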

4. Don’t Go Alone!

It takes a lot of energy to document a legacy system. In my opinion, it’s too much work for one person. I would recommend finding someone willing to sit down and help you get through the worst of it. Luckily, one of my fellow Atoms, Phil, agreed to help me document this FoxPro codebase. Thanks, Phil!

With a pair, you can split up tasks between the two of you. In my case, I was able to focus on navigating the code and think out loud while Phil took notes and made sure my assumptions weren’t too crazy. By pairing together on this task, we could document a few thousand lines of code in a reasonable amount of time. As an added bonus, having someone to talk to through the whole ordeal kept my morale up.

Final Thoughts

Documenting legacy codebases is a very draining task. There tend to be plenty of unknowns when it comes to looking at old systems. However, with enough experience in a tech stack (even an archaic one like FoxPro), the work becomes increasingly productive.

I hope these tips can help you document legacy systems more effectively.


Originally published at spin.atomicobject.com on December 15, 2016.
