Why is there a lack of tooling that helps understand large codebases?

Johan Cutych
3 min read · Feb 26, 2020

How does the codebase work?

There is a lack of good, in-depth tooling that helps architects, team leads, and developers understand big codebases. On one side we have simple tools for test coverage, and on the other, complex and expensive products like Code Climate that focus on velocity.

But there is a problem left unsolved in the middle. As an engineer working in a bigger team, it’s hard to get an answer about how a given part of the codebase works, or why, how, and where a particular piece of the system is used.

Keep in mind that you can use static analysis and “find all” to find specific usages of a function, but what I am interested in is connecting concepts and modules together and showing their relationships in an easy-to-understand way.

It’s a new decade and I still don’t have a way to have:

  • A clickable diagram explaining all subsystems of my app and their dependencies to one another
  • A simple way to see the high-level flow for a particular feature, for example a code diagram of user registration
  • A simple way to see which parts of the code are undertested or overtested. Code coverage on its own is a poor metric: it tells you nothing about how popular or heavily used the tested code is
  • A way to understand which side effects I will have to think about before starting work on a feature, so I can estimate the scope and time needed. (Oh shit, user registrations need to be exported to Intercom…)
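As a rough illustration of the first wish, here is a minimal sketch of how a subsystem dependency diagram could be derived automatically. It assumes a Python project in which each top-level module counts as a "subsystem"; the module names and sources in `SOURCES` are hypothetical, and real projects would need to handle packages, relative imports, and dynamic imports. The output is Graphviz DOT text, which can be rendered into a diagram.

```python
import ast
from collections import defaultdict


def import_graph(sources):
    """Map each module name to the set of in-project modules it imports.

    `sources` is a dict of module name -> source code. Imports of
    anything outside that dict (stdlib, third-party) are ignored.
    """
    graph = defaultdict(set)
    for module, code in sources.items():
        graph[module]  # make sure every module shows up as a node
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                targets = [node.module]
            else:
                continue
            for target in targets:
                top_level = target.split(".")[0]
                if top_level in sources and top_level != module:
                    graph[module].add(top_level)
    return dict(graph)


def to_dot(graph):
    """Render the graph as Graphviz DOT text."""
    lines = ["digraph subsystems {"]
    for module in sorted(graph):
        lines.append(f'  "{module}";')
        for dep in sorted(graph[module]):
            lines.append(f'  "{module}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-module project, inlined so the sketch is self-contained.
SOURCES = {
    "registration": "import billing\nfrom crm_export import push_user\n",
    "billing": "import stripe\n",
    "crm_export": "import json\n",
}
GRAPH = import_graph(SOURCES)

if __name__ == "__main__":
    print(to_dot(GRAPH))
```

Import edges are only a proxy for real relationships: they miss connections made through queues, webhooks, or shared database tables, which are exactly the hidden side effects that tend to hurt the most.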

What general questions should you be able to answer easily?

  • I want to change the functionality of this feature for new requirements. How can I know all other subsystems interacting with the feature?
  • I want to quickly get a high-level overview of the major subsystems. What are the dependencies? Are there any hidden connections that people tend to forget? What should I watch out for?
  • How can I easily know where a feature starts and ends? I want to know how card payments work. Where do I start?
  • How much effort will be needed to refactor this feature? Are there any connections or interplay between parts that I should be careful about?
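The first and last questions are, in part, graph questions. Assuming a module dependency graph is already available in some form, a minimal sketch of "what else is affected if I change this?" is a breadth-first walk over the reversed edges. The module names in `DEPENDENCIES` are hypothetical and only serve the example.

```python
from collections import deque


def impacted_by(graph, changed):
    """Return every module that directly or transitively depends on `changed`.

    `graph` maps a module name to the set of modules it depends on.
    """
    # Reverse the edges: dependency -> modules that depend on it.
    reverse = {}
    for module, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(module)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependant in reverse.get(current, ()):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen


# Hypothetical dependency graph for a small shop application.
DEPENDENCIES = {
    "checkout": {"payments", "cart"},
    "payments": {"stripe_gateway"},
    "invoices": {"payments"},
    "cart": set(),
    "stripe_gateway": set(),
}

if __name__ == "__main__":
    print(sorted(impacted_by(DEPENDENCIES, "stripe_gateway")))
```

In this toy graph, changing `stripe_gateway` reaches `payments`, and through it `checkout` and `invoices`, while `cart` is untouched. The size of that set gives a first, crude signal of how much care a refactoring needs.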

Real-world example

What parts of the codebase does the Stripe payment gateway touch, and how is it implemented at a high level? What do I need to watch out for when migrating to Braintree?

Can it be better?

It took me 3–5 months just to grasp the major subsystems and their interactions with each other in my company. There has to be a better way.

I understand that deep and comprehensive docs could be an answer, but really, how many companies document every level of their tech stack? What I usually see is a few people at the company who understand the architecture, and you have to piece the picture together from them.

Why can’t we automate at least part of this process? Something simple to give to the developers, so they can get a quick understanding of the features and their relations in the whole codebase.

Johan Cutych

I found that building a team of people who love to work together is my passion.