“Zoom Out”: The missing feature of IDEs

Keaton Brandt
Source and Buggy
11 min read · Sep 29, 2022

Your IDE is a text editor. Sure, it has a million plugins, fancy macros, your own special keybindings, and a really cool font — but that’s all, ultimately, in service of reading and writing text files.

Unfortunately, our job as Software Engineers is not to publish beautiful leather-bound books full of code — it’s to ship working products. Those products are made from a web of intersecting subsystems, each one attempting to coordinate while also competing for time and resources. Subsystems are added and tweaked on a daily basis, subject to the whims of stakeholders, all while engineers join and leave the team.

In other words: shit’s complicated.

More specifically, architecture is complicated. Almost every major bug I’ve encountered (or caused) in my career has been a result of two subsystems not working together in the way they’re supposed to. In my experience, it’s actually pretty rare to see a bug on the level of an individual algorithm (when was the last time you saw a sorting algorithm misplace something?). Unfortunately, our glorified text editors are really best suited for those low-level kinds of problems. They don’t help much when it comes to analyzing data flow or race conditions, and they rarely help visualize any structures, interactions, or processes. Instead, engineers often have to take things back to the drawing board — literally.

Credit: Richard Masoner (CC BY-SA 2.0)

I personally love the feeling of drawing a beautiful flow chart on a whiteboard. It’s the closest I’ll ever get to drawing happy little trees like Bob Ross. But it’s not a very effective way to solve problems. My half-remembered sketch would probably look a little different from my teammates’, and both would be massively oversimplified. Even UML in practice tends to leave a lot of ambiguity. Ultimately the only reliable way to understand a program is to read and internalize every individual line of its code.

We all just take this fact for granted — as though it’s not totally insane. Imagine if Photoshop made you edit all your images one pixel at a time:

“Zoom Out” is such an obvious necessity for all creative apps that it rarely even appears on a list of features — yet IDEs, which very much count as creative apps, have nothing of the sort. Hell, even novelists have better zoom-out views than programmers: tools like Scrivener let authors maintain outlines, visualize timelines, and cross-reference research, all in addition to normal text editing.

The entire job of Software Engineers is to, y’know, engineer software. How did we end up creating great software for everybody except ourselves? Why are we the only digital creatives whose tools have not fundamentally improved since the days of command-line interfaces (if you don’t believe me, consider how strangely feasible it is to use nothing but 1990s-era vim in this, the year 2022)? How are we supposed to design and develop interconnected systems when we can only guess at how the pieces fit together?

Of course, if there were an easy way to solve the “zoom-out problem” for developers it’s safe to assume that somebody would have done it already. People have certainly tried — NoFlo JS was a personal favorite of mine, but appears to have stagnated. The obvious conclusion is that it’s not such an easy problem to solve. In fact, solving it properly requires rethinking one of the very basic assumptions about programming: that programs are represented as text files.

The problem with text files

Computers really don’t like to read text. It’s an unstructured and relatively low-entropy data format that often lacks important context. For example, a C++ text file may define a variable as auto ptr = &x;, where auto indicates that the compiler should figure out the data type of ptr. This auto represents context that is intentionally left out from the text file in order to make developers’ lives easier. If an IDE wants to provide refactoring on that variable it must simulate the compiler’s logic to determine the data type, which may involve loading and analyzing more and more text files, recursively.

It’s not just automatic types. In TypeScript, for example, code files don’t contain enough context to determine which other files they import (see TypeScript module resolution). Or, even worse, in languages with template metaprogramming, IDEs can’t make even basic inferences about the code without first running some of it. It’s no wonder IDEs have such limited features — they have to use all their time, and all your RAM, just to get the most basic understanding of what’s going on. We’re making our computers perform incredible feats of algorithmic gymnastics just to save a few keystrokes.

To be clear, none of these keystroke-saving features are necessarily bad. Anything that makes a programmer’s life easier has the potential to prevent millions of dollars of lost productivity — oh, and make us happier (my capitalism-addled brain almost forgot). The problem is that these human-friendly features were implemented on the wrong layer of the tech stack.

Party in the front, business in the back

Software Engineering tools are hindered by the fact that the kind of code humans like to read is different from the kind of code computers like to read. We can mitigate this with a simple intuition: The code you see on screen doesn’t have to match the code that is saved to the disk.

If you want to use auto that’s fine, but before your IDE saves that line to disk it could resolve the concrete type and add it as extra context to the saved file. In its most basic form that context could just be a hidden code comment, like // auto_type = int*. If any other tools want to refactor your code, or analyze it, or even compile it, they won’t have to repeat the work of resolving the type because they’ll be able to read it from the generated comment.
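Here is a minimal sketch of that save-time step, written in Python for brevity. Everything in it is hypothetical: RESOLVED_TYPES stands in for whatever type resolver the IDE already runs, and annotate_on_save and read_annotation are illustrative names, not part of any real IDE or compiler.

```python
import re
from typing import Dict, Optional

# Assumption: the IDE's type resolver has already produced this mapping.
RESOLVED_TYPES = {"ptr": "int*"}

# Matches simple declarations of the form `auto name = ...`.
AUTO_DECL = re.compile(r"^\s*auto\s+(\w+)\s*=")
MARKER = "// auto_type = "


def annotate_on_save(line: str, resolved: Dict[str, str]) -> str:
    """Append the resolved type as a comment before the line is written to disk."""
    match = AUTO_DECL.match(line)
    if match is None or MARKER in line:
        return line  # not an `auto` declaration, or already annotated
    concrete = resolved.get(match.group(1))
    if concrete is None:
        return line  # the resolver had no answer, so leave the line alone
    return f"{line}  {MARKER}{concrete}"


def read_annotation(line: str) -> Optional[str]:
    """What a downstream tool would do: read the type instead of re-inferring it."""
    _, found, tail = line.partition(MARKER)
    return tail.strip() if found else None


saved = annotate_on_save("auto ptr = &x;", RESOLVED_TYPES)
print(saved)                   # auto ptr = &x;  // auto_type = int*
print(read_annotation(saved))  # int*
```

The editor would hide the comment again when displaying the file, so the programmer still only sees auto.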

Unfortunately, this adds complexity to the IDE. What if the type of x changes? In the above example, the type of ptr is tied to the type of x, so the IDE would have to know to update both types next time it saves the file. Ultimately this leads to a tree of dependencies: ptr depends on x, x depends on something else, and so on, and so on. IDEs try their best to figure out tree structures like this, but they’re hindered by missing context and limited resources.
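To make the bookkeeping concrete, here is a rough Python sketch of that dependency tree. The depends_on mapping and the function names are made up for illustration; a real IDE would derive the edges from the code itself.

```python
from collections import defaultdict, deque


def build_dependents(depends_on):
    """Invert 'A depends on B' edges into 'B is depended on by A'."""
    dependents = defaultdict(set)
    for symbol, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents[dependency].add(symbol)
    return dependents


def stale_after_change(changed, depends_on):
    """Return every symbol whose saved type must be recomputed when `changed` changes."""
    dependents = build_dependents(depends_on)
    stale = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in stale:
                stale.add(dependent)
                queue.append(dependent)
    return stale


# Hypothetical project: ptr depends on x, and x depends on something else.
depends_on = {
    "ptr": {"x"},
    "x": {"something_else"},
    "unrelated": {"z"},
}

print(sorted(stale_after_change("something_else", depends_on)))  # ['ptr', 'x']
```

Even in this toy form, a single change can ripple through every annotation downstream of it, which is exactly the work the IDE would have to redo on each save.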

There’s got to be a better way!

Your IDE dealing with your code when it’s low on memory

All valid code can be represented by an Abstract Syntax Tree (AST), which is the kind of thing computers like to read. With some tweaks to the AST concept (such as ensuring that ASTs include code comments), it is possible to re-generate normal human-friendly source code from a binary AST file — it’s essentially just running the parser backwards. There is some overhead, but it’s nothing compared to the cost of repeatedly re-indexing codebases.
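Python's standard library happens to ship both halves of this round trip, so the idea is easy to try. The sketch below uses the built-in ast module (ast.unparse needs Python 3.9 or newer); the area function is just a made-up example. It also shows why the comment tweak matters, because the stock parser throws comments away.

```python
import ast

source = '''
def area(w,h):
    # width times height
    return w*h
'''

# Text -> tree: the form computers like to read.
tree = ast.parse(source)

# Tree -> text: "running the parser backwards".
regenerated = ast.unparse(tree)
print(regenerated)

# The regenerated text parses back to an identical tree...
assert ast.dump(ast.parse(regenerated)) == ast.dump(tree)

# ...but the comment is gone, because Python's AST does not store comments.
assert "width times height" not in regenerated
```

Note that the regenerated code comes back with normalized spacing: the original formatting was never part of the tree in the first place.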

We can even go one step further and save all of the files for a project in a single interconnected database, where each tree can directly reference other trees. The low-level representation of MyModule.foo() could just be a “call” operation with a pointer to MyModule’s implementation of foo. Then, to rename the foo function to something more descriptive, the IDE could just update the name in the implementation and all of the references would automatically get the new value. Running code analyses to detect things like unused imports or mismatched arguments would also become trivial, as would code navigation commands like “Jump to Definition”.
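A toy version of that database might look like the following Python sketch. The CodeDatabase class and its methods are invented for illustration and say nothing about how a real system such as Kythe stores its data. The key detail is that a call stores the ID of its target rather than the target's name.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    name: str


@dataclass
class Call:
    target_id: int  # a reference to the definition, not a copy of its name


@dataclass
class CodeDatabase:
    definitions: dict = field(default_factory=dict)
    calls: list = field(default_factory=list)

    def define(self, name):
        def_id = len(self.definitions) + 1
        self.definitions[def_id] = Definition(name)
        return def_id

    def add_call(self, target_id):
        self.calls.append(Call(target_id))

    def rename(self, def_id, new_name):
        # One write; every call site picks up the new name when rendered.
        self.definitions[def_id].name = new_name

    def render_calls(self):
        return [f"{self.definitions[call.target_id].name}()" for call in self.calls]

    def unused_definitions(self):
        used = {call.target_id for call in self.calls}
        return [d.name for def_id, d in self.definitions.items() if def_id not in used]


db = CodeDatabase()
foo = db.define("MyModule.foo")
bar = db.define("MyModule.bar")
db.add_call(foo)
db.add_call(foo)

db.rename(foo, "MyModule.fetch_user")
print(db.render_calls())        # ['MyModule.fetch_user()', 'MyModule.fetch_user()']
print(db.unused_definitions())  # ['MyModule.bar']
```

Renaming touches one record, and finding unused code is a set difference rather than a project-wide text search.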

Your computer would get to read pre-parsed and pre-indexed trees, you’d get to read your favorite programming language, everybody would win! That’s why I’m #teamtrees.

Programming Languages as UI

In this architecture, programming languages are essentially just ‘renderings’ of lower-level state (ASTs) — just like any other UI. Those renderings could be personalized well beyond color themes and font ligatures. The source code your computer displays when it loads an AST file could be formatted any way you want! Do you like K&R Style braces or Allman Style braces? It doesn’t matter, just pick a setting in your IDE and it’d display them that way. Tabs or spaces? 1 newline after a function or 2? 80 columns or 100? None of this matters at all, so take your pick. It could even go beyond formatting — I don’t like the __init__ method name in Python, so maybe my IDE would render it as the word constructor instead.

When my IDE saves my code back to disk, the AST format would abstract all these choices away. When you load the code you’d see it formatted in exactly the way you like. The amount of office drama this would eliminate is hard to overstate. Frankly, among engineers, disagreements about indentation style are probably a leading cause of violent crime.
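As a rough illustration, the constructor idea can be prototyped with Python's own ast module (ast.unparse needs Python 3.9 or newer). The names RenameMethods, DISPLAY_ALIASES, render_for_display, and normalize_for_save are all hypothetical, and the sketch cuts corners: it only renames method definitions, not call sites like super().__init__(), and it would misbehave if somebody genuinely named a method constructor.

```python
import ast


class RenameMethods(ast.NodeTransformer):
    """Rename function definitions according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = self.mapping.get(node.name, node.name)
        return node


# Assumption: a per-user display preference, stored in IDE settings.
DISPLAY_ALIASES = {"__init__": "constructor"}


def render_for_display(stored_source):
    """Stored form -> what this particular user sees on screen."""
    tree = RenameMethods(DISPLAY_ALIASES).visit(ast.parse(stored_source))
    return ast.unparse(tree)


def normalize_for_save(displayed_source):
    """What the user typed -> the canonical stored form."""
    reverse = {alias: real for real, alias in DISPLAY_ALIASES.items()}
    tree = RenameMethods(reverse).visit(ast.parse(displayed_source))
    return ast.unparse(tree)


stored = "class Point:\n    def __init__(self, x):\n        self.x = x"

shown = render_for_display(stored)
print(shown)  # the method now reads `def constructor(self, x):`

# Saving converts back, so teammates still get plain __init__.
saved = normalize_for_save(shown)
assert ast.dump(ast.parse(saved)) == ast.dump(ast.parse(stored))
```

Brace style, indentation, and line width would work the same way: they become arguments to the renderer instead of properties of the stored file.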

Even more importantly, representing software projects as collections of ASTs that are interlinked and stored in a single database would allow the ecosystem of software engineering tools to work much more effectively. IDEs could open faster because they would never have to re-index the project. Compilers could skip the lexical analysis, parsing, and internal linking steps, which would substantially speed up builds. Static Analysis tools could detect more types of problems by understanding the full context of each line of code. Code Review tools could easily display additional information to reviewers, such as a list of all the places where a modified function is called. When your computer’s life gets easier, your life gets easier.

Aside: Keep It Simple, Stupid?

You’ve probably heard the adage “Keep It Simple, Stupid” — “KISS” for short. Some of you are wondering if I’m the titular “Stupid”, which is fair. After all, what could be simpler than writing text files and saving them as text files? What could be more complicated than trying to save them as databases full of interconnected syntax trees? What if something gets corrupted? What if there’s a bug in the AST that doesn’t show up in the text view, how would a developer ever find it? And, won’t this break my workflow where I pipe every code file through a bash script for some archaic reason I can’t remember?

These concerns are all valid (to varying degrees), but they also represent a flavor of Luddism that is unique to tech workers.

But my proposal isn’t so radical! A file system is already a database, as is Git. Projects like Kythe are already able to generate and utilize data structures similar to the one I’m proposing, and Kythe in particular has scaled all the way up to Google’s gigantic monorepo. Other creative industries like 3D movie production already rely on giant interconnected databases rather than text files to support their projects. Yes, rethinking the way we store codebases is a risk — but not an unreasonable one.

Beyond (just) Programming Languages

There’s a well-known flowchart from Slack showing the logic that goes into deciding whether to show a notification on each of a user’s devices:

From “Reducing Slack’s memory footprint”

This diagram is an example of a good “Zoom Out” view, but it’s cumbersome. Somebody had to draw it by hand. Somebody has to remember to update it every time the code changes, and somebody has to update the code whenever the diagram changes. The thing is, the diagram is code. “Code” is just a word for a representation of a process — be it text or diagram or interpretive dance. Slack’s flowchart isn’t machine code, to be sure, but neither is C++. Both are different human-friendly ways of representing logical processes.

There’s a mentality among engineers that we don’t need any UI beyond the pure code. We take pride in our simple command-line interfaces because we see them as symbols of our galaxy-brained computer prowess. Opening an app full of buttons and shiny graphics is demeaning to our egos — like asking for a pen and being handed a box of crayons. Or at least, that’s the vibe I often get from reading programming forums.

Here’s Matty Stratton dropping some truth:

Still, engineers often have a lot of ‘inertia’ around the tools they’re used to. Changing over to entirely new types of tools might be almost as difficult as switching careers entirely. Plus, it violates the first law of Engineering: if it ain’t broke, don’t fix it!

But, it is broken! Our code is a mess of tech debt, our hand-drawn flowcharts are out of date, our products are full of bugs, and the vast minefield of hidden complexities makes it nearly impossible to even estimate how long fixes will take. Frankly, I’m amazed product managers put up with us. This is clearly an industry in need of new ideas.

I’ve talked about how we can think of programming languages as UIs on top of a more abstract representation of a computer program. Doing this unlocks the potential to have a lot more UIs showing different views into the same program. Text files could be tied together into flowcharts, which could be collected into larger flowcharts, or into hierarchies or sequences. IDEs could display a visual graph showing which functions are called by which other functions, and provide search functionality to easily reveal, for example, all the places your UI code makes blocking database calls. New visualizers could help identify potential race conditions by highlighting every expression that reads or writes a certain value. I don’t know how, but I’m certain somebody would find a way to use VR to fly through your package structure or something.
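The blocking-database-call search is simple enough to sketch once the code is available as a tree. The Python example below builds a call graph with the standard ast module and asks which UI functions can reach a database call. The sample functions (query_db, render_profile_page, and so on) are invented, and the graph only follows direct calls by plain name, which is far cruder than what a real tool would need.

```python
import ast

# A made-up module: one render function hits the database indirectly.
SOURCE = '''
def query_db(sql):
    ...

def load_profile(user_id):
    return query_db("SELECT ...")

def render_profile_page(user_id):
    return load_profile(user_id)

def render_about_page():
    return "static"
'''


def build_call_graph(source):
    """Map each top-level function to the names it calls directly."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name] = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return graph


def reaches(graph, start, target, seen=None):
    """True if `start` calls `target`, directly or through other functions."""
    seen = set() if seen is None else seen
    if start in seen:
        return False  # already explored; also guards against recursion cycles
    seen.add(start)
    callees = graph.get(start, set())
    return target in callees or any(reaches(graph, c, target, seen) for c in callees)


graph = build_call_graph(SOURCE)
offenders = sorted(
    name for name in graph
    if name.startswith("render_") and reaches(graph, name, "query_db")
)
print(offenders)  # ['render_profile_page']
```

With text files, this graph has to be rebuilt by re-parsing everything. With an interlinked AST database, it is already sitting on disk, waiting to be drawn.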

These are not new ideas. Computer-Aided Software Engineering (CASE) has been a buzzword since the 1980s. This largely revolves around visual programming languages, which are not exactly what I’m discussing here but are solidly in the same vein. Visual programming languages are widely used in game development, 3D modeling, sound design, science, home automation, and education. It’s starting to seem like everybody has great software engineering tools except software engineers.

Looks kinda like the Slack flowchart, but it’s executable code

Modern software products like apps, backend services, and embedded controllers are orders of magnitude more complicated than the sorts of problems most Visual Programming Languages are designed to solve — but that doesn’t mean we should give up. Indeed, as I hope I’ve demonstrated, these large and complex codebases are the ones that would benefit the most from visual tools.

The hurdles come from the problems I discussed earlier: the slow and inexact indexing of text files limiting how well our tools can work, the inertia of career engineers and their processes, and a general belief that things are just fine how they are. These are fixable problems. I don’t see any technical reason why visual programming languages and other CASE tools couldn’t scale to large teams and complex projects, especially if backed by an AST database structure.

There will be compromises, especially at first, but it’s time to take one step back and two steps forward. It’s time our IDEs let us zoom out, and see the forest for the syntax trees.

Keaton Brandt

Senior Software Engineer at Google (but views are my own). Seattleite. Chihuahua chauffeur. Doomscrolls on Wikipedia.