The NoSQL Database Every Software Developer Uses

Ustun Ozgur
Jan 11, 2019

Photo by Dương Trần Quốc on Unsplash

One of the greatest ironies of software development is that programmers use plain text files for their most valuable asset, the code, whereas they would rarely use that “technology” for storing the data of their products.

Almost every programmer, even the most adamant one who would never think of using anything other than a relational database for storing data, uses a NoSQL database when developing software: the file system. And the file system is messy and unstructured.

Almost every programmer, even the most adamant one who would never think of using an untyped or dynamically typed language, uses untyped things called “files” for storing code. What is the type of your files? All your files have the following type: String!

Photo by Steve Johnson on Unsplash

All files have some metadata as well. The file name is the most common metadata. A file’s name is not inside the file. Neither are its creation or access dates. And that’s more or less all that the file system, this database we store our code on, provides. Could we do better? What if every symbol in our code had some sort of metadata? Very few languages have this, and even in the ones that have it built in, like Clojure, it is rarely used.
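To make the point about metadata concrete, here is a minimal Python sketch that reads what the file system keeps about a file separately from its contents. The `describe` helper and the `hello.py` demo file are hypothetical names made up for illustration. Creation time is not exposed portably across operating systems, so the sketch sticks to modification and access times.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe(path):
    """Return metadata the file system stores next to, not inside, the file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Hypothetical demo file, created in a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hello.py")
    with open(path, "w") as f:
        f.write("print('hi')\n")

    meta = describe(path)
    with open(path) as f:
        contents = f.read()

print(type(contents).__name__)  # the "type" of the file: a plain string
print(meta["name"], meta["size_bytes"])
```

Nothing in `contents` mentions the file's name or dates, and nothing in `meta` says anything about the symbols defined inside the file.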

You might argue that Git is a sort of database for storing code, and that is true to a certain extent. It stores the data after some transformation and optimization, and it stores older versions of the data, but that’s pretty much it. As metadata, it has “tags”, but tags are metadata for commits, not for specific files. As a database, Git lacks a lot of the functionality that ordinary relational databases provide.

Think about all the code organization a software developer does: moving files around, requiring one file from another in the form of modules or types. Think about the querying we do on our codebase: searching for a piece of text, getting the list of functions in the codebase, or getting the symbols that the current class or function depends on. These are mostly done in a half-baked, on-demand manner. Want a list of functions? Let your IDE or language server parse your code and present you the options, only for it to forget them afterwards. It is almost as if the metadata is being regenerated all the time by the compiler.

Think about all the discussions about code organization: Should this file go into this folder or that folder? You have one choice! You are making a “write” operation that strictly limits your “read” operations.
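As an illustration of this on-demand querying, the sketch below uses Python's standard `ast` module to answer "which functions are in this code?" by parsing the text from scratch. The `SOURCE` snippet and the `list_functions` helper are made up for the example. The result exists only in memory and is thrown away when the program exits.

```python
import ast

# Hypothetical source text standing in for a file in the codebase.
SOURCE = '''
def load(path):
    return open(path).read()

class Parser:
    def parse(self, text):
        return text.split()
'''


def list_functions(source):
    """Parse the source and return (name, line number) for each function."""
    tree = ast.parse(source)
    found = [
        (node.name, node.lineno)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return sorted(found, key=lambda item: item[1])


print(list_functions(SOURCE))  # [('load', 2), ('parse', 6)]
```

Every call repeats the full parse. Nothing is stored, so the next tool that asks the same question has to start from the string again.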

Whenever you double click a folder in your system, you are effectively doing a query for that table. Whenever you enter another folder in that folder, you are doing a subselect.
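The analogy can be spelled out with an in-memory SQLite database. This is only a sketch: the `entries` table, its columns, and the sample rows are assumptions made up for the example, not how any real file system stores its directory tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: every file or folder is a row pointing at its parent.
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        (1, None, "project"),
        (2, 1, "src"),
        (3, 1, "README.md"),
        (4, 2, "main.py"),
        (5, 2, "utils.py"),
    ],
)


def open_folder(folder_id):
    """Double-clicking a folder: select the rows whose parent is that folder."""
    rows = conn.execute(
        "SELECT name FROM entries WHERE parent = ? ORDER BY name", (folder_id,)
    )
    return [name for (name,) in rows]


def open_subfolder(folder_id, child_name):
    """Entering a folder inside a folder: the inner lookup is a subselect."""
    rows = conn.execute(
        "SELECT name FROM entries WHERE parent = ("
        "  SELECT id FROM entries WHERE parent = ? AND name = ?"
        ") ORDER BY name",
        (folder_id, child_name),
    )
    return [name for (name,) in rows]


print(open_folder(1))            # ['README.md', 'src']
print(open_subfolder(1, "src"))  # ['main.py', 'utils.py']
```

The only question this "table" can answer is "what is inside this parent?", which is the one query the folder hierarchy was designed for.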

Do you want to store additional data about a specific file? Where do you put it? Inside comments within that file? Wouldn’t it be better if you could store that data as real metadata so that you could search for it?

Wouldn’t we be better off if our code was stored in databases? I am not advocating only relational databases; even a proper JSON-based NoSQL database might be better for some query operations.

The downside of this suggested approach: you lose most of the functionality of Unix tools. In Unix, everything is a file. It is a good idea that has served its purpose, but it is holding us back at this point.

Of course, none of this is new, and the reason I’m writing this article is nothing other than to spark some discussion around the subject. In further articles, I would like to expand on this topic and see what we can come up with. It seems like we have reached a local maximum that prevents us from reaching greater heights.

Just imagine how much better it would be if we treated our code as we treat our data!
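As a rough sketch of what that could look like, the example below keeps each symbol as a JSON-style document with its own searchable metadata. Everything here is hypothetical: the document shape, the field names (`symbol`, `meta`, `tags`, `reviewed`), and the helper functions are invented for illustration, and a plain Python list stands in for a real document database.

```python
import json

# Hypothetical documents: one per function, each carrying its own metadata.
documents = [
    {
        "symbol": "parse_invoice",
        "source": "def parse_invoice(text): ...",
        "meta": {"tags": ["billing", "parsing"], "reviewed": True},
    },
    {
        "symbol": "send_invoice",
        "source": "def send_invoice(invoice): ...",
        "meta": {"tags": ["billing", "email"], "reviewed": False},
    },
    {
        "symbol": "parse_date",
        "source": "def parse_date(text): ...",
        "meta": {"tags": ["parsing"], "reviewed": True},
    },
]


def with_tag(docs, tag):
    """A symbol can show up under several tags, unlike a file in one folder."""
    return [doc["symbol"] for doc in docs if tag in doc["meta"]["tags"]]


def unreviewed(docs):
    """Query on metadata that would otherwise be buried in a comment."""
    return [doc["symbol"] for doc in docs if not doc["meta"]["reviewed"]]


# The documents survive a round trip through JSON, so they could be stored as-is.
restored = json.loads(json.dumps(documents))

print(with_tag(restored, "billing"))  # ['parse_invoice', 'send_invoice']
print(with_tag(restored, "parsing"))  # ['parse_invoice', 'parse_date']
print(unreviewed(restored))           # ['send_invoice']
```

Here `parse_invoice` appears in both the billing view and the parsing view, so the "write" no longer dictates a single "read".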

Photo by Christopher Burns on Unsplash


About me: I am a software consultant and engineer, specializing in modern Web applications, with extensive experience with software architecture, backend and frontend technologies. Follow me on Twitter at https://twitter.com/ustunozgur
