CRUD Is Dead, Long Live CRUD!
Data Models and API Routes Are Not The Same
The history of IT, though relatively short, is made of cycles, loops or waves. Centralized mainframes that we connected to through passive terminals, then mini, then micro systems, PCs, now every machine is virtualized and hosted we don’t even know where. Languages also evolved. From COBOL, to C, then VB, then Java and C#, now python or TypeScript. But one thing remained pretty much unchanged since the 70s, relational databases and their query language, SQL. Yes we now have NoSQL DBMS, but still most of the designers or developers still think in terms of tables, relations, keys to bind records, columns of similar properties. And the four operations relating to data tables: Create, Read, Update and Delete, the famous CRUD operations.
APIs Are CRUD Extensions
Nowadays, we are in the API era. Every application is designed with a backend and one or several frontends that communicate through an API. The backend wraps up a the database, embeds the business rules and exposes endpoints to outside usages, be they web app, mobile app or external usage by other systems. That is the way we design systems now. But still, we think in terms of entities we call resources and for each of these resources and we expose 5 routes based on HTTP verbs: POST /resources that is the same as Create, GET /resources to list them all or GET /resource/id to get details of one specific resource. But that is the same as Read. PUT or PATCH /resources/id to Update a resource and DELETE /resources/id to Delete one resource. The pattern is always the same. The recipe is eternal and many developers think there is no other way to do it. It’s a permanent basis. All systems are done this way.
CRUD Is Not Always The Most Appropriate Design
Recently I have worked on an Infra As Code API to help data engineers to provision Hadoop big data resources on existing clusters. When designing models we came to having a table for applications, one for clusters, one for instances (an application on a cluster), one for resources (of type hdfs path, hive database, yarn queue, and so on) and one for bundles because several Hadoop resources could be deployed in one go thanks to bundles or sets of resources. I thought it was pretty simple and straightforward. The client key users validated the idea and we started developing this. But when the first true users showed up, they said it was too complicated, they were confused by the applications and application instances (though they were data engineers a many-to-many relationship … what a tricky thing!) and payloads were different depending on the type of resources they wanted to deploy. Drama, crisis meetings, … they eventually decided to create their own API that wrapped up ours. Another example I love is a simple flash card game I am working on to learn foreign languages vocabulary. The requirements are simple. We need a system that enable to enter words and their translations in any languages. Think of it a minute, how would you design tables? A simple option could be 2 tables: One that I call « Concept » that acts as a pivot. This is where we store the nature of the word (verb, noun, adjective, and so on), its definition, one or more images. And the other table, I call it « expression » is where we store how each language expresses the related concept. Doing this way, I can have a pivotal record of a verb that describes the fact of moving in the air and 2 related expressions of « to fly » in the English language and « volar » in Spanish. But if I am French and I want to learn Russian, then I just need to bind « voler » to the pivot and « летать » or « лететь » and the model still works. If I map the two tables and the 4 CRUD operations then I will end up with unsatisfying routes like GET or POST /concept. Worse when calling GET /expression/<id> or GET /expression/<word> , I will never get one pair of translated words in one single call.
User Need First
We have to rethink the pattern that is inappropriate in this case. Start from the usage, what are our users going to need, write down the most common requests they will send to the system and then design routes like GET /translation/<from language>/<word>/<to language> or using the above example GET /translation/en/to-fly/es and this will return a json object looking like this {« translations »: [« volar »]}. The controller function will have to query the database using a join query, that is the job! But it sounds way more easy for a user to figure out what this does at first sight, no? There are thousands of other examples for sure where CRUD is not the solution. If you are designing an Infra as Code system or a management application (ERP, CRM, … and so on) then you will want to create, read, update or delete resources whether they are contacts, bills or products. Client need matches the data model, so in this case CRUDing raw models is the solution that fulfills user need. But do not forget to first write down what the most common queries will be, usage first, and stop always thinking CRUD on the database resources is the only way to do and let front-end developers reassemble the parts so that they can display what the users actually want. Designing your routes to match the front-end usage will lead to more simple code on the front end and faster applications as the complexity is moved to the backend server side where the possibilities of caching or views are easy to implement.