Complement your Table with Soft-delete

Novandi Banitama
Analytics Vidhya
Published in
3 min readNov 17, 2019

Many styles come and go for every developer to get their hands dirty when handling data in database. Like me when it comes to store data into database like postgresql or mysql, I usually add some columns like `created_at`, `created_by`, `updated_at`, `updated_by`, `archived_at` and `archived_by` to complement the table I want to make. Its name already describe its function like when and who creates this field, when and who updates, and when and who deletes this row.

Maybe some of you may ask,

why would we do this? It’s too much to store!! Do we really need the archived column????

sample script to create table books

Well, presumably there is no right or wrong on that. I agree because too much data stored can also slow down your query and will impact on your application’s performance. I believe that your shareholder doesn’t want it happen and when it happens your life will ruin to nothing.

But it was, now with the current version, the database can address that issue by using index or try to reevaluate the way you select it. Some shits may happen in the production because of misuse of index or query.

index on column name and column description

Back to why I use that column `archived_at` and etc, the reason is simple. Data is important and so does its history. We need to treat it as important as like we have it before we delete it. Maybe in the future, that data will be needed or required so we don’t have to sweat ourselves to look for that.

Data is important and so does its history.

When you want to delete that row, you just need to update its timestamp to its column `archived_at` so we know when it’s archived. People would call it as soft-delete case means we don’t actually delete that data physically, we can still see it clearly on the table.

consists of stale data

For the query itself, just add where condition `archived_at is null` to look for the active data. From here, we can differentiate which data is still active and stale.

query to find the active data

Well, that’s it. It’s up to you if you want to use or not. It also depends on your needs as well. But, personally, I would recommend to use this. Lastly, the script I used previously is based on postgresql. :)

--

--