Design the Optimal Schema for Your SQL Database

5 min readSep 3, 2020

Big data is the Gold Rush of the 21st century. Enormous quantities of data are generated every single day. According to Forbes*, “2.5 quintillion bytes of data are created each day.” By 2025, the World Economic Forum** estimates that 463 exabytes of data will be created each day globally. For those of you who remember the pre-streaming days, this is equivalent to 212,765,957 DVDs per day.

As developers, we are tasked with the responsibility of keeping our applications performant as the size of our databases increase. Databases are essential to keeping applications functional. One of the most common culprits of a slow application is an inefficient database design.

Investing time and effort towards database modeling can pay dividends in time saved.

So… what goes into designing an optimized database?

Define a clear purpose for your database.
Understand your users’ access patterns. Prioritize your most frequently run queries and data processing needs.
Determine the relationships amongst your tables.
Normalization vs Denormalization: “Normalize till it hurts, denormalized till it works.” Decide between reducing redundancies versus increasing performance? Oftentimes, the answer is somewhere in between.
Iterate and refine your design as your database grows larger, your app matures, and your use cases change.

Database design is an extremely complex subject with several competing variables. The bad news is that all of the variables mentioned above are subject to change, as databases are a living, shifting organism.

The good news is you can affect all of the above variables by optimizing your schema.

The best way to ensure that your design is optimal is to track, compare, tweak, test, and redesign your schema and queries on a regular basis to keep up with the needs of your evolving application and user base.

The problem is that redesigning your schema is a daunting task, especially when your database grows large and unwieldy. How can we alter our schema, test it, and run our queries against new schema designs? Most developers and DBAs are too busy to deal with this process.

Luckily, SeeQR is a fantastic, open source solution that provides a stable sandbox environment within an offline and secure desktop application.

SeeQR lowers the barriers to entry and shortens the feedback loop. Developers and DBAs can easily spin up new schemas, populate their tables with dummy data, test queries against such schemas, run granular analytics, and compare their test results across different schema designs.

Create New Schema

Simply create a new tab to generate a new schema. Once you hit the + button, the app prompts you to create a new name for your new schema. It will give you the option to either upload an existing schema or input a custom schema from scratch. Loading a schema is as easy as uploading a .tar or .sql file type.

If you choose to input a schema from scratch, you can write your code for the schema directly in the code editor that appears in the modal. Hit ‘submit,’ and a new database will be created for you in the sandbox environment, and you can start testing your queries.

Return Query Data

Much like ElephantSQL, the results of your query are clearly shown in the Data Table so you can ensure you are getting your expected results.

View Query Analytics

Additionally, granular analytics are provided to the right of the query editor, along with a line graph of your queries run within the current schema design.

Compare Analytics Across Schemas

We can also tweak our schemas in a new tab, run our queries, and compare one query from the original schema to the same query from the new schema design. As such, we can easily tweak our schemas and iterate on the design, and watch as we shave milliseconds off each query. This is a powerful feature that allows us to shorten the feedback loop and easily iterate on our schema designs as our application grows.

You can install SeeQR here. This is an open source tool developed under tech accelerator OS Labs. The developers created it as a passion project and are always looking for contributors. If you want to find out more, here is their GitHub repository.

Takeaway

SQL has a thriving ecosystem of tools. Whether or not you choose to use SeeQR or another tool, I urge you to explore different ways to optimize your schema design. Refer back to the variables of database design optimization mentioned above, and ensure you are meeting each bullet point — especially as your application evolves.

With time and effort, you will be able to continuously design a reliable database that suits your purposes. Taking such proactive measures will spare you from the much, more arduous process of redesigning only after performance or data reliability issues impact your application.

Sources:

*https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#:~:text=There%20are%202.5%20quintillion%20bytes,in%20the%20world%20was%20generated

**https://www.weforum.org/agenda/2019/04/how-much-data-is-generated-each-day-cf4bddf29f/

Design the Optimal Schema for Your SQL Database

Written by SeeQR