What makes your web app faster? (part 1, databases)
There is no sure shot way to make your app faster, it is a performance boost at each level of your app. Your database, server, caching, client app, user’s browser they all plays a major role in providing that smooth user experience.
So, let’s begin.
I will proceed like, databases → caching → client app → browser.
Today’s databases are fast, secure and has a lot of perks we do not even use.
Databases can be divided into 3 categories (there are more):
- Relational Databases
- noSQL Databases
- Graph databases
And they all have there their purpose in software world and you cannot pick a noSQL database when your application need transactions or a relational database if you need to compute a deeper relation between two separate object.
1. Relational Databases
If you want a great relational database you cannot go wrong with amazon’s Redshift or if you want something at your own system you can use postgreSQL which has been industry standard for a long time. (I use neither, because i don’t need one).
Taking a step ahead, there are some new databases making buzz with their capability to use GPU (Graphics processing unit) for operations, performance results are killing everything else. BlazingDB was featured on TechCrunch Disrupt 2016 and it is the greatest i have seen so far.
2. noSQL Databases
They are hell lot faster than Relational database because they do not have any constraint system (unique, joins or something like autoincrement) Or any kind of transaction (commits or rollbacks).
3. Graph Databases
They are relatively new in the game and i do not know much about them, Only that they are best if you want some deep analytics/relations between objects. They can filter results based on some crazy combination of links. I personally did not get a chance to use any of these databases. If you simply google “Graph Database” you get first result for neo4j. So, i assume this is the best option available. But still, i do not have any evidence for this being the best.
So, now you know which one is best in what category. Choosing correct database is a crucial part, once you have set your heart on a one it will be very hard to migrate to other one (there are plenty examples set by big companies for this).
Now, all the databases have more or less same performance benchmarks and choosing one will not give you a solid upper hand in performance. So, to get the fastest performance, you have to follow these things:
- Optimize your query: the way you structure your query will play a major role in your database performance. Fine tune your queries.
- Minimal select: Do not select all properties (columns) as a query result. select only those you need at that point of time.
- Distributed database: Distributed database cluster can save your ass by a big margin but they are a little hard to setup/manage. In a simple setup, you can test a master-slave system that can increase your fault tolerance, can act as load balancer on a peak time.
- Computed data: If you can get some property by some simple mathematical computation then do that instead of selecting it from database, like percentage, profit, loss etc. Do not do it for complex operation like generating hash or finding mean of a large dataset.
- Opposite of point 3: Although your machine will be a lot faster than your database but in some cases like fuzzy search (soundex source), pattern recognition etc. Store computed result in the database and query with newly computed property.
Remember the next few golden words for database optimizations.
“Best way to use database is, not using it at all.”
Meaning of above quote will be explained in next part → Caching.