Data Structure 101: Part 2
This is the 2nd post in a series about Data Structures.
In Part 1 we learned the importance of studying Data Structure. In this post we will learn how to evaluate Data Structures and algorithms.
What is Cyclomatic Complexity?
Cyclomatic Complexity (CC) is a metric used to calculate the "time" an algorithm takes to complete its interactions. The word time is in quotes because it isn't exactly the time in seconds, but the number of commands, function, modules, and conditions an algorithm will take to complete a linear path.
Try to imagine you want your code to get all links from a page, and then open a new tab with all the links you got. To simplify, here is the code:
A control flow chart for this example is:
With this in mind, you can calculate the CC byfollowing this formula:
M = E — N + 2P
- E: Number of the edges in the chart
- N: Number of the nodes in the chart
- P: Number of connected components in the chart
In the flow chart above we have:
E = 5
N = 5
P = 1
So, M = 5 — 5 + 2 = 2;
You might be thinking, "Okay, but what do I do with that number?"
The result for M shows if your code needs to be prioritized for refactoring.
≤4: is considered good
≥5 & ≤7: is considered medium
≥8 & ≤10: is considered high
≥11: something is really wrong here
If you need to refactor, try creating the CC score for its routines first. It's helpful to know where to start, and know if you made progress. You can compare the before and after, and see the improvements made.
There are other metrics to measure complexity.
One of my favorites is Time complexity.
What is Time Complexity?
Time Complexity (TC) is somewhat more straightforward. As the name suggests, it uses time as the main metric, but it’s computer time, calculated by the number of operations performed by an individual algorithm. Since we have a lot of variability involved in an algorithm, going all the way from the input we receive to the conditions we have, we use three cases, the worst case, the best case, and the medium case.
Using the code shown before, let's try to figure out what those cases are by example. The concept for the best case is what input must be given to the algorithm to finish running and consuming the fewest amount of computational time. In the algorithm given above, the best case is to have no link because you will look for links in the DOM, and skip the for loop.
And the worst case? This would be a huge number of links so the code will take forever to complete (and probably crash the browser). And the medium case is somewhere between these extremes.
Time Complexity usually uses Big O notation. The Big O notation uses N as the input, and for the Time Complexity, the worst-case is used to summarize an algorithm. For the example shown above, the category of the algorithm is O(n) because it will run for each element in the input. If you have any questions about Big O notation, leave a comment and I’ll respond.
In a future post I intend to give some more examples.
In this post we learned two essential concepts about Data Structures. If you have anything you want to see in future posts, please leave a comment.