Seeing the Forest through the Trees…and Other Common Data Structures
Thanks to the popularity of family tree websites, you likely have a general understanding of the structure of a tree. At each point on your family tree, there is a single person. And each person is composed of a set of data points for name, birth date, eye color, height, etc. But while the attributes maintained are the same for everyone on the tree, the attribute values will differ (many being inherited from the parents). Trees are a perfect data structure for family hierarchy management but they can be used for so much more. In fact, they are a core concept of web development and used to structure every website you visit.
But trees are just the tip of the iceberg. The world is full of LOTS of data. Data structures offer a way to organize all of it so that we as humans, and more importantly, our machines, can understand it. Learning about all the available data structures will give you insight into how best to process your data. I’ll review a few of the important ones here to help you get started. I explain what each data structure is and why you might need it. I’ve also shared some great resources that I’ve found helpful if you want to dig into learning actual implementation. Have fun and get that data organized!
A queue manages a basic ordered list and follows the rules of first in, first out (FIFO) meaning that the first item added to the queue will be the first item to be processed out of the queue.
- Real-world example: You see this every time you go to a grocery store, order food at a restaurant, or wait on hold for your tech support to answer the phone. You are always in queues!
- Benefits: Operates like a basic array but performance is significantly faster as size increases.
- Drawbacks: Due to the core purpose of a FIFO model, you can only add elements to the end of the list and access the first element in the queue. You can never add to the beginning or the middle based to item priority.
A stack also manages basic list ordering but the rules for a stack are last in, first out (LIFO) meaning that the last item added to the stack will be the first item processed.
- Real-world example: True to its name, this data structure resembles stacks or piles you likely have all over your house. Bills stacked up on your desk waiting to be paid, or plates in the sink waiting to be washed are both similar to stack data structures.
- Benefits: Stacks are great for keeping track of what came right before a certain element and controlling strict order of items.
- Drawbacks: Its methods are specific to the LIFO model so if you want to support random access to data, you might want to consider another data structure.
Arrays are list-like objects used to store multiple values. They are numerically indexed and maintain their order.
- Real-world example: Books are arrays. Each page is numerically ordered and you can quickly go to whatever page you want to find the text associated with that page.
- Benefits: The ability to store multiple items in a single variable and access those items with a numerical index.
- Drawbacks: Arrays are a fixed size and any changes to that size, especially with element inserts to the beginning or middle of the Array, will create additional time complexity for reallocation of the sequential elements.
A linked list is a linear data structure that only maintains a head and a tail pointer. It relies on each node within the structure to maintain its relative position between the head and tail by holding a reference to the item that follows it.
- Real-world example: A congo line works like a linked list. Imagine you are holding the hips of the person in front of you. You can’t be expected to keep track of who is behind you or where anyone else is in the line. New people can insert themselves into the line in front of you and then you lose track of anyone that came before them.
- Benefits: Linked lists allocate and release memory as needed by inserting and removing pointers. Unlike arrays, the size of the list is not fixed and its dynamic structure is what gives it performance advantages.
- Drawbacks: Since elements have pointers to their next node, moving backwards in the list is difficult. If this is a perceived issue for your implementation, the doubly-linked list can be implemented which will include a pointer to both its previous and next node, but the drawback is additional memory requirements.
A tree is a hierarchical data structure that has a root node, which maintains a value and a pointer to any number of child nodes. Those children are also trees, which each may have associated children. This repetition makes trees a recursive data structure.
- Real-world example: Other than the family tree example already discussed, a file system on your computer is also a tree with your root directory at the top and various directories below it.
- Benefits: Trees work great for hierarchical data.
- Drawbacks: Traversing the tree requires recursion which can create a lot of processing overhead as the tree grows in size.
Binary Search Tree
A binary search tree is a tree that can only have 0, 1, or 2 children. The value of the node is greater than one of its child nodes and less than the other child node. The left/right pattern of greater than/less than will be consistent throughout the tree.
- Real-world example: Decision-making processes where you follow a yes / no answer to the next set of questions.
- Benefits: Due to its ability to know which direction to follow based on the node’s value, search / insert / delete operations are much faster than on other data structures (in logarithmic time).
- Drawbacks: If the tree is not balanced when it’s built, you will not see the full benefits of processing power.
A graph is a tree structure with less restrictions on the parent / child relationship. Nodes can be connected to many other nodes by edges, which can establish one- or two-way connections.
- Real-world example: Social media networks like Facebook utilize a graph model to maintain who you are connected to in your circle of friends.
- Benefits: It is easy to store information about relationships between different objects.
- Drawbacks: There is a high overhead in processing power to maintain the list of edges.