Data Structure 101: Part 3
This is the 3rd post in a series about Data Structures.
In Part 2 we learned how to evaluate Data Structures and algorithms.
In this post we will talk a bit about lists, which are one of the most used data structures. Then we will explore some algorithms to deal with them.
Lists (also called arrays, or slices) are a linear structure, an arrangement of data that can be accessed by a numerical index. Let’s see an example for that.
This is our list, let’s call it N, and N, is a list of strings.
If we want to access the ‘E’ value, we should access it by its index, which should look something like N.
“Wait, why 4, and not 5 which is the position in the array?” you may ask.
This is something that came from low-level coding languages, like C for example. When we create a list in those languages, it allocates a linear slot in the memory, and points to the initial address of the allocated slot, like address 1024. And to access the desired element, we use the index because the number that represents the index is added to the initial address of the list, and with that, they find the memory position where the list was prior allocated. In our example, if the first element of N is allocated in memory 1024, to get the ‘E’ value, we should access 1024 + 4. You can see an example below, look how N allocated spaces from 1024 to 1031? The index starts from 0 and goes all the way up to the end of the allocated memory. This was used to optimize access to memory, and is very used for low-level languages until this day.
This is how lists work on behind the scenes. At the high level, you just need to remember that lists can be accessed by using index, they use a chunk of the memory to work, and they usually have methods to add items, filter by a condition, navigate/loop through it, and remove items.
Let’s see some examples in python.
If you have read the prior posts you may remember we talked about Big O notation. And you can check that with the loop and/or filter you have to navigate through all the elements inside the list at least once. For that reason, those kinds of codes are categorized as O(n), on the Big O notation. For the Big O, your code is categorized by its worst case, and the worst case in this scenario is to iterate through all element.
To optimize those kinds of routines, we have some extensions of the base list, which implements some of these rules:
- Rule of insertion
- Rule of deletion
- Rule of management
Let’s check some of those extensions below.
FIFO (First In, First Out)
Also called queue, it’s a structure that implements two rules: 1) To insert an element to the queue it must be added at the end of the list, and 2) To remove an element, you must remove the first element that entered the list.
As an example:
They are used a lot in applications that have to process data in the order that it was input, like a cart in an E-commerce app or a service queue.
FILO (First In, Last Out)
Also called stack, like the last example, it implements two rules: 1) To insert an element it must be added at the end of the list, and 2) To remove an element you must remove the last element that entered the list.
Let’s see an example below.
Stacks are typically used to deal with call stacks in coding languages.
Sorted list: What is that?
Sorted lists is one that implements a rule o management.
As the name suggests, to create a sorted list, we need to run a routine after the addition of an element. This routine sorts elements following some rule, like the element with bigger value will be the last in the list, or any order that has a consistent rule.This may seem to be something strange because it increases the time complexity of the add method. For example:
The reason is to optimize the search method. On normal lists, to search an element you have to navigate though the role list as we saw on the previous examples. This is usually very slow. If we need to search more than add, this is a great solution. So, it’s worth it to increase the time complexity of the add method, in exchange to have less time complexity on searches.
Let’s see how the search can be optimized, by learning a great algorithm, the Binary Search.
Sorted list: How to search with Binary search?
Binary search, is an optimized algorithm to search elements, and it only works for sorted structures. It works by dividing the list into to portions. You check the middle element of a portion and if the element is found then you return it. If not, you check if the element is bigger than the middle element. If so, you repeat the function by creating a portion on the right side of the middle element, and if it is smaller than the middle element, you look on the left side.
See the code below for an implementation example.
Here is a GIF comparison of how that’s work.
Binary search is very optimized, reducing the steps to find an element a lot. If your system uses search more than it add elements, consider this algorithm.
In this post we covered lists and algorithms to add, remove and sort data.
On the next post, we will explore about trees and will learn how to do binary search inside trees, sort trees, and what are balanced trees.
Want to add something for the discussion? Please leave a comment.