Introduction to Data Structures

Aakash Varshney
The Startup
Published in
8 min readJan 17, 2018

What is a data structure?

Upcoming sophomores majoring in Computer Engineering, Software Engineering, or Computer Science who have signed up for Intro to Data Structures class often ask the same question: What the heck is data structure?

The simplest way to answer their question is by describing data structures as a way of organizing and storing data so that operations can be performed efficiently. This leads us to the question, what kind of operations are we talking about? Accessing, inserting, deleting, finding, and sorting the data are some of the basic operations that one can perform using data structures. Not all data structures can perform these operations efficiently, that’s what led to the development of different data structures. Let’s say you are to find a specific book in an unorganized library, that task would take an enormous amount of time. Just like a library organizes its books, we need to organize our data so that operations can be performed efficiently.

If you have previous experience with programming, you might know what a pre-defined data type is. A pre-defined data type such as Integer, Strings, Boolean is used to create the structure of data. The use of data type is based on the user requirement and the kind of data they want to store. Now let’s dive into the data structures that are used in our day to day programming, and we will also look into the support for the data structures for different programming languages.

One Dimensional Arrays

A basic data structure that one uses in a day to day programming is an array. An array can hold a fixed number of containers to store data and operations could be performed on that data according to the user’s needs.

In the picture above, an array is defined with the name arrayName and has an integer data type. The number below in the bold shows the memory address each container of an array is assigned to. The important thing here to remember is an array index always starts at 0 and ends at (total array size -1). So, let’s just say you defined an array of length 5, arrayName[5], then the indexes of this array would be arrayName[0], arrayName[1], arrayName[2], arrayName[3], arrayName[4].

Multidimensional Arrays

A multidimensional array is just an extension of the normal array, where we define an array with a row of containers. In the case of a multidimensional array, we have rows as well as columns. In order to access an index, we need two numbers to get there.

In the picture above, we defined an arrayName[3][4] with 3 rows and 4 columns and in a total of Row * Column (3*4) indexes. Multidimensional arrays are sometimes referred to as tables or matrices because of their structure.

Dynamic Arrays

Most of the programming languages allow you to define dynamic arrays, which means the memory is allocated during program execution. Consider it this way, you defined an array with 10 indexes but, you only need 3 indexes, thus the remaining 7 indexes are laid to waste, thereby consuming extra memory. Dynamic arrays give you the flexibility to allocate memory according to the program requirement.

Support for Arrays for different programming languages:

Java:- There are a couple of classes for arrays, but the best-known one is ArrayList class.

C++:- Vectors are used to represent the dynamic arrays in C++. It can be implemented using std::vector.

Python:- Python has a new data type known as list just like Boolean and Integers. The list data type can be used to dynamically define arrays.

Singly Linked List

A linked list is a collection of nodes that are connected by links. The linked list contains a node which stores the data items and the address to the next node. The first node is usually referred to as the head node and the last node is referred to as the tail node. The pointer of the head node points to the next node and the pointer of the tail node points to Null.

The dynamics of this data structure makes it easier to add or remove nodes. In order to add/remove a node, you just need to keep track of the previous node and the node after and adjust the pointers accordingly.

Doubly Linked List

A doubly linked list is not much different from a singly linked list, the only thing that sets them apart is the pointer to the previous node. The picture below can give you a brief idea of what the structure should look like.

In the case of a doubly-linked list, the previous pointer of the head node points to Null, and the next pointer of the tail points to Null. The previous pointer makes it easier to traverse in either direction. So, node addition and removal become super easy, all you need to do is keep track of the previous node and the next node and adjust the pointer accordingly.

Circular Linked List

A circular linked list is a linked list where all the nodes are connected to form a circle. There is no head or tail node. The advantage of having this type of linked list is any node can be used as a starting point.

Comparison Between Arrays and Linked List

A list is a different type of data structure from an array. Instead of storing data in chunks of memory, the array requires a contiguous memory space. If you add an element to an array, it might change the memory location of the entire array, but the data remains intact. An array gives you the index number, therefore, you can access it directly or sequentially. Whereas to access any data member you must traverse through the entire list.

Support for Linked Lists for different programming languages:

Java:- Java has an inbuilt class called LinkedList which can be used to implement a linked list.

C++:- Like Java, C++ has also a Standard Template Library called list for linked list implementation.

Python:- Python also has an inbuilt module llist which implements linked lists.

Stack

You have probably heard this word in your day to day life, whether it is referring to a deck of cards or books lying on top of a desk. Well, the data structure works in a similar manner as you would perform actions on a real-world stack. For example, you can only add or remove books from the top of the stack.

A stack is a LIFO data structure which means Last In First Out, the last item to come is the first one to go out. A stack can either be implemented using an array or a linked list. Consider a stack of books, the last book would obviously be placed at the top and if you want to remove a book you will remove one from the top. The three basic operations a stack could perform are:

· Push — to input data into a stack and place it on top.

· Pop — to remove the data from the top.

· Peek — to look at the data at the top without removing it.

The picture below shows the effect of these operations on the structure

Support for Linked Lists for different programming languages:

Java:- Java has an inbuilt class called Stack which can be used to implement stack data structures.

C++:- Like Java, C++ has also a Standard Template Library called stack for linked list implementation.

Python:- Python does not have an explicit stack class, but lists can be implemented as stacks.

Queue

Just like the word stack, the word queue is also derived from day to day activities. You have most likely seen a queue of people in a supermarket at the billing counter where the last one to come stands at the end and the first one to come is the first one to get their groceries to check out. In a similar manner, queue works where the operations are performed at both ends.

A queue is a FIFO data structure that elaborates to First In First Out. A queue, like a stack, can either be implemented using arrays or a linked list. The three most commonly performed operations on a queue are:

· enqueue — to input data into a stack and place them at the bottom.

· dequeue — to remove the data from the top.

· Peek — to look at the data at the top without removing it.

Support for Queues for different programming languages:

Java:- Queue can be implemented using List class in Java

C++:- Like Java, C++ has also a Standard Template Library called Queue for queue implementation.

Python:- The queue module in python offers the functionality of the queue class. Since queues can be implemented using a linked list, therefore, list can also be used.

Trees

The tree data structure resembles an upside tree, at the top we have a root node which is divided into a left and a right node. They have links between them, which connect all the nodes. Unlike the linked list where a node can be connected to only one node, a tree could have a node connected to two or more nodes.

Binary Search Trees

Binary search trees are special kinds of trees where the left node is always smaller than the parent node and the right node is always bigger than the parent node. In order to add/delete a node, we have to compare the value with the root node and then traverse to a specific point to insert/delete.

Support for Trees for different programming languages

Java:- Java doesn’t have any built-in class to implement a tree data structure.

C++:- Just like Java, C++ also doesn’t have any container in the standard template library to implement trees.

Python:- Python doesn’t have built-in data structures so in order to implement a tree you have to create the structure manually.

Conclusion

There are some points you need to consider while deciding which data structure satisfies your need. Unless you have a really big set of data, it doesn’t matter which data structure you use. The questions you need to ask yourself before implementing a data structure is:

  • How much data do you need to store?
  • How often do you access data?
  • What is the purpose of the data? Do you need to access, insert, delete, or sort the data?

The algorithms work differently with different data structures and across different programming languages, so you need to understand the respective syntax of the language before implementing a data structure.

This story is published in The Startup, Medium’s largest entrepreneurship publication followed by 295,232+ people.

Subscribe to receive our top stories here.

--

--