On Bags
The data structure, that is
There are many data structures out there with various complex implementations. Each one has its advantages and disadvantages. Of all data structures, however, there’s one that’s probably the simplest way of storing data and super easy to implement — the bag. That’s what we’ll be focusing on in this article. Let’s go 🚀
But what’s a data structure?
So first of all, what exactly are data structures. If you take a look at the Wikipedia page for data structure, this is the definition you get —
A data structure is a data organization, management and storage format that enables efficient access and modification. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.
If this seems a bit complex, here’s a slightly simpler definition that I think can work well for our purposes —
A data structure is a format for structures that store data
That’s really all a data structure is. A simple example of one is a string. Strings are structures that store data — more specifically, they store characters in a linear seqence. In Java, we can create a string as follows —
String s = new String("Hello World!");
// or
String s = "Hello World!";
Let’s say we want a data structure that stores a linear sequence of anything — characters, integers, booleans, even strings. We call these kind of data structures generic. They are data structures that can store data of any type.
There are many linear data structures that store sequences of stuff, but we’re here to talk about a data structure that stores stuff in no particular sequence.
That’s right — we’re talking about bags!
What are bags?
Most implementations of the bag data structure work quite simply.
Once you create one, pretty much all you can do is add an element, check if an element is in the bag, find the size, and check if the bag is empty.
Here’s what adding would look like in our visual representation —
Here’s what our bag might look like after several additions —
As you can see, these elements don’t keep any order. Unlike strings, for example, bags don’t store elements of data in a sequence.
So there are really only two things bags need to keep track of for the four operations mentioned above to be possible.
How to implement bags
So now that you know what bags are supposed to keep track of and what they are supposed to be able to do, let’s take a look at how we would go about implementing one.
First things first — we need to keep track of the stuff in the bag, and the size of the bag. As it turns out, a simple array will work fine for internally keeping track of what’s in our bag and we can use an integer for its size.
Now to finish our implementation of the bag data structure, we just have four operations we need to implement.
- Add an element to the bag
- Check if an element is in the bag
- Get the size of the bag
- Check if the bag is empty
At this point, the implementations of three of these operations should be pretty self-explanatory.
- To check if an element is in the bag, we can just iterate over our array with a for loop and check if the element matches any of the elements we iterate through.
- To get the size of the bag, we simply have to return the size variable we are keeping track of.
- To check if the bag is empty, all we have to do is check if the size variable is 0 or not.
But how do we add an element?
It might seem that all we have to do is insert the element into the array at position size
. When size is 0, we insert at position 0. When size is 1, we insert at position 1. (Note that we are assuming that indices start from 0 here; if they were to start from 1 we would insert at position size + 1
)
But here’s our problem with this implementation of the add operations — what happens when we try to insert three or more elements here?
The array won’t have space for them. It’s only of size 6. It can only hold data for 6 elements in the bag and no more.
But that’s not how bags are supposed to work! Bags are supposed to allow you to add in as many elements as you want — not just 6.
So how can we solve this? We can’t just have a really huge array and hope that someone using our bag doesn’t need that more space.
A simple solution is to resize our array whenever we run out of space.
We can simply create a new array twice the size (or 4x size or any factor you want) and then copy over everything from the original array to our new resized array.
And that’s really all there is to the bag data structure. 👏 Bags are one of the simplest data structures out there and now you know how to implement them! 👍
(If you want to take a look at a coded implementation for one, here is one I did in forty lines (!) in the Lua programming language)