On Bags

The data structure, that is

Caleb Winston
Index Zero
5 min readSep 10, 2018

--

A bag found in the wild

There are many data structures out there with various complex implementations. Each one has its advantages and disadvantages. Of all data structures, however, there’s one that’s probably the simplest way of storing data and super easy to implement — the bag. That’s what we’ll be focusing on in this article. Let’s go 🚀

But what’s a data structure?

So first of all, what exactly are data structures. If you take a look at the Wikipedia page for data structure, this is the definition you get —

A data structure is a data organization, management and storage format that enables efficient access and modification. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

If this seems a bit complex, here’s a slightly simpler definition that I think can work well for our purposes —

A data structure is a format for structures that store data

That’s really all a data structure is. A simple example of one is a string. Strings are structures that store data — more specifically, they store characters in a linear seqence. In Java, we can create a string as follows —

Let’s say we want a data structure that stores a linear sequence of anything — characters, integers, booleans, even strings. We call these kind of data structures generic. They are data structures that can store data of any type.

There are many linear data structures that store sequences of stuff, but we’re here to talk about a data structure that stores stuff in no particular sequence.

That’s right — we’re talking about bags!

A representation of an empty bag

What are bags?

Most implementations of the bag data structure work quite simply.

Once you create one, pretty much all you can do is add an element, check if an element is in the bag, find the size, and check if the bag is empty.

Here’s what adding would look like in our visual representation —

Adding a new element to the bag

Here’s what our bag might look like after several additions —

The same bag with more elements added

As you can see, these elements don’t keep any order. Unlike strings, for example, bags don’t store elements of data in a sequence.

So there are really only two things bags need to keep track of for the four operations mentioned above to be possible.

Everything a bag implementation needs to keep track of and be able to do

How to implement bags

So now that you know what bags are supposed to keep track of and what they are supposed to be able to do, let’s take a look at how we would go about implementing one.

First things first — we need to keep track of the stuff in the bag, and the size of the bag. As it turns out, a simple array will work fine for internally keeping track of what’s in our bag and we can use an integer for its size.

What we can use to store the stuff we need to keep track of

Now to finish our implementation of the bag data structure, we just have four operations we need to implement.

  • Add an element to the bag
  • Check if an element is in the bag
  • Get the size of the bag
  • Check if the bag is empty

At this point, the implementations of three of these operations should be pretty self-explanatory.

  • To check if an element is in the bag, we can just iterate over our array with a for loop and check if the element matches any of the elements we iterate through.
  • To get the size of the bag, we simply have to return the size variable we are keeping track of.
  • To check if the bag is empty, all we have to do is check if the size variable is 0 or not.

But how do we add an element?

It might seem that all we have to do is insert the element into the array at position size . When size is 0, we insert at position 0. When size is 1, we insert at position 1. (Note that we are assuming that indices start from 0 here; if they were to start from 1 we would insert at position size + 1 )

4 elements inserted into our array

But here’s our problem with this implementation of the add operations — what happens when we try to insert three or more elements here?

The array won’t have space for them. It’s only of size 6. It can only hold data for 6 elements in the bag and no more.

But that’s not how bags are supposed to work! Bags are supposed to allow you to add in as many elements as you want — not just 6.

So how can we solve this? We can’t just have a really huge array and hope that someone using our bag doesn’t need that more space.

A simple solution is to resize our array whenever we run out of space.

Resizing our internal array when its full

We can simply create a new array twice the size (or 4x size or any factor you want) and then copy over everything from the original array to our new resized array.

And that’s really all there is to the bag data structure. 👏 Bags are one of the simplest data structures out there and now you know how to implement them! 👍

(If you want to take a look at a coded implementation for one, here is one I did in forty lines (!) in the Lua programming language)

--

--

Caleb Winston
Index Zero

Building efficient programming systems for cloud and edge computing (calebwin.github.io)