Write your own C++ STL string class
C++ is an interesting language with several subtle features that are good to know and can help in understanding the underlying implementation of your favorite STL classes. One of the most common classes used from STL is the ‘string’. You would have certainly used #include <string>, but have you pondered on what it takes to write your own string class that works just like the STL string?
So, let us write our own string class. First and foremost, we need our class body. I am gonna call it my_string. Then, it must have a char buffer to hold our data and a size variable to hold the length of the string. As we proceed with our requirements, we’ll evolve our my_string class. So it looks like this:-
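A minimal sketch of that starting point is shown here. The member names `buf` and `size` and the accessors `length()` and `c_str()` are assumptions made for illustration, and `skeleton_demo` is a hypothetical helper that only exercises the class.

```cpp
#include <cassert>
#include <cstddef>

class my_string {
public:
    // Empty string: no allocation yet.
    my_string() : buf(nullptr), size(0) {}

    // Release the buffer; delete[] on a null pointer is a no-op.
    ~my_string() { delete[] buf; }

    // Copying is disabled in this skeleton: a member-wise copy of a raw
    // owning pointer would make two objects free the same buffer.
    my_string(const my_string&) = delete;
    my_string& operator=(const my_string&) = delete;

    std::size_t length() const { return size; }
    const char* c_str() const { return buf ? buf : ""; }

private:
    char* buf;          // heap buffer holding the characters (assumed name)
    std::size_t size;   // number of characters, excluding the terminator
};

// Hypothetical helper: a default-constructed object is an empty string.
inline void skeleton_demo() {
    my_string s;
    assert(s.length() == 0);
    assert(s.c_str()[0] == '\0');
}
```

Copying is deliberately switched off at this stage so that the skeleton cannot be misused before proper copy operations exist.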
Now, we need a few more functions to support different ways to create a new my_string object. Below are a normal constructor, a copy constructor and a copy assignment operator function:-
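One possible shape for these three functions is sketched here, again assuming the illustrative members `buf` and `size`. The copy-and-swap style used in the assignment operator is a design choice of this sketch, not something the text prescribes, and `copy_demo` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class my_string {
public:
    my_string() : buf(nullptr), size(0) {}

    // Normal constructor: deep-copies a null-terminated C string.
    my_string(const char* src) : buf(nullptr), size(0) {
        if (src != nullptr) {
            size = std::strlen(src);
            buf = new char[size + 1];
            std::memcpy(buf, src, size + 1);   // includes the '\0'
        }
    }

    // Copy constructor: allocate a fresh buffer and duplicate the contents.
    my_string(const my_string& other) : buf(nullptr), size(other.size) {
        if (other.buf != nullptr) {
            buf = new char[size + 1];
            std::memcpy(buf, other.buf, size + 1);
        }
    }

    // Copy assignment: build the copy first, then swap it in. If the
    // allocation throws, *this is left unchanged; tmp frees the old buffer.
    my_string& operator=(const my_string& other) {
        if (this != &other) {
            my_string tmp(other);
            std::swap(buf, tmp.buf);
            std::swap(size, tmp.size);
        }
        return *this;
    }

    ~my_string() { delete[] buf; }

    std::size_t length() const { return size; }
    const char* c_str() const { return buf ? buf : ""; }

private:
    char* buf;
    std::size_t size;
};

// Hypothetical helper: every copy owns its own buffer.
inline void copy_demo() {
    my_string a("hello");
    my_string b(a);   // copy constructor
    my_string c;
    c = a;            // copy assignment
    assert(std::strcmp(b.c_str(), "hello") == 0);
    assert(std::strcmp(c.c_str(), "hello") == 0);
    assert(b.c_str() != a.c_str());
}
```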
Cool, our class now supports creating an object by passing a char buffer or an object of similar type, and also the assignment of a similar type object.
While these constructors and functions cover the minimum needed to use a string class, there are scenarios in which copying the buffer is wasteful, especially when the buffer is large and the source object will not outlive its current scope. E.g. let’s say we have a function that creates a my_string object with some big content in its character buffer and returns it by value. With only a copy constructor available, and unless the compiler elides the copy, the receiving my_string variable has to duplicate the buffer of the returned object, which is then destroyed as soon as the call completes.
my_string getAString()
{
    my_string str("this is a very very long string");
    return str;
} // object named 'str' will die here as the scope ends

my_string a = getAString();
// with only a copy constructor (and no copy elision), this copies from the returned object
This is where we can leverage move semantics, introduced in C++11. Move semantics let one object take over another object’s resources (here, the heap buffer) instead of duplicating them. So, we can have a move constructor and a move assignment operator function like this:-
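A minimal sketch of the two move operations, using the parameter name ‘dyingObj’ from the text and the assumed members `buf` and `size`. Only the pieces needed to show the moves are included; `move_demo` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class my_string {
public:
    my_string(const char* src) : buf(nullptr), size(0) {
        if (src != nullptr) {
            size = std::strlen(src);
            buf = new char[size + 1];
            std::memcpy(buf, src, size + 1);
        }
    }

    // Move constructor: take over the dying object's buffer.
    my_string(my_string&& dyingObj) noexcept
        : buf(dyingObj.buf), size(dyingObj.size) {
        dyingObj.buf = nullptr;   // its destructor must not free our buffer
        dyingObj.size = 0;
    }

    // Move assignment: release our own buffer, then take over the other one.
    my_string& operator=(my_string&& dyingObj) noexcept {
        if (this != &dyingObj) {
            delete[] buf;
            buf = dyingObj.buf;
            size = dyingObj.size;
            dyingObj.buf = nullptr;
            dyingObj.size = 0;
        }
        return *this;
    }

    // Note: declaring the move operations suppresses the implicit copy
    // operations, so this reduced class is move-only.
    ~my_string() { delete[] buf; }

    std::size_t length() const { return size; }
    const char* c_str() const { return buf ? buf : ""; }

private:
    char* buf;
    std::size_t size;
};

// Hypothetical helper: after the move, 'b' holds the very same buffer.
inline void move_demo() {
    my_string a("this is a very very long string");
    const char* original = a.c_str();
    my_string b(std::move(a));
    assert(b.c_str() == original);   // pointer handed over, nothing copied
    assert(a.length() == 0);         // 'a' is left empty but valid
}
```

Resetting the source pointer to `nullptr` is what makes the hand-over safe: the destructor of the moved-from object still runs, and it must find nothing to free.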
Notice the double ampersand ‘&&’ and the name of the object, ‘dyingObj’. The ‘&&’ declares an rvalue reference, which binds to temporaries and to objects explicitly passed through std::move. The name ‘dyingObj’ is intentional: it indicates that the passed object is about to die and that its contents will no longer be needed after the move.
Now, we can add concatenation functionality by overloading the ‘+’ operator. See below:-
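A sketch of such an operator, written here as a member function that builds a local object named ‘s’. The members `buf` and `size` are the same illustrative assumptions as before, and `concat_demo` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class my_string {
public:
    my_string() : buf(nullptr), size(0) {}

    my_string(const char* src) : buf(nullptr), size(0) {
        if (src != nullptr) {
            size = std::strlen(src);
            buf = new char[size + 1];
            std::memcpy(buf, src, size + 1);
        }
    }

    my_string(my_string&& dyingObj) noexcept
        : buf(dyingObj.buf), size(dyingObj.size) {
        dyingObj.buf = nullptr;
        dyingObj.size = 0;
    }

    ~my_string() { delete[] buf; }

    // Concatenation: copy both operands into a new object 's'.
    my_string operator+(const my_string& rhs) const {
        my_string s;
        s.buf = new char[size + rhs.size + 1];
        s.size = size + rhs.size;
        if (size != 0) std::memcpy(s.buf, buf, size);
        if (rhs.size != 0) std::memcpy(s.buf + size, rhs.buf, rhs.size);
        s.buf[s.size] = '\0';
        return s;   // moved (or elided) into the caller's object
    }

    std::size_t length() const { return size; }
    const char* c_str() const { return buf ? buf : ""; }

private:
    char* buf;
    std::size_t size;
};

// Hypothetical helper mirroring the FirstName/LastName example.
inline void concat_demo() {
    my_string a("FirstName");
    my_string b("LastName");
    my_string c = a + b;
    assert(std::strcmp(c.c_str(), "FirstNameLastName") == 0);
    assert(c.length() == a.length() + b.length());
}
```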
Inside the function, a concatenated string named ‘s’ is created, and it dies as soon as the function scope ends. Since our class supports move semantics, returning ‘s’ by value invokes the move constructor (or the compiler elides the operation altogether), so the content of ‘s’ ends up in the receiving object without the buffer being copied.
my_string a("FirstName");
my_string b("LastName");
my_string c = a + b;
As you can see, inside the concatenation function, a new my_string object ‘s’ is created that copies the contents of objects ‘a’ and ‘b’ into itself. In the absence of a move constructor, the contents of ‘a’ and ‘b’ would first be copied into ‘s’, and when the function returned, the entire content of ‘s’ would then be copied into the receiving object ‘c’, unless the compiler happened to elide that copy. That is a double copy, which is not optimal w.r.t. performance. Returning ‘s’ by value would still be correct, but the usual ways to avoid the second copy, such as returning a dynamically allocated (long-lived) object and taking care of deallocating it later, bring a lot of extra things to worry about. With a move constructor, the buffer of ‘s’ is simply handed over to ‘c’. Such is the power of ‘move semantics’.
Below is the complete implementation of the my_string class. I have overloaded the stream insertion operator (operator<<) to support outputting our string to the console using std::cout.
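One way the pieces can fit together is sketched here. It is not a definitive implementation: the members `buf` and `size`, the accessors, and the helper `full_demo` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <utility>

class my_string {
public:
    my_string() : buf(nullptr), size(0) {}

    // Normal constructor: deep-copies a null-terminated C string.
    my_string(const char* src) : buf(nullptr), size(0) {
        if (src != nullptr) {
            size = std::strlen(src);
            buf = new char[size + 1];
            std::memcpy(buf, src, size + 1);
        }
    }

    // Copy constructor: delegates to the constructor above, since
    // other.buf is either null or null-terminated.
    my_string(const my_string& other) : my_string(other.buf) {}

    // Move constructor: take over the dying object's buffer.
    my_string(my_string&& dyingObj) noexcept
        : buf(dyingObj.buf), size(dyingObj.size) {
        dyingObj.buf = nullptr;
        dyingObj.size = 0;
    }

    // Copy assignment: copy first (may throw), then move the copy in.
    my_string& operator=(const my_string& other) {
        if (this != &other) {
            my_string tmp(other);
            *this = std::move(tmp);
        }
        return *this;
    }

    // Move assignment: release our buffer, then take over the other one.
    my_string& operator=(my_string&& dyingObj) noexcept {
        if (this != &dyingObj) {
            delete[] buf;
            buf = dyingObj.buf;
            size = dyingObj.size;
            dyingObj.buf = nullptr;
            dyingObj.size = 0;
        }
        return *this;
    }

    ~my_string() { delete[] buf; }

    // Concatenation: the local 's' is moved (or elided) on return.
    my_string operator+(const my_string& rhs) const {
        my_string s;
        s.buf = new char[size + rhs.size + 1];
        s.size = size + rhs.size;
        if (size != 0) std::memcpy(s.buf, buf, size);
        if (rhs.size != 0) std::memcpy(s.buf + size, rhs.buf, rhs.size);
        s.buf[s.size] = '\0';
        return s;
    }

    std::size_t length() const { return size; }
    const char* c_str() const { return buf ? buf : ""; }

    // Stream insertion, so that std::cout << str works.
    friend std::ostream& operator<<(std::ostream& os, const my_string& str) {
        return os << str.c_str();
    }

private:
    char* buf;
    std::size_t size;
};

// Hypothetical helper exercising construction, concatenation and output.
inline void full_demo() {
    my_string a("FirstName");
    my_string b("LastName");
    my_string c = a + b;   // move construction (or elision)
    my_string d;
    d = c;                 // copy assignment
    std::cout << d << '\n';
    assert(std::strcmp(d.c_str(), "FirstNameLastName") == 0);
    assert(d.c_str() != c.c_str());
}
```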
P.S. This is not a thorough implementation of the string class. It is only intended to show readers how to write a string class on their own and how to make use of C++ move semantics. The reader may add other functions, such as index-based access, to this class.