Swift: Strings and Characters

TL;DR: In Swift, NSString gives way to the String and Character types

Once you’ve made the jump to Swift you will quickly find that many of the common types and classes you used with Objective-C are very different. For any programming language, one of the most important types is the one that represents a string. Traditionally, the basic structure of a string was an array of characters. These arrays simply held integer values, each of which lined up with an ASCII character code. There were 26 codes for lowercase letters, another 26 for uppercase, ten for digits, several for whitespace and punctuation, plus a set of control codes, and that was all you needed. Altogether the ASCII table is made up of 128 characters. It is quite compact, and 7 bits is enough to represent all of them.
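To make the idea of "integer values lined up with ASCII codes" concrete, here is a small sketch in Swift. The sample word is arbitrary and chosen only for illustration.

```swift
// Each ASCII character corresponds to one small integer code.
let word = "Swift"

// utf8 exposes the underlying bytes; for pure ASCII text there is one byte per character.
let codes = Array(word.utf8)
print(codes) // [83, 119, 105, 102, 116]

// Character.asciiValue is nil for anything outside the ASCII range.
let letter: Character = "A"
if let code = letter.asciiValue {
    print(code) // 65
}

// Every ASCII code fits in 7 bits, so it is always below 128.
print(codes.allSatisfy { $0 < 128 }) // true
```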

Eventually more characters were needed. The early extended character sets included non-English characters to support some languages, but not all of them. In the late 80’s and early 90’s, work started on the Unicode standard, which defined a character table that increased the size of a character from the 7 bits of ASCII to 16 bits. Unicode has since grown beyond 16 bits and now defines code points up to U+10FFFF. It was no longer possible to represent every character with a single byte, and strings had to change as well.

Recently the character set supported on modern computers and mobile devices has expanded to include emojis, with more being defined all the time as part of a growing collection. When email and text messaging became mainstream, people simply invented emoticons to relay a simple emotion, or used an abbreviation like LOL or LMAO. Now it is clear that most people enjoy using emojis for the same purpose, with a bit more personality. Generally we don’t have to think about any of this until it becomes necessary to dig into the details of strings.

Being a modern language, Swift supports Unicode and emojis as a natural part of the language. Moving away from NSString, it can be difficult to adapt to the new construct. Instead of a simple array of numerical lookup codes addressed by integer position, a Swift string is a collection of characters that is traversed in either direction, much as you would walk a doubly linked list, although it is not actually stored as one. Each position is represented by an index of type String.Index, which can be used to access a character. It may seem complicated. A simple example can illustrate how it works.
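As a minimal sketch of that traversal (the sample string and the variable names are illustrative only), the loop below walks a string from startIndex to endIndex one character at a time:

```swift
let greeting = "Hello"

// Start at the first position and step forward until reaching endIndex.
var index = greeting.startIndex
var visited: [Character] = []

while index != greeting.endIndex {
    // Strings are subscripted with a String.Index, not an Int.
    visited.append(greeting[index])
    // Move one character forward.
    index = greeting.index(index, offsetBy: 1)
}

print(visited) // ["H", "e", "l", "l", "o"]
```

In everyday code a plain `for character in greeting` loop does the same job; the explicit index is shown here only to make the mechanics visible.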

The Gist above shows how to traverse the characters of the new String type in Swift. A string has a couple of properties named startIndex and endIndex, which mark the two ends of the sequence of characters. An index can be advanced with the index(_:offsetBy:) function and an offset value until it reaches endIndex. It is important to know that endIndex cannot be used to access a character, because it is the position just past the last character rather than the position of the last character itself. With classic C strings a null terminator marks the end of the character array. The endIndex property works in a similar way: it tells you where the string stops but holds no character of its own.
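The behavior of endIndex is easier to see in code. The following is a short sketch with an arbitrary sample word. It assumes a non-empty string, since asking for the index before endIndex on an empty string is a runtime error.

```swift
let word = "Swift"

// endIndex is the position just past the last character, so it cannot be subscripted.
// The last character lives one step before it.
let lastIndex = word.index(before: word.endIndex)
print(word[lastIndex]) // t

// Jump several characters at once from startIndex.
let third = word.index(word.startIndex, offsetBy: 2)
print(word[third]) // i

// index(_:offsetBy:limitedBy:) returns nil instead of trapping
// when the offset would run past the given limit.
let tooFar = word.index(word.startIndex, offsetBy: 10, limitedBy: word.endIndex)
print(tooFar == nil) // true
```

The limitedBy variant is a reasonable default whenever the offset comes from user input or another source you do not control.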

Alphabets and Emojis

It may appear that Swift has made strings more complex, but what it is really doing is supporting characters which are represented very differently. The Gist below shows in more detail how differently the String and Character types work in Swift. For ASCII characters it is still simple, but with emojis you will find that a single “character” may actually be represented by a sequence of integer values, known as Unicode scalars. Place the code below into a Swift Playground and see the output for yourself.
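Here is one way such a comparison might look. This is a sketch, not the original Gist: the flag emoji is an arbitrary choice and is written with Unicode escapes so the two scalars behind it are visible in the source.

```swift
let plain: Character = "a"
// The flag of Japan: one visible character built from two regional indicator scalars.
let flag: Character = "\u{1F1EF}\u{1F1F5}"

// An ASCII character is a single scalar; the flag is a sequence of two.
print(plain.unicodeScalars.map { $0.value }) // [97]
print(flag.unicodeScalars.map { $0.value })  // [127471, 127477]

let text = "a\u{1F1EF}\u{1F1F5}"
print(text.count)                // 2 (characters as a person would count them)
print(text.unicodeScalars.count) // 3 (Unicode scalars)
print(text.utf16.count)          // 5 (UTF-16 code units, the unit NSString length counts in)
print(text.utf8.count)           // 9 (bytes in UTF-8)
```

The same string reports four different sizes depending on which view you ask, which is why Swift does not let you subscript a String with a plain integer.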

As you experiment with the code in the Playground you will find there are new techniques for working with strings and characters which are quite different from NSString and C strings.

