Facts about Strings and Characters in Swift

Shashank Mishra
Published in Mac O’Clock
4 min read · Oct 12, 2020

While going through the Swift documentation, I noted down a few interesting and important facts about Strings and Characters. Assuming you already know the basic terminology, these facts should add something to your knowledge bucket.

#1. Swift strings can’t be indexed by integer values directly because a single Swift Character might be composed of one, two, or even more Unicode code points.

let string = "With\u{1F496}"
print(string)
print(string.count)
Output
With💖
5
Note: \u{1F496} is counted as 1 character

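String.Index represents the position of each Character in a string; you can think of it as playing the same role as an integer index into an array. Under the hood, each String must compute the indexes of its own Characters, because different Characters can occupy different amounts of memory.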
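As a minimal sketch of working with String.Index (the variable names are my own, purely for illustration), the standard index(_:offsetBy:) method walks from a known index to the position you want, and distance(from:to:) converts an index back into an integer offset:

```swift
let greeting = "With\u{1F496}"

// Walk 4 Characters forward from the start to reach the emoji.
let emojiIndex = greeting.index(greeting.startIndex, offsetBy: 4)
let emoji = greeting[emojiIndex]

// Convert the index back into an integer offset when needed.
let offset = greeting.distance(from: greeting.startIndex, to: emojiIndex)

print(emoji)   // 💖
print(offset)  // 4
```

Both calls have to walk the string, so they are not constant-time the way an array subscript is.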

*An extended grapheme cluster is a sequence of one or more Unicode scalars that, when combined, produce a single human-readable character. This means that different characters, and different representations of the same character, can require different amounts of memory to store.

For example, 🇺🇸 is composed of two Unicode scalars (\u{1F1FA} and \u{1F1F8}) and takes 8 bytes in UTF-8. However, a sequence of Unicode scalars that represents a single extended grapheme cluster is counted as a single Character in Swift.

Example #1
let string = "US flag is 🇺🇸"
print(string.count)
Output
12
Example #2
let string = "US flag is \u{1F1FA}\u{1F1F8}"
print(string.count)
Output
12
Note: The Unicode representation of 🇺🇸 is \u{1F1FA}\u{1F1F8}
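To see how the same flag measures up in each representation, here is a small sketch (the constant name flag is illustrative) that compares the standard count, unicodeScalars, utf16 and utf8 views of a String:

```swift
let flag = "\u{1F1FA}\u{1F1F8}"   // 🇺🇸

print(flag.count)                 // 1 (extended grapheme clusters)
print(flag.unicodeScalars.count)  // 2 (Unicode scalars)
print(flag.utf16.count)           // 4 (16-bit code units)
print(flag.utf8.count)            // 8 (bytes in UTF-8)
```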

#2. The count property of a String isn’t always the same as the length property of an NSString that contains the same characters.

import Foundation

let string = "With\u{1F496}"
print(string)
print(string.count)
Output
With💖
5
let nsString: NSString = "With\u{1F496}"
print(nsString)
print(nsString.length)
Output
With💖
6

It is because the length of an NSString is based on the number of 16-bit code units within the string’s UTF-16 representation and not the number of Unicode extended grapheme clusters within the string.
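A quick way to check this yourself is to compare the utf16 view of a Swift String with the length of an equivalent NSString. This is only a sketch with illustrative names, and it builds the NSString with NSString(string:):

```swift
import Foundation

let swiftString = "With\u{1F496}"
let bridged = NSString(string: swiftString)

print(swiftString.count)        // 5 (extended grapheme clusters)
print(swiftString.utf16.count)  // 6 (UTF-16 code units)
print(bridged.length)           // 6 (matches the UTF-16 count)
```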

#3. NSString objects reside on the heap and are always passed by reference, whereas String is a value type: its value is copied whenever it is passed (to a function or method) or assigned (to a constant or variable).

String itself is a struct in Swift.

Behind the scenes, Swift’s compiler optimizes string usage so that actual copying takes place only when absolutely necessary. Until then, the copy shares the memory of the original string, and a separate copy in memory is created only when one of them is modified.
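The value semantics are easy to observe, even though the copy-on-write optimization itself stays invisible. A minimal sketch with illustrative names:

```swift
let original = "Hello"
var copy = original      // behaves as an independent value

// After this mutation the two values no longer share any storage.
copy.append(", world")

print(original)  // Hello
print(copy)      // Hello, world
```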

If you want to access NSString methods on a String without casting, import the Foundation framework in your file. Swift’s String type is bridged with Foundation’s NSString class.
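For instance, components(separatedBy:) and capitalized become available on String once Foundation is imported. The sample values below are made up for illustration:

```swift
import Foundation

let path = "notes/swift/strings.txt"

// Both members below come from Foundation, not from the Swift standard library.
let parts = path.components(separatedBy: "/")
let title = "swift strings".capitalized

print(parts)  // ["notes", "swift", "strings.txt"]
print(title)  // Swift Strings
```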

#4. If you want to include special characters in a string literal without invoking their effect, place the string between extended delimiters (#).

Example #1
let string = #"Include special character \"#
Output
Include special character \
Note: Without #, you need to use a double backslash (\\) to include \ in your string literal.
Example #2
let string = #"Evaluate values in string like \(expression)."#
Output
Evaluate values in string like \(expression).
Example #3
let string = #"\(6 * 7) is equal to \#(6 * 7)"#
(a mix of both, i.e. when part of the string also needs to be evaluated)
Output
\(6 * 7) is equal to 42
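Two more cases worth trying, sketched here with illustrative names: adding more pound signs lets the text itself contain "#, and escape sequences such as a line break need the same number of pound signs as the delimiter.

```swift
// Two pound signs let the literal contain "# without ending the string.
let quoted = ##"The delimiter "# is part of this text"##

// Inside extended delimiters, \#n is the line break escape.
let twoLines = #"Line 1\#nLine 2"#

print(quoted)    // The delimiter "# is part of this text
print(twoLines)  // prints Line 1 and Line 2 on separate lines
```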

#5. Multiline String Literals: The whitespace before the closing quotation marks (""") tells Swift what whitespace to ignore before all of the other lines.

You can include a double quotation mark (") inside a multiline string literal without escaping it, as in the example below.

To include the text """ in a multiline string, escape at least one of the quotation marks. You can also use extended delimiters (#).

let multilineString = """
First Line "\""
Second Line "
"""
Output
First Line """
Second Line "
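The indentation rule for the closing quotation marks is easier to see with indented lines. In this sketch (names are illustrative), the closing """ is indented by four spaces, so four spaces are stripped from every line and only the extra indentation on the second line survives:

```swift
let indented = """
    First
        Second
    Third
    """

print(indented)
// First
//     Second
// Third
```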

#6. Assigning an empty string literal to a variable and initializing a new String instance with initializer syntax are equivalent.

var emptyString = "" // empty string literal
same as
var anotherEmptyString = String() // initializer syntax
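A short sketch to confirm the two forms produce the same value, using the standard isEmpty property to test for emptiness:

```swift
let emptyString = ""
let anotherEmptyString = String()

print(emptyString == anotherEmptyString)  // true
print(emptyString.isEmpty)                // true
print(anotherEmptyString.count)           // 0
```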

#7. The endIndex property isn’t a valid argument to a string’s subscript because it is the position after the last character in a String, not the last character. If a String is empty, startIndex and endIndex are equal.

let string = "Hello"
string[string.endIndex] // Runtime error
string.index(after: string.endIndex) // Runtime error
let index = string.index(before: string.endIndex)
print(string[index]) // prints o
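If you want to avoid the runtime error altogether, one possible approach is a small helper built on the standard index(_:offsetBy:limitedBy:) method. The function name character(in:at:) is hypothetical, something I made up for this sketch:

```swift
// Hypothetical helper: returns nil instead of crashing for an invalid offset.
func character(in text: String, at offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index < text.endIndex   // endIndex itself is not a valid subscript argument
    else { return nil }
    return text[index]
}

let word = "Hello"
print(character(in: word, at: 4) as Any)   // Optional("o")
print(character(in: word, at: 5) as Any)   // nil

let empty = ""
print(empty.startIndex == empty.endIndex)  // true
```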

#9. Substrings (created using a subscript or a method like prefix(_:)) aren’t suitable for long-term storage. Because a substring reuses the storage of the original string, the entire original string must be kept in memory as long as any of its substrings are being used.
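When you do need to keep the result around, convert the Substring into a String, which copies the characters into storage of its own. A minimal sketch with illustrative names:

```swift
let sentence = "Hello, world"

// prefix(_:) returns a Substring that reuses the storage of `sentence`.
let head = sentence.prefix(5)

// Creating a String gives the text its own storage.
let stored = String(head)

print(type(of: head))    // Substring
print(type(of: stored))  // String
print(stored)            // Hello
```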


#10. Every comparison of a string or substring is based on extended grapheme clusters. Two String values are considered equal if they contain the same Unicode scalars or if their extended grapheme clusters are canonically equivalent.

let string1 = "US flag is \u{1F1FA}\u{1F1F8}"
let string2 = "US flag is 🇺🇸"
print(string1 == string2)
Output
true

Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they’re composed of different Unicode scalars behind the scenes.
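A classic case of canonical equivalence is é, which can be written as one precomposed scalar or as a plain e followed by a combining acute accent. The sketch below (names are illustrative) shows that the two strings compare as equal even though their scalar counts differ:

```swift
let precomposed = "caf\u{E9}"    // é as a single scalar
let decomposed = "cafe\u{301}"   // e followed by a combining acute accent

print(precomposed == decomposed)          // true
print(precomposed.count)                  // 4
print(decomposed.count)                   // 4
print(precomposed.unicodeScalars.count)   // 4
print(decomposed.unicodeScalars.count)    // 5
```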

Feel free to add your queries in the response section.

Thanks. Keep Learning :)

Reference: https://docs.swift.org/swift-book
