Strings

Christopher Pitt
Aug 15, 2014


Strings are hard for new programmers to understand. Try explaining to a new programmer the difference between words that tell the program what to do, and words the program does things with. Often these concepts interact, and there are many nuanced ways in which strings can be shown and changed…

Strings vs. Keywords vs. Variables

Programming languages are built on the ideas of keywords and variables. Keywords are used to form instructions for the parser/compiler. Imagine a program as an application form. Keywords can be compared to the printed labels (like “Name:”, “Age:” and “Do you want free stuff?”). They are pre-defined, reserved words from which whole programs are constructed.

On that same form, the blank lines (next to the labels) can be compared to variables. Variables are placeholders for, or references to, data that changes depending on environment, context and so on. When someone reads a form you’ve completed, they look inside the variables (at the space above the blank lines) and see what contextual data is held there.

Finally, the information you put above those blank lines can be strings or numbers (integer/float) or booleans (yes/no). We’re only going to talk about strings, but variables can hold any of these kinds of information.

When you are completing the form, you are like a parser/compiler. You read the form from top to bottom, seeing keywords (“What is your name?”), locating the variables to store information in (looking for the blank line) and storing strings/numbers/booleans in those variables (writing a series of characters for your name, an integer for your age and a tick/cross for whether you want free stuff).

When someone wants to read the form, they are the parser/compiler. They read the same instructions/keywords, locate the same lines/variables and read the same information.

When a piece of that information is a series of characters, we call it a string. When it’s a series of digits then it could be a string, or it could be a number. Numeric characters are still characters, so it just depends on the context. That is, unless the form/keywords/programming language has been specific about the kind of data it wants in that variable.

This is the difference between dynamic-typed languages and static-typed languages. Dynamic-typed languages allow any kind of data in their variables, and this type can change at any point. Static-typed languages require that variables are given a fixed type, and these languages will squawk when the programmer tries to change the type.
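
PHP sits on the dynamic side of that divide. Here is a minimal sketch of one variable changing type as it goes; $value, $before and $after are names I made up for the illustration:

```php
<?php
// A PHP variable takes on the type of whatever value it currently holds.
$value = "42";               // starts out as a string
$before = gettype($value);   // "string"

$value = 42;                 // the same variable now holds an integer
$after = gettype($value);    // "integer"

echo $before . " then " . $after . "\n";   // string then integer
```

A static-typed language would refuse the second assignment, because $value was already a string.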

Combining Two Strings

There are a number of ways to combine strings in PHP. They have slight differences, but the choice between them is often a matter of personal preference.

Concatenation

Concatenation is just another word for combining two strings together by pushing the second string on to the end of the first string, to produce a new string:

php > $hello = "hello";
php > $world = "world";
php > echo $hello . $world;

helloworld

This example not only shows a simple use of variables, it also demonstrates the use of . to concatenate two strings. Other languages often use + for this operation. You can combine as many strings as you wish by chaining . operations:

php > echo "hello" . " " . "world";

hello world
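
If you are building a string up over several lines, PHP also has a .= shorthand that concatenates on to an existing variable. A small sketch, with $greeting as a made-up variable name:

```php
<?php
// Build a sentence one piece at a time with the .= shorthand.
$greeting = "hello";
$greeting .= " ";        // same as $greeting = $greeting . " ";
$greeting .= "world";

echo $greeting . "\n";   // hello world
```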

Variables are set apart from other words because they start with $. If you forget to add it, older versions of PHP will assume you meant to type a string of the same name as the variable you wanted to use, and show a nasty warning message. Since PHP 8 this is an error that stops the script, so I don’t recommend you use this approach to avoid $.

Interpolation

Interpolation is just another word for making a string by replacing placeholders with the contents of variables:

php > $hello = "hello";
php > $world = "world";
php > echo "{$hello} {$world}";

hello world

In this example, we’re setting two variables and replacing their placeholders in the final string. The curly braces aren’t strictly required, but I recommend using them. There’s no confusing your intention there. Consider the alternative:

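php > $price = 4.99;
php > echo "The price is $$price. That's a lot of $";

The price is $4.99. That's a lot of $

PHP prints the first and last $ as they are, because neither is followed by a variable name, but a reader has to stop and work that out. Your intention would be much clearer with curly braces around $price.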
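
Here is a small sketch of where the braces really earn their keep, when a variable is followed directly by more letters. $fruit is just a made-up variable for the illustration:

```php
<?php
$fruit = "apple";

// With braces, PHP knows the variable name ends at the closing brace.
$clear = "I ate two {$fruit}s";

// Without braces, PHP would look for a variable called $fruits instead,
// which is not defined here, so this line is left commented out.
// $unclear = "I ate two $fruits";

echo $clear . "\n";   // I ate two apples
```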

Implicit Casting

Casting is a word used to describe the process of converting a variable from one type to another. Implicit casting is casting that happens automatically (and is usually only present in dynamic-typed programming languages).

Sometimes you want to compare the information inside a string variable with the data that’s inside a number variable. Perhaps you know one of the variables is in the correct format, or perhaps you want to make a new string similar to the one I made in the previous example…

Well, in the previous example one of the variables was a float but it was used to make a new string. We can see something similar happen here:

php > echo "The price is: " . 4.99;

The price is: 4.99

In this example (and the previous one), PHP figured out that you wanted to make a string, using a float, and cast it implicitly. This works both ways:

php > echo (4.99 > "3.99");

1

In this case PHP saw that you wanted to see if the value of the string was less (numerically) than 4.99. So it implicitly cast the string to a number, and did the comparison. It did another implicit cast to get from true (the result of the comparison) to a string for echo to output.
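
A few more implicit casts, as a rough sketch rather than a complete list of the rules. The variable names are my own, and I have stuck to cases that I believe behave the same way across PHP versions:

```php
<?php
// Numeric strings are converted to numbers when used in arithmetic.
$sum = "5" + 5;                 // int(10)

// Two numeric strings compared with == are compared as numbers.
$sameNumber = ("10" == "1e1");  // true, both mean ten

// Concatenation goes the other way and converts the number to a string.
$label = "Items: " . 3;         // "Items: 3"

var_dump($sum, $sameNumber, $label);
```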

There are many rules to learn, before being able to fully understand when PHP will implicitly cast, but the important thing to know is that it could. This can be a problem, especially considering the following situation:

php > echo ("false" == true) ? "true" : "false" ;

true

In this case, the string false is converted to a boolean, which results in true, because every non-empty string other than "0" converts to true. A tricky business if ever there was one. Fortunately there’s a simple rule you can follow (when comparing strings, numbers and booleans):

php > echo ("false" === true) ? "true" : "false" ;

false

Using === will tell PHP to check both the type and the value of something it’s comparing. The string false is not the same type as the boolean true, so the test fails before implicit casting has a chance to confuse the issue.

(condition) ? true : false is just shorthand for an if statement. The value in the true position or the value in the false position is returned, based on the outcome of condition. The brackets are optional, but recommended.
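
To make that shorthand concrete, here is the same decision written both ways. $age, $long and $short are hypothetical names for this sketch only:

```php
<?php
$age = 20;   // hypothetical value

// The long form, using an if statement.
if ($age >= 18) {
    $long = "adult";
} else {
    $long = "minor";
}

// The same decision as a ternary expression.
$short = ($age >= 18) ? "adult" : "minor";

echo $long . " / " . $short . "\n";   // adult / adult
```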

Explicit Casting

If you’re in a situation where you would rather change the type of the variable yourself, you can use what’s called explicit casting. This is particularly useful if a function expects a certain type, which you then cast a variable to:

php > echo is_string(1.5) ? "yes" : "no";

no

php > echo is_string((string) 1.5) ? "yes" : "no";

yes

A good, general rule is to use explicit casting whenever you aren’t sure what type of data you are working with.
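
(string) is not the only cast available. A short sketch of the common ones and what they produce, using variable names of my own choosing:

```php
<?php
$asInt    = (int) "42";       // 42
$asFloat  = (float) "4.99";   // 4.99
$asString = (string) 4.99;    // "4.99"
$yes      = (string) true;    // "1"
$no       = (string) false;   // "" (an empty string)
$asBool   = (bool) "0";       // false, "0" and "" are the only false strings

var_dump($asInt, $asFloat, $asString, $yes, $no, $asBool);
```

Note that booleans do not become the words true and false when cast to strings, which is why the earlier comparison example printed 1.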

Class Strings

Classes are probably a little outside of the scope of this post, and if you’ve not learned about them then you should probably just skip this part.

PHP knows how to implicitly cast numbers, strings and booleans (also other types which we’ve not talked about). When it comes to classes/objects, PHP often does not know how to cast them to strings:

php > $class = new stdClass();
php > echo "{$class}";

PHP Catchable fatal error: Object of class stdClass could not be converted to string in php shell code on line 1
PHP Stack trace:
PHP 1. {main}() php shell code:0
PHP 2. {main}() php shell code:0

Catchable fatal error: Object of class stdClass could not be converted to string in php shell code on line 1

Call Stack:
270.1352 223656 1. {main}() php shell code:0
481.0499 223688 2. {main}() php shell code:0

You can help it along by making a special __toString() method on your classes:

php > class Thing {
public function __toString() { return "this is a Thing"; }
}

php > $thing = new Thing();
php > echo "{$thing}";

this is a Thing
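
Interpolation is not the only thing that triggers __toString(). As a sketch, assuming a made-up Price class, concatenation and an explicit (string) cast call it too:

```php
<?php
// Hypothetical class used only for this illustration.
class Price
{
    private $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    // Called whenever PHP needs this object as a string.
    public function __toString()
    {
        return '$' . number_format($this->amount, 2);
    }
}

$price = new Price(4.99);

$interpolated = "The price is {$price}";    // interpolation
$concatenated = "The price is " . $price;   // concatenation
$cast         = (string) $price;            // explicit cast

echo $interpolated . "\n";   // The price is $4.99
echo $concatenated . "\n";   // The price is $4.99
echo $cast . "\n";           // $4.99
```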

Alternatively, you can use a built-in function to get some idea of what a thing is:

php > class AnotherThing {}
php > $thing = new AnotherThing();
php > echo serialize($thing);

O:12:"AnotherThing":0:{}

It’s not pretty, but it’s useful.
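
Without going deep, here is a tiny sketch of what the output looks like once a class has some data in it, and how unserialize() reverses the process. Point is a hypothetical class I made up for this:

```php
<?php
// Hypothetical class with two public properties.
class Point
{
    public $x = 1;
    public $y = 2;
}

$text = serialize(new Point());
echo $text . "\n";   // O:5:"Point":2:{s:1:"x";i:1;s:1:"y";i:2;}

// unserialize() turns the string back into an object of the same class.
$copy = unserialize($text);
echo get_class($copy) . " " . $copy->x . "," . $copy->y . "\n";   // Point 1,2
```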

There is much more to learn about what the serialize() function is doing, but that deserves a post of its own.

Quotes

People often ask why strings can have double-quotes or single-quotes, and what the differences are. Here is the biggest reason: single-quotes are for literal strings and double quotes are for interpolated strings.

Firstly, literal strings are strings where every character means just that character. There is no special meaning or replacement behaviour in literal strings:

php > $hello = "hello";
php > echo '{$hello}, I am a literal string.';

{$hello}, I am a literal string.

If you are using single quotes, concatenation is the only way you are going to get new characters in that string. Secondly, interpolated strings are good for replacing placeholders with variable information, but that’s not all they are good for:

php > $hello = "hello";
php > echo "{$hello}\n\nI am an interpolated string.";

hello

I am an interpolated string.

Aside from allowing that $hello variable in there, we can also use \n to add newlines. This is one of many (though arguably the most used) character combinations called escape sequences. These won’t work in literal strings:

php > $hello = "hello";
php > echo '{$hello}\n\nI am a literal string.';

{$hello}\n\nI am a literal string.

What happens when you don’t want things to be interpolated within double-quotes? You can escape this behaviour with extra \ characters:

php > $hello = "hello";
php > echo "{\$hello}\\n\\nI am an escaped string.";

{$hello}\n\nI am an escaped string.
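
There is one small caveat to the idea that every character in a single-quoted string means just that character: PHP still needs \' to put a single quote inside one, and it accepts \\ for a backslash. A quick sketch, with $single and $double as made-up names, showing both kinds of quotes producing the same text:

```php
<?php
// In single quotes, only \' and \\ are special; \n stays as two characters.
$single = 'It\'s in C:\\temp, not \n';

// In double quotes, the backslash before n has to be escaped to get the same text.
$double = "It's in C:\\temp, not \\n";

echo $single . "\n";            // It's in C:\temp, not \n
var_dump($single === $double);  // bool(true)
```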

There is no practical speed difference between using single-quotes and double-quotes. A single database query, HTTP request or file read will blow execution time out of the water. Switching from double-quotes to single-quotes will not save the day.
