Strings in Python: From Basics to Advanced Techniques
Welcome back to my introduction to Python written course! The previous article covered for and while loops, compound assignment operators, and recursion in Python.
Strings are an essential data type in any programming language, and Python is no exception. This article dives deep into Python strings, covering topics such as creating strings, slicing strings, string formatting, and manipulation methods. It is suitable both for readers new to programming and for those with some experience who want to learn more about working with strings.
What are Strings in Python?
A string is a sequence of characters enclosed in quotation marks. You can use either single quotes or double quotes to create a string. For example:
first_string = 'Hello, World!'
second_string = "Hello, World!"
It is a good idea to be consistent in your code: if you choose single quotes, use them throughout, and the same goes for double quotes. If you want to read more about code consistency, the following article explores this topic in more detail.
The Parts of a String
A string in Python can be broken down into several parts. Here are some of the main parts of a string:
Characters
A string is made up of individual characters, the basic building blocks of a string. Characters can be letters, digits, symbols, whitespace, or any other Unicode character, printable or not (Python 3 strings are Unicode, not limited to ASCII).
Substrings
A substring is a portion of a string. It can be as small as a single character or as large as the entire string. You can extract a substring from a string using string slicing, and you can locate one with string methods such as find() or index().
Here are examples of slicing a substring, using the find() method, and using the index() method:
# Create a string
text = "Hello, World!"
# Slice a substring from the string
substring = text[7:12] # Output: World
# Find the index of a substring in the string
substring_index = text.find("World") # Output: 7
# Find the index of a substring in the string using index()
substring_index = text.index("World") # Output: 7
# Use string slicing with substring_index to extract the substring
substring = text[substring_index:substring_index + 5]
In these examples, we create a string called text and then extract a substring from it using string slicing, or use the find() and index() methods to locate the index of a substring within the string. Both the find() and index() methods return the index of the first occurrence of the specified substring in the string.
The difference between the two methods is that index() raises a ValueError exception if the specified substring is not found in the string, while find() returns -1 in this case.
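This difference matters most when the substring may be absent. The following is a minimal sketch; the helper name locate is hypothetical and not part of Python. It wraps index() so that it behaves like find():

```python
def locate(text: str, target: str) -> int:
    """Hypothetical helper: return the index of target in text, or -1 if absent."""
    try:
        return text.index(target)
    except ValueError:
        # index() raises ValueError when the substring is missing
        return -1


text = "Hello, World!"
print(text.find("Python"))     # -1
print(locate(text, "Python"))  # -1
print(locate(text, "World"))   # 7
```

Use find() when a missing substring is an expected case, and index() when it should be treated as an error.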
Words
A word is a group of characters that form a unit of meaning. In Python, you can split a string into a list of words using the split() method, which separates the words based on a specified delimiter.
Here is an example of splitting a string into a list of words in Python:
# Create a string
text = "Hello, World! How are you today?"
# Split the string into a list of words
words = text.split(" ")
print(words) # Output: ['Hello,', 'World!', 'How', 'are', 'you', 'today?']
You can specify any delimiter to use when splitting the string. For example, to split the sentence into a list of words based on commas, you can use the following code:
words = text.split(",")
print(words) # Output: ['Hello', ' World! How are you today?']
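The choice of delimiter also matters when the text contains repeated spaces. As a small sketch (the sample sentence is made up for illustration), compare an explicit single-space delimiter with calling split() with no argument:

```python
text = "Hello,   World!  How are you?"

# An explicit single-space delimiter keeps empty strings between repeated spaces
by_space = text.split(" ")
print(by_space)       # ['Hello,', '', '', 'World!', '', 'How', 'are', 'you?']

# With no argument, split() treats any run of whitespace as one separator
by_whitespace = text.split()
print(by_whitespace)  # ['Hello,', 'World!', 'How', 'are', 'you?']
```

Calling split() without a delimiter is usually the safer choice for splitting free-form text into words.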
Lines
A line is a group of characters that ends with a newline character (\n). We can write a multiline string using (\n) inside a string:
text = "Hello,\nWorld!\nHow are you today?"
print(text) # Output:
# Hello,
# World!
# How are you today?
A multiline string can be created using triple quotes (either single or double) to enclose the string. This allows you to write long strings that span multiple lines without using explicit newline characters (\n).
Here is an example of creating a multiline string:
text = """Hello,
World!
How are you today?"""
print(text) # Output:
# Hello,
# World!
# How are you today?
Note that both outputs are the same. In a triple-quoted string, each line break you type in the source becomes a newline character (\n) in the string, so you do not need to write the escape sequence yourself.
In Python, you can split a string into a list of lines using the splitlines() method, which separates the lines at line boundaries such as newline characters.
Here is an example of splitlines():
# Split the string into a list of lines
lines = text.splitlines()
print(lines) # Output: ['Hello,', 'World!', 'How are you today?']
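It can help to see how splitlines() differs from split("\n") when the text ends with a newline. A short sketch, using a made-up two-line string:

```python
text = "Hello,\nWorld!\n"

# splitlines() ignores the trailing newline
print(text.splitlines())               # ['Hello,', 'World!']

# split("\n") leaves an empty string after the final newline
print(text.split("\n"))                # ['Hello,', 'World!', '']

# keepends=True keeps the line ending on each line
print(text.splitlines(keepends=True))  # ['Hello,\n', 'World!\n']
```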
Special Characters in Strings
In Python, some characters cannot be typed directly into a string or have a special meaning inside one. To include such a character in a string, you can use an escape sequence, which is a combination of the backslash (\) character and another character. The backslash tells Python to treat the character that follows it as a special character.
Here are some common escape sequences in Python:
\n: represents a newline character. When used in a string, it causes the string to be printed on multiple lines.
\t: represents a tab character. When used in a string, it causes the string to be indented by one tab.
\\: represents a backslash character. When used in a string, it causes a single backslash to be included in the string.
\': represents a single quote character. When used in a string, it allows you to include a single quote character within a string enclosed in single quotes.
\": represents a double quote character. When used in a string, it allows you to include a double quote character within a string enclosed in double quotes.
Here is an example of using escape sequences in strings in Python:
# Create a string with newline and tab characters
text = 'First line\n\tSecond line'
print(text)
# Output:
# First line
#     Second line
# Create a string with a backslash character
text = 'This is a backslash: \\'
print(text) # Output: This is a backslash: \
# Create a string with single quote characters
string = 'This is a single quote: \''
print(string) # Output: This is a single quote: '
# Create a string with double quote characters
string = "This is a double quote: \""
print(string) # Output: This is a double quote: "
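Although an escape sequence is written with two characters in the source code, it is stored as a single character. A quick sketch to check this with len() and repr():

```python
text = 'First line\n\tSecond line'

# Each escape sequence counts as one character
print(len('\n'))   # 1
print(len(text))   # 23

# repr() shows the escape sequences instead of rendering them
print(repr(text))  # 'First line\n\tSecond line'
```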
Accessing Characters in a String
You can access individual characters in a string using indexing. In Python, indexing starts at 0, so the first character in a string is at index 0, the second character is at index 1, and so on.
To access the first character of a string, you would use the following code:
name = "Phil"
first_char = name[0]
print(first_char) # Output: P
You can also use negative indexing to access characters from the end of the string:
last_char = name[-1]
print(last_char) # Output: l
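Indexing past the end of a string is an error. A brief sketch of what happens, reusing the name variable from above:

```python
name = "Phil"

# The last valid index is len(name) - 1, the same character as name[-1]
assert name[len(name) - 1] == name[-1] == "l"

try:
    name[4]  # one position past the end
except IndexError as error:
    print(f"IndexError: {error}")  # IndexError: string index out of range
```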
Slicing Strings
In addition to accessing individual characters, you can also slice a string to get a substring. To slice a string, you can use the following syntax:
string[start:end:step]
Here, start is the index of the first character you want to include in the slice, end is the index of the character after the last character you want to include in the slice, and step is the distance between the indexes of consecutive characters in the slice (a step of 2 takes every second character).
The default values for the start, end, and step parameters in string slicing are 0, the length of the string, and 1, respectively. This means that if you omit the start parameter, Python will assume a default value of 0, which corresponds to the first character in the string. Similarly, if you omit the end parameter, Python will assume a default value equal to the length of the string, which is one position past the last character, so the slice runs to the end of the string. If you omit the step parameter, Python will assume a default value of 1, which means that the slice will include every character between start and end. When step is negative, the defaults for start and end are reversed, so the slice runs from the end of the string back to the beginning.
For example, to get the first five characters of the string text, we can use the following code:
text = "Hello, World!"
first_five = text[0:5] # Equivalent to the shorter text[:5]
print(first_five) # Output: Hello
We can also use negative indexes to slice from the end of the string. For example, to get the last six characters of the string text, we can use the following code:
last_six = text[-6:] # Equivalent to text[-6:13] or text[7:]; note that text[-6:-1] would drop the final "!"
print(last_six) # Output: World!
Here is an example of using step in string slicing:
substring = text[::2]
print(substring) # Output: Hlo ol!
substring = text[::-1]
print(substring) # Output: !dlroW ,olleH
In the first example, we use string slicing to extract every other character from the string by specifying a step of 2. The resulting substring is "Hlo ol!".
In the second example, we use string slicing to extract all characters from the string in reverse order by specifying a step of -1. The resulting substring is "!dlroW ,olleH".
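To tie the slicing rules together, here is a small sketch that checks the default values and uses a reversed slice. The helper is_palindrome is a hypothetical name introduced only for illustration:

```python
text = "Hello, World!"

# Omitted start, end, and step fall back to 0, len(text), and 1
assert text[:5] == text[0:5:1] == "Hello"
assert text[7:] == text[7:len(text)] == "World!"

# Slices never raise IndexError; out-of-range bounds are clamped
print(text[7:100])  # World!


def is_palindrome(word: str) -> bool:
    """Hypothetical helper: True if word reads the same when reversed."""
    return word == word[::-1]


print(is_palindrome("level"))  # True
print(is_palindrome("Phil"))   # False
```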
String Formatting
Python provides several ways to format strings, including string interpolation with the % operator, the format() method, and f-strings.
String interpolation
String interpolation is a way to include the value of a variable in a string. In Python, you can use the % operator to interpolate values into a string.
To use string interpolation, write a string containing format specifiers, followed by the % operator and a value or a tuple of values. Each format specifier indicates how the corresponding value is formatted, such as %d for an integer, %f for a float, or %s for a string.
Here is an example of using string interpolation in Python:
name = "Phil"
age = 56
text = "My name is %s and I am %d years old" % (name, age)
print(text) # Output: My name is Phil and I am 56 years old
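Format specifiers can also control precision and width. A brief sketch, with made-up values for illustration:

```python
item = "coffee"
price = 3.14159

# %.2f rounds the float to two decimal places
line = "%s costs %.2f" % (item, price)
print(line)          # coffee costs 3.14

# %5d right-aligns the integer in a field five characters wide
padded = "%5d" % 42
print(repr(padded))  # '   42'
```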
Format
The format() method is another way to format strings in Python. It allows you to specify placeholders for values in the string and then pass the values as arguments to the format() method. For example:
text = "My name is {} and I am {} years old".format(name, age)
print(text) # Output: My name is Phil and I am 56 years old
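Placeholders do not have to be filled in order. As a sketch, they can refer to arguments by position or by keyword:

```python
name = "Phil"
age = 56

# Numbered placeholders pick arguments by position
by_index = "{1} years old, named {0}".format(name, age)
print(by_index)  # 56 years old, named Phil

# Named placeholders pick keyword arguments
by_name = "My name is {name} and I am {age} years old".format(name=name, age=age)
print(by_name)   # My name is Phil and I am 56 years old
```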
F-strings
F-strings, introduced in Python 3.6, provide a more concise way to interpolate values into a string. To use one, precede the string literal with the f prefix and place each variable or expression inside curly braces ({}) where its value should appear.
Here is an example using an f-string:
text = f"My name is {name} and I am {age} years old"
print(text) # Output: My name is Phil and I am 56 years old
I recommend using f-strings to format your strings in general. They are more concise and easier to read, and because the expressions are evaluated inline at runtime without a separate method call, they are generally faster than the % operator or the format() method.
You can also use f-strings to perform basic arithmetic operations directly in the string. Here is an example of using f-strings with arithmetic operations:
# Create two variables
x = 10
y = 5
# Use an f-string to add the variables and print the result
print(f"The sum of x and y is {x + y}") # Output: The sum of x and y is 15
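The curly braces accept any expression, and an optional format specifier after a colon controls how the value is displayed. A short sketch with made-up values:

```python
pi = 3.14159
population = 1234567
name = "Phil"

print(f"{pi:.2f}")         # 3.14
print(f"{population:,}")   # 1,234,567
print(f"[{name:>8}]")      # [    Phil]

# Method calls and function calls work inside the braces too
summary = f"{name.upper()} has {len(name)} letters"
print(summary)             # PHIL has 4 letters
```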
Modifying Strings
Python's strings are immutable, meaning you cannot modify a string in place. You cannot change the value of an individual character in a string, and you cannot delete or insert characters.
If you try to assign a new value to a character in a string, Python raises a TypeError rather than modifying the original string. For example, consider the following code:
# Create a string
text = 'Hello, World!'
# Attempt to change the value of the first character in the string
text[0] = 'J' # Raises TypeError: 'str' object does not support item assignment
However, you can create a new string based on an existing string by using string slicing and concatenation. For example, to replace the first character of a string, you can use the following code:
new_string = "J" + text[1:]
print(new_string) # Output: Jello, World!
You can also use string formatting to create a new string based on an existing string:
new_string = f"{text} Welcome!"
print(new_string) # Output: Hello, World! Welcome!
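The same slicing and concatenation idea works for any position. Below is a minimal sketch; replace_char_at is a hypothetical helper, not a built-in string method:

```python
def replace_char_at(text: str, index: int, new_char: str) -> str:
    """Hypothetical helper: return a copy of text with one character swapped."""
    if not 0 <= index < len(text):
        raise IndexError("index out of range")
    # Everything before index + the new character + everything after index
    return text[:index] + new_char + text[index + 1:]


text = "Hello, World!"
changed = replace_char_at(text, 0, "J")
print(changed)  # Jello, World!
print(text)     # Hello, World! (the original is untouched)
```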
String Methods
Python provides several built-in methods for working with strings. Some of the most commonly used string methods are:
upper(): Converts all characters in a string to uppercase
lower(): Converts all characters in a string to lowercase
strip(): Removes leading and trailing whitespace from a string
replace(): Replaces all occurrences of a specified string with another string
split(): Splits a string into a list of substrings based on a specified delimiter
join(): Joins a list of strings into a single string
Here are some examples of using these methods:
text = " Hello, World! "
# Convert to uppercase
uppercase = text.upper()
print(uppercase) # Output: HELLO, WORLD!
# Convert to lowercase
lowercase = text.lower()
print(lowercase) # Output: hello, world!
# Strip leading and trailing whitespace
stripped = text.strip()
print(stripped) # Output: Hello, World!
# Replace occurrences of 'Hello' with 'Hi'
replaced = text.replace('Hello', 'Hi')
print(replaced) # Output: Hi, World!
# Split the string into a list of words
split_string = text.split()
print(split_string) # Output: ['Hello,', 'World!']
# Join a list of strings into a single string
words = ["Hello,", "World!"]
joined_string = " ".join(words)
print(joined_string) # Output: Hello, World!
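Because every string method returns a new string, calls can be chained. A small sketch with made-up input, combining the methods above:

```python
raw = "  hello, WORLD!  "

# Each method returns a new string, so calls run left to right
cleaned = raw.strip().lower().replace("world", "python")
print(cleaned)     # hello, python!

# split() and join() together collapse repeated whitespace
messy = "Hello,    World!   How  are you?"
normalized = " ".join(messy.split())
print(normalized)  # Hello, World! How are you?
```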
Strings are an essential data type in any programming language. They are used in many applications, from simple text processing to more complex tasks such as natural language processing.
In this article, we learned the basics of strings in Python, including how to create and manipulate strings, what special characters are, and how to use various methods and formatting techniques to work with strings.
In the following article, we’ll cover Python’s most commonly used data structures and how to use them.
Stay tuned, and let’s code!
If you’re interested in reading other articles written by me, check out my repo with all the articles I’ve written so far, separated by categories.
Thanks for reading