This part concludes the series “Build Your Text Editor With Rust”. For those who may have stumbled on this page without going through the other parts, here is a summary of what we’ve done so far: we started with “Setup”, then moved on to reading keypresses and entering “raw mode”, drawing on the screen and moving the cursor around, displaying text files (making our program a text viewer), editing text files and saving changes, and implementing a cool search feature. Finally, in this part, we add syntax highlighting.
Let’s begin by coloring each digit cyan:
We no longer add the text we want to display directly to `editor_contents`. Instead, we loop through all the characters and check whether the current character is a digit. If it is, we add the escape sequence for cyan, add the character, and finally reset the color (you can find what each escape sequence maps to in terms of ANSI codes on Wikipedia). The colors that we’ll use are supported by many terminals. If the terminal doesn’t support colors, we just add the digit and ignore the error; the `let _ = ...` syntax just prevents the compiler from complaining about unhandled errors.
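The per-character idea described above can be sketched as follows. This is only an illustration: it writes raw ANSI escape sequences into a `String` rather than using the crossterm calls from the series, and the function name `color_digits` is a hypothetical name chosen for this sketch.

```rust
/// Wraps every ASCII digit in `line` in the ANSI "cyan foreground" sequence
/// and resets the foreground color right after it.
/// Assumption: raw escape codes stand in for the editor's crossterm calls.
fn color_digits(line: &str) -> String {
    const CYAN: &str = "\x1b[36m"; // SGR 36: cyan foreground
    const RESET: &str = "\x1b[39m"; // SGR 39: default foreground
    let mut out = String::with_capacity(line.len());
    for c in line.chars() {
        if c.is_ascii_digit() {
            out.push_str(CYAN);
            out.push(c);
            out.push_str(RESET);
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    // Digits show up in cyan on terminals that understand ANSI colors.
    println!("{}", color_digits("let x = 42;"));
}
```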
Refactor Syntax Highlighting ✨
Now we know how to color digits, but what we want is to actually highlight entire strings, keywords, comments, and so on. We can’t just decide what color to use based on the class of each character, like we’re doing with digits currently. What we want to do is figure out the highlighting for each row of text before we display it, and then re-highlight a line whenever it gets changed.
The highlighting rules might differ from one programming language to another. For instance, how we highlight a Python file might differ from how we highlight a Rust file. What we would like is a “generic” method for highlighting syntax that can easily be modified or specialized for different programming languages. So we’re going to create a `trait` (a.k.a. interface in most object-oriented programming languages such as Java and Kotlin) to help us implement that.
But even before creating the `trait`, let’s add a new field, `highlight`, to `Row`, which will contain the highlight information for that `Row`.
Now let’s add the `trait`.
For now, the trait has one method, which goes through the characters of a `Row` and highlights them by setting each value in the `highlight` array. Let’s write one implementation of `SyntaxHighlight`:
The code is wrapped in a macro before we implement `SyntaxHighlight`. This acts a bit like a `#[derive]` for that struct, except that we’re using `macro_rules`, so it’s less complex. The macro has one parameter: the name of the `struct` you want to create. The macro then creates the struct and implements `SyntaxHighlight` for it.
In the `update_syntax` method, we get the `Row` we have to update, loop through the `chars` of that `Row`, and update `Row.highlight`. We add a simple `assert_eq!()` to make sure our code works as expected. You can remove it once you’re sure the program works fine.
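To make the shape of this design concrete, here is a minimal sketch of a trait plus a `macro_rules` macro that generates a struct and its implementation. The exact field names, the macro's input syntax, and the `Normal`/`Number` variants are assumptions made for the sketch; the real editor's `Row` has more fields.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Number,
}

// Simplified stand-in for the editor's Row (assumed fields).
struct Row {
    render: String,
    highlight: Vec<HighlightType>,
}

trait SyntaxHighlight {
    fn update_syntax(&self, at: usize, editor_rows: &mut Vec<Row>);
}

// Generates a unit struct with the given name and implements the trait for it.
macro_rules! syntax_struct {
    (struct $name:ident;) => {
        struct $name;

        impl SyntaxHighlight for $name {
            fn update_syntax(&self, at: usize, editor_rows: &mut Vec<Row>) {
                let current_row = &mut editor_rows[at];
                current_row.highlight = current_row
                    .render
                    .chars()
                    .map(|c| {
                        if c.is_ascii_digit() {
                            HighlightType::Number
                        } else {
                            HighlightType::Normal
                        }
                    })
                    .collect();
                // Sanity check: one highlight entry per character.
                assert_eq!(
                    current_row.render.chars().count(),
                    current_row.highlight.len()
                );
            }
        }
    };
}

syntax_struct! {
    struct RustHighlight;
}

fn main() {
    let mut rows = vec![Row {
        render: "let x = 42;".to_string(),
        highlight: Vec::new(),
    }];
    RustHighlight.update_syntax(0, &mut rows);
    println!("{:?}", rows[0].highlight);
}
```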
Next, let’s make a `syntax_color()` function that maps values in `highlight` to the actual color codes we want to draw them with:
Note that `Color` is from `crossterm::style::Color`.
Now let’s finally add the function that draws the colored text:
The `color_row` function is the same as what we had in `draw_rows`, but modified a bit to use the `trait` methods instead. Let’s now add `SyntaxHighlight` to `Output`:
Since we’re using dynamic dispatch, we use the `Box<dyn SyntaxHighlight>` syntax. Not every file type might have a syntax highlight method implemented, hence we wrap the field in `Option<T>`. We’re using the `RustHighlight` `struct` that we created from the macro as the default value for now. In `draw_rows`, we replace the coloring code to use `syntax_highlight` if it’s available.
To get the code working, let’s call `update_syntax` everywhere that modifies a `Row`, similar to how we implemented `render_row()`:
Now whenever there’s a change in any of the rows, `update_syntax` will be called to reflect the changes. That concludes our refactoring of the syntax highlighting system.
Colorful Search Results 🔭
Before we start highlighting strings and keywords and all that, let’s use our highlighting system to highlight search results. We’ll start by adding `SearchMatch` to the `HighlightType` `enum`, and mapping it to the color blue:
Now all we have to do is set the matched substring’s `highlight` to `SearchMatch` in our search code:
First, we get a mutable reference to the `Row` containing the `query`. To highlight the matched word, we just loop through the `highlight` array from `index` to `index` + `query`’s length, setting each corresponding `highlight` value to `SearchMatch`.
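As a rough illustration of that loop, the sketch below marks the first occurrence of a query in a row. The free function `highlight_match` and its parameters are hypothetical; in the editor this logic lives inside the search callback and works on a `Row`.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    SearchMatch,
}

/// Finds `query` in `render` and marks the matching byte range in `highlight`.
/// Returns the start index of the match, if any.
/// Assumption: `highlight` has one entry per byte of `render`.
fn highlight_match(
    render: &str,
    highlight: &mut [HighlightType],
    query: &str,
) -> Option<usize> {
    let index = render.find(query)?;
    (index..index + query.len())
        .for_each(|i| highlight[i] = HighlightType::SearchMatch);
    Some(index)
}

fn main() {
    let render = "let x = foo;";
    let mut highlight = vec![HighlightType::Normal; render.len()];
    let found = highlight_match(render, &mut highlight, "foo");
    println!("{:?} {:?}", found, highlight);
}
```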
Restore Syntax Highlighting After Search ⚖️
Currently, search results stay highlighted in blue even after the user is done using the search feature. We want to restore the `highlight` values to their previous state after each search. We’re going to derive `Clone` for `HighlightType`, similar to what we did for `CursorController`. Then we’ll save the original `highlight` and the corresponding `row_index` in `SearchIndex`:
Now let’s modify `find_callback`:
At the start of the function, we use `.take()` to get the owned value in `previous_highlight`, if any. `take()` replaces the original value with `None`, which is what we want in this case. If there was a previous match, we reset that row’s `highlight`. When there’s a match, we store the previous `highlight` before modifying it.
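The save-and-restore pattern can be sketched like this. The tuple layout of `previous_highlight`, the `mark_match` helper, and the use of bare `Vec<HighlightType>` rows are assumptions made to keep the example short.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Number,
    SearchMatch,
}

struct SearchIndex {
    // (row_index, highlight of that row before the match was painted)
    previous_highlight: Option<(usize, Vec<HighlightType>)>,
}

/// Restores the previously matched row (if any), then marks a new match.
fn mark_match(
    search: &mut SearchIndex,
    rows: &mut [Vec<HighlightType>],
    row_index: usize,
    start: usize,
    len: usize,
) {
    // take() hands us the saved value and leaves None behind.
    if let Some((index, saved)) = search.previous_highlight.take() {
        rows[index] = saved;
    }
    // Save the untouched highlight before overwriting part of it.
    search.previous_highlight = Some((row_index, rows[row_index].clone()));
    for h in &mut rows[row_index][start..start + len] {
        *h = HighlightType::SearchMatch;
    }
}

fn main() {
    let mut rows = vec![
        vec![HighlightType::Number; 3],
        vec![HighlightType::Normal; 3],
    ];
    let mut search = SearchIndex { previous_highlight: None };
    mark_match(&mut search, &mut rows, 0, 0, 2);
    mark_match(&mut search, &mut rows, 1, 1, 2);
    println!("{:?}", rows);
}
```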
Assignment for the reader:
You’ll notice that when you’re on the last match (irrespective of the direction) and you keep pressing the arrow key that got you there, the color for that row resets. This is the expected behavior. What if we want the color to remain? Before you continue, try modifying the program to fix that.
Optimize color_row() ⚡️
So far the coloring works fine, but `color_row` performs three operations per character, and most of them may not be necessary. First, it writes the escape sequence for the color, then it writes a single character, and finally it writes the escape sequence to reset the color. In practice, most characters are going to be the same color as the previous character, so most of the escape sequences are redundant. Let’s keep track of the current text color as we loop through the characters, and only print out an escape sequence when the color changes:
After looping through the `chars`, we reset the color so that the remaining rows aren’t affected.
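Here is a small sketch of that optimization. As an assumption for the sake of a dependency-free example, `syntax_color` returns raw ANSI escape strings and `color_row` appends to a `String`; the series itself maps to crossterm colors and queues commands instead.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Number,
    SearchMatch,
}

/// Maps a highlight type to an ANSI foreground escape sequence.
fn syntax_color(highlight_type: HighlightType) -> &'static str {
    match highlight_type {
        HighlightType::Normal => "\x1b[39m",      // default foreground
        HighlightType::Number => "\x1b[36m",      // cyan
        HighlightType::SearchMatch => "\x1b[34m", // blue
    }
}

/// Appends `render` to `out`, emitting a color sequence only when the
/// highlight type differs from the one currently in effect.
fn color_row(render: &str, highlight: &[HighlightType], out: &mut String) {
    let mut current = HighlightType::Normal;
    for (c, &highlight_type) in render.chars().zip(highlight) {
        if highlight_type != current {
            current = highlight_type;
            out.push_str(syntax_color(highlight_type));
        }
        out.push(c);
    }
    // Reset so the following rows are not affected.
    out.push_str("\x1b[39m");
}

fn main() {
    use HighlightType::{Normal, Number};
    let mut out = String::new();
    color_row("a12b", &[Normal, Number, Number, Normal], &mut out);
    println!("{}", out);
}
```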
Colorful Numbers 🖍
Alright, let’s start working on highlighting numbers properly. First, we’ll change our `for` loop in `update_syntax` to a `while` loop, to allow us to consume multiple characters in each iteration:
Now let’s define an `is_separator()` function that takes a character and returns true if it’s considered a separator character:
Since many programming languages have similar separators, we make `is_separator` a default method.
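A default method of this kind might look like the sketch below. The particular set of punctuation characters is an assumption for illustration, and `PlainHighlight` is a hypothetical implementor that simply inherits the default.

```rust
trait SyntaxHighlight {
    /// Default separator test shared by all languages; an implementor can
    /// override it if its language needs a different set.
    fn is_separator(&self, c: char) -> bool {
        // Assumed separator set: whitespace plus common punctuation.
        c.is_whitespace() || ",.()+-/*=~%<>\"';&[]".contains(c)
    }
}

// Hypothetical implementor that relies on the default method.
struct PlainHighlight;

impl SyntaxHighlight for PlainHighlight {}

fn main() {
    let syntax = PlainHighlight;
    println!("{}", syntax.is_separator(';'));
    println!("{}", syntax.is_separator('a'));
}
```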
Right now, numbers are highlighted even if they’re part of an identifier, such as the `32` in `int32_t`. To fix that, we’ll require that numbers be preceded by a separator character, which includes whitespace and punctuation characters. Let’s add a `previous_separator` variable to `update_syntax()` that keeps track of whether the previous character was a separator. Then let’s use it to recognize and highlight numbers properly:
We initialize `previous_separator` to `true` because we consider the beginning of the line to be a separator. (Otherwise numbers at the very beginning of the line wouldn’t be highlighted.)
`previous_highlight` is set to the highlight type of the previous character. To highlight a digit, we now require the previous character to either be a separator or to also be highlighted with `HighlightType::Number`.
When we decide to highlight the current character a certain way, we increment `i` to “consume” that character, set `previous_separator` to `false` to indicate we are in the middle of highlighting something, and then `continue` the loop. We will use this pattern for each thing that we highlight.
If we end up at the bottom of the `while` loop, we set `previous_separator` according to whether the current character is a separator, and we increment `i` to consume the character.
Now let’s support highlighting numbers that contain decimal points:
A `.` character that comes after a character that we just highlighted as a number will now be considered part of the number.
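Putting the number rules together, a standalone sketch could look like this. It is written as a free function returning a fresh `Vec` (an assumption for brevity; the editor does this inside `update_syntax` on a `Row`), and the separator set is again assumed.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Number,
}

// Assumed separator set: whitespace plus common punctuation.
fn is_separator(c: char) -> bool {
    c.is_whitespace() || ",.()+-/*=~%<>\"';&[]".contains(c)
}

fn highlight_numbers(render: &str) -> Vec<HighlightType> {
    let chars: Vec<char> = render.chars().collect();
    let mut highlight = vec![HighlightType::Normal; chars.len()];
    // The start of the line counts as a separator.
    let mut previous_separator = true;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        let previous_highlight = if i > 0 {
            highlight[i - 1]
        } else {
            HighlightType::Normal
        };
        let continues_number = previous_highlight == HighlightType::Number;
        if (c.is_ascii_digit() && (previous_separator || continues_number))
            || (c == '.' && continues_number)
        {
            highlight[i] = HighlightType::Number;
            i += 1; // consume the character
            previous_separator = false;
            continue;
        }
        // Bottom of the loop: nothing matched, so just record the separator state.
        previous_separator = is_separator(c);
        i += 1;
    }
    highlight
}

fn main() {
    println!("{:?}", highlight_numbers("int32_t 4.5"));
}
```

With this logic, the `32` inside `int32_t` stays unhighlighted, while `4.5` is treated as a single number.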
Detect File Type 🧑‍🔬
Before we go on to highlight other things, we’re going to add file type detection to our editor. This will allow us to have different rules for how to highlight different types of files. For example, text files wouldn’t have any highlighting, and Rust files should highlight numbers, strings, comments, and many different keywords specific to Rust.
To begin, we’ll add an `extensions()` method to `SyntaxHighlight` so that all types that implement `SyntaxHighlight` have to specify the corresponding file extensions:
After creating the `extensions` function, we implement it in our macro. First, we create a new parameter, `extensions`, which also takes an expression (`expr`), and name it `ext`. We then create a `struct` with a field named `extensions` of type `&'static [&'static str]` (we have to explicitly specify the `'static` lifetime since we’re using it in a `struct` field). We also create a `new` method to return an instance of the `struct` with the specified extensions.
Now let’s create a `select_syntax` method which returns a `SyntaxHighlight` object, or `None` if there’s no corresponding `SyntaxHighlight` for that file extension:
`select_syntax` now returns the right `SyntaxHighlight` object. To add a new `SyntaxHighlight` object, just insert it into the array. Since `EditorRows` is the `struct` with the filename, we pass a mutable `syntax_highlight` to `EditorRows::new()` so that it gets set to the right `SyntaxHighlight` object.
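The following sketch shows one way the pieces could fit together: a macro that takes the extensions as an expression, and a lookup by extension. The macro's input syntax and the free-standing `select_syntax` function are assumptions for this example; in the editor, `select_syntax` is a method on `Output`.

```rust
trait SyntaxHighlight {
    fn extensions(&self) -> &[&str];
}

// Generates a struct holding its extensions plus a `new` constructor.
macro_rules! syntax_struct {
    (struct $name:ident { extensions: $ext:expr }) => {
        struct $name {
            extensions: &'static [&'static str],
        }

        impl $name {
            fn new() -> Self {
                Self { extensions: &$ext }
            }
        }

        impl SyntaxHighlight for $name {
            fn extensions(&self) -> &[&str] {
                self.extensions
            }
        }
    };
}

syntax_struct! {
    struct RustHighlight {
        extensions: ["rs"]
    }
}

/// Returns the highlighter registered for `extension`, if there is one.
fn select_syntax(extension: &str) -> Option<Box<dyn SyntaxHighlight>> {
    // To support another language, add its struct to this list.
    let list: Vec<Box<dyn SyntaxHighlight>> = vec![Box::new(RustHighlight::new())];
    list.into_iter()
        .find(|syntax| syntax.extensions().contains(&extension))
}

fn main() {
    println!("rs  -> {}", select_syntax("rs").is_some());
    println!("txt -> {}", select_syntax("txt").is_some());
}
```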
Let’s not forget to update `syntax_highlight` when the user uses `SaveAs`:
After giving the file a new name, we have to go through all the rows and then highlight them appropriately.
Next, we’ll show the current file type. Some programming languages, such as C, have files with different extensions (e.g. `.h` and `.c`), so we can’t use the extension as the file type. We want the file type to just show which programming language the file belongs to. To do that, let’s add a new method to `SyntaxHighlight` which returns the corresponding file type:
Now let’s show the file type, if `SyntaxHighlight` has been implemented for it:
If there’s no corresponding `SyntaxHighlight` for that file type, we show “no ft” (no file type).
Colorful Strings 📝
Now let’s start highlighting proper. We’ll begin by highlighting strings:
We’re coloring strings green. In some programming languages (such as Rust) the single quote denotes a character literal, while in others, like Python, it denotes a string. For now we’ll color the “character literal” dark green. (Later, you can use a boolean or similar to indicate whether or not that distinction should be made.) We will use an `in_string` variable to keep track of whether we are currently inside a string. If we are, then we’ll keep highlighting the current character as a string until we hit the closing quote:
We actually store either a double-quote (`"`) or a single-quote (`'`) character as the value of `in_string`, so that we know which one closes the string.
So, going through the code from top to bottom: if `in_string` is set, then we know the current character can be highlighted with `HighlightType::String`. Then we check if the current character is the closing quote (`val == c`), and if so, we reset `in_string` to `None`. Then, since we highlighted the current character, we have to consume it by incrementing `i` and continuing out of the current loop iteration. We also set `previous_separator` to `true` so that if we’re done highlighting the string, the closing quote is considered a separator.
If we’re not currently in a string, then we have to check if we’re at the beginning of one by checking for a double- or single-quote. If we are, we store the quote in `in_string`, highlight it with `HighlightType::String`, and consume it.
In the vast majority of languages, when the sequence `\'` or `\"` occurs in a string, the escaped quote doesn’t close the string. For instance, in the line:
else if c == '"' || c == '\'' {
If we’re in a string and the current character is a backslash (`\`), and there’s at least one more character in that line after the backslash, then we highlight the character that comes after the backslash with `HighlightType::String` and consume it. We increase `i` by `2` to consume both characters at once.
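The string rules, including the backslash escape, can be sketched in isolation as follows. This version is a simplified assumption: it is a free function, it leaves out the `previous_separator` bookkeeping, and it uses a single `String` highlight type for both quote styles.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    String,
}

fn highlight_strings(render: &str) -> Vec<HighlightType> {
    let chars: Vec<char> = render.chars().collect();
    let mut highlight = vec![HighlightType::Normal; chars.len()];
    // Holds the quote character that opened the current string, if any.
    let mut in_string: Option<char> = None;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if let Some(quote) = in_string {
            highlight[i] = HighlightType::String;
            // An escaped character never closes the string: consume both.
            if c == '\\' && i + 1 < chars.len() {
                highlight[i + 1] = HighlightType::String;
                i += 2;
                continue;
            }
            if c == quote {
                in_string = None;
            }
            i += 1;
            continue;
        } else if c == '"' || c == '\'' {
            in_string = Some(c);
            highlight[i] = HighlightType::String;
            i += 1;
            continue;
        }
        i += 1;
    }
    highlight
}

fn main() {
    // The row contains: a "b\"c" d
    println!("{:?}", highlight_strings("a \"b\\\"c\" d"));
}
```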
Colorful Single Line Comments 📎
Next let’s highlight single-line comments. We’ll give comments a dark grey color:
We’ll let each language specify its own single-line comment pattern, as they differ a lot between languages. Let’s create a `comment_start` field and method:
Now to the highlighting code:
An empty string may have been passed as `comment_start`; in that case we won’t highlight any comments. We wrap our comment highlighting code in an `if` statement that makes sure we’re not in a string, since we’re placing this code above the string highlighting code (order matters in this function). We use the `.as_bytes()` method since we also work with the bytes of that row’s `render`.
We then check if this character is the start of a comment. If so, we give the rest of the line the `HighlightType::Comment` setting and then break out of the loop. The range `i..i + comment_start.len()` spans exactly as many `chars` as the length of `comment_start`. But since `i + comment_start.len()` can run past the end of the row, we use `cmp::min` to clamp it and prevent that.
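A minimal sketch of the comment check, assuming a free function that ignores strings entirely (the editor's version also skips the check while inside a string):

```rust
use std::cmp;

#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Comment,
}

fn highlight_line_comment(render: &str, comment_start: &str) -> Vec<HighlightType> {
    let bytes = render.as_bytes();
    let start = comment_start.as_bytes();
    let mut highlight = vec![HighlightType::Normal; bytes.len()];
    // An empty pattern means this language has no single-line comments.
    if start.is_empty() {
        return highlight;
    }
    let mut i = 0;
    while i < bytes.len() {
        // Clamp the end of the range so the slice never leaves the row.
        let end = cmp::min(i + start.len(), bytes.len());
        if &bytes[i..end] == start {
            // The rest of the line is a comment.
            highlight[i..].fill(HighlightType::Comment);
            break;
        }
        i += 1;
    }
    highlight
}

fn main() {
    println!("{:?}", highlight_line_comment("x = 1; // hi", "//"));
}
```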
Colorful Keywords 🔑
Now let’s turn to highlighting keywords. We’re going to allow languages to specify an arbitrary number of keywords and their corresponding colors. For now we’ll have only three kinds of keywords, but you can always expand that. We’ll highlight actual keywords in one color, common type names in another, and keywords that go with `macro_rules` in a third:
What we do in the macro is add a new field named `keywords`, which can contain an arbitrary number of items; each item first specifies a color, followed by `;` and then the keywords we want to draw in that color. We also added `&` as a separator so that text like `&str` and `&mut self`, as well as signatures, is rendered properly. (We also added `[]`.)
Note how, by using a macro, we’re able to write our own syntax to build our `struct`.
Let’s modify our `HighlightType` enum so we can pass a color directly:
Now let’s highlight them:
Keywords require a separator both before and after the keyword. Otherwise, the `match` in `rematch`, `matching`, or `matched` would be highlighted as a keyword, which is definitely a problem we want to avoid. So we check `previous_separator` to make sure a separator came before the keyword, before looping through each possible keyword. Recall that `keywords` can contain any number of pairs, each made up of a color and an arbitrary number of keywords. So we expand the macro repetition (`$( )*`) to get each such pair. We then expand the repetition again so we can operate on each keyword. Note that the color is still available in this scope.
Similar to how we implemented comments, for each keyword we first determine whether slicing `render` at that position would go out of bounds. We do that check before comparing whether the current character is the start of a keyword. We also check whether the character after the keyword is a separator. If the keyword is the last word on the line, we still consider it a keyword. Next, we highlight the whole keyword. After highlighting, we increase `i` by the keyword’s length, set `previous_separator`, and then `continue` the loop.
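The keyword check can be illustrated without the macro machinery. In the sketch below the keywords are passed as a plain slice and all share one `Keyword` highlight type; both are simplifications assumed for the example, whereas the series expands the keywords and their colors inside the macro. Setting `previous_separator` to `false` after a keyword is also an assumption.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    Keyword,
}

// Assumed separator set: ASCII whitespace plus common punctuation.
fn is_separator(c: u8) -> bool {
    c.is_ascii_whitespace() || b",.()+-/*=~%<>\"';&[]".contains(&c)
}

fn highlight_keywords(render: &str, keywords: &[&str]) -> Vec<HighlightType> {
    let bytes = render.as_bytes();
    let mut highlight = vec![HighlightType::Normal; bytes.len()];
    let mut previous_separator = true;
    let mut i = 0;
    'outer: while i < bytes.len() {
        if previous_separator {
            for keyword in keywords {
                if keyword.is_empty() {
                    continue; // an empty keyword would never advance `i`
                }
                let end = i + keyword.len();
                // Bounds check first, then compare, then look at what follows.
                let fits = end <= bytes.len();
                let followed_by_separator =
                    end >= bytes.len() || is_separator(bytes[end]);
                if fits && &bytes[i..end] == keyword.as_bytes() && followed_by_separator {
                    highlight[i..end].fill(HighlightType::Keyword);
                    i = end;
                    previous_separator = false;
                    continue 'outer;
                }
            }
        }
        previous_separator = is_separator(bytes[i]);
        i += 1;
    }
    highlight
}

fn main() {
    println!("{:?}", highlight_keywords("match rematch matched", &["match"]));
}
```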
Colorful Multi-line Comments 🖇
We have one last feature to implement: multi-line comment highlighting. We’ll color multi-line comments with the same color as single-line comments. We’ll let each language specify a multi-line comment start and end. In Rust these are `/*` and `*/` respectively.
We use a tuple to hold the information about the start and end of multi-line comments. The first value refers to the start and the second value refers to the end.
Now for the highlighting code. We won’t worry about multiple lines just yet.
First we add an `in_comment` boolean variable to keep track of whether we’re currently inside a multi-line comment. Moving down into the `while` loop, we check to make sure we’re not in a string, because having `/*` inside a string doesn’t start a comment in virtually all languages. If we’re currently in a multi-line comment, then we can safely highlight the current character. We then check whether we’re at the end of the comment. If we are, we highlight the rest of the string that marks the comment’s end. If we’re not at the end of the comment, we simply consume the current character, which we already highlighted.
If we’re not currently in a multi-line comment, then we check if we’re at the beginning of a multi-line comment. If so, we highlight the whole multi-line comment start string, set `in_comment` to true, and then increase `i` appropriately.
Now let’s fix a bit of a complication that multi-line comments add: single-line comments should not be recognized inside multi-line comments.
Now let’s work on highlighting multi-line comments that actually span multiple lines. To do this, we need to know whether the previous line is part of an unclosed multi-line comment. Let’s add an `is_comment` boolean variable to the `Row` `struct`:
Now, the final step:
We now assign `in_comment` before `current_row`, since `in_comment` borrows `editor_rows` immutably while `current_row` borrows it mutably, and we can’t take an immutable borrow of a value while a mutable borrow of it is still in use. We initialize `in_comment` to true if the previous row has an unclosed multi-line comment. If that’s the case, then the current row will start out being highlighted as a multi-line comment.
At the bottom of `update_syntax()`, we set the value of the current row’s `is_comment` to whatever state `in_comment` got left in after processing the entire row. That tells us whether the row ended as an unclosed multi-line comment or not. Then we have to consider updating the syntax of the next lines in the file. So far, we have only been updating the syntax of a line when the user changes that specific line. But with multi-line comments, a user could comment out an entire file just by changing one line. So it seems like we need to update the syntax of all the lines following the current line. However, we know the highlighting of the next line will not change if the value of this line’s `is_comment` did not change. So we check if it changed, and only call `update_syntax()` on the next line if `is_comment` changed (and if there is a next line in the file). Because `update_syntax` keeps calling itself with the next line, the change will continue to propagate to more and more lines until one of them is unchanged, at which point we know that all the lines after that one must be unchanged as well.
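The propagation logic can be sketched on its own as follows. This is a simplified assumption: the function is free-standing, it hard-codes `/*` and `*/`, it only handles multi-line comments (no strings, numbers, or keywords), and the `Row` struct carries just the fields needed here.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
enum HighlightType {
    Normal,
    MultilineComment,
}

// Simplified stand-in for the editor's Row (assumed fields).
struct Row {
    render: String,
    highlight: Vec<HighlightType>,
    is_comment: bool, // true if the row ends inside an unclosed comment
}

fn update_syntax(at: usize, editor_rows: &mut Vec<Row>) {
    let start: &[u8] = b"/*";
    let end: &[u8] = b"*/";
    // Read the previous row's state before borrowing the current row mutably.
    let mut in_comment = at > 0 && editor_rows[at - 1].is_comment;
    let current_row = &mut editor_rows[at];
    let bytes = current_row.render.as_bytes();
    let mut highlight = vec![HighlightType::Normal; bytes.len()];
    let mut i = 0;
    while i < bytes.len() {
        if in_comment {
            if bytes[i..].starts_with(end) {
                highlight[i..i + end.len()].fill(HighlightType::MultilineComment);
                i += end.len();
                in_comment = false;
            } else {
                highlight[i] = HighlightType::MultilineComment;
                i += 1;
            }
        } else if bytes[i..].starts_with(start) {
            highlight[i..i + start.len()].fill(HighlightType::MultilineComment);
            i += start.len();
            in_comment = true;
        } else {
            i += 1;
        }
    }
    current_row.highlight = highlight;
    // Only re-highlight the next row if this row's end state changed.
    let changed = current_row.is_comment != in_comment;
    current_row.is_comment = in_comment;
    if changed && at + 1 < editor_rows.len() {
        update_syntax(at + 1, editor_rows);
    }
}

fn main() {
    let mut rows: Vec<Row> = ["a /* b", "c", "d */ e"]
        .iter()
        .map(|s| Row { render: s.to_string(), highlight: Vec::new(), is_comment: false })
        .collect();
    update_syntax(0, &mut rows);
    for row in &rows {
        println!("{:?} ends in comment: {}", row.render, row.is_comment);
    }
}
```

Calling `update_syntax` on the first row is enough here: the change in `is_comment` carries the update down to the rows that follow, and the recursion stops at the first row whose end state does not change.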
That’s It! 🎊🎉
Our text editor is finished. You can find the repository here. You can create an issue if you have any questions, or if you spot a typo or anything similar. You can also email me. If you’d like to support me, you can donate using this link and suggest more cool tutorials you’d like to have.
I’ll also release more tutorials like this one, so if you haven’t already followed, you probably should!
We can link up on Upwork to work together, or if you’d like more assistance on your way to becoming a proficient software engineer.
Ideas For Features To Implement On Your Own
You can add more features that you’d like. Here are some suggestions.
If you want to extend the program on your own, I suggest trying to actually use it as your text editor for a while. You will very quickly become painfully aware of all sorts of features you’re used to having in a text editor, but are missing in `pound`. Those are the features you should try to add.
Here are some ideas you could try, roughly in order of increasing difficulty:
- More File Types: Add more syntax rules for various programming languages. You can do this simply by using the `syntax_struct!` macro; don’t forget to include it in `Output::select_syntax()`.
- Line Numbers: Display the line number to the left of each line of the file.
- Soft indent: If you like using spaces instead of tabs, make the Tab key insert spaces instead of `\t`.
- Auto indent: When starting a new line, indent it to the same level as the previous line.
- Soft-wrap lines: When a line is longer than the screen width, use multiple lines on the screen to display it instead of horizontal scrolling.
- Copy and paste: Give the user a way to select text, and then copy the selected text when they press `Ctrl-C`, and let them paste the copied text when they press `Ctrl-V`.
- Mouse Control: Allow the user to scroll through the file using the mouse. You can also use this to prevent the user from scrolling out of the file when scrolling down. You can also allow the user to insert characters where the last left mouse button click occurred. You can find a head start here.
- More Coloring: Other expressions such as function names, various “types” such as `String`, `Option` and so on, along with `macro_rules` and various macros could use some highlighting. Currently, “lifetimes” aren’t highlighted well, so you might also want to fix that.
- Config file: Have the editor read a config file (maybe named `.{opened_file}.pc`) to set options that are currently constants, like `TAB_STOP` and `QUIT_TIMES`. Try to make more things configurable.
- Multiple buffers: Allow having multiple files open at once, and have some way of switching between them.
- UTF-8: This might seem as though it isn’t a big issue but, as the Rust book says, strings aren’t so simple. Currently, invalid UTF-8 characters would crash the editor. Also, characters which are more than 1 byte (such as emojis) would cause the editor to crash when scrolling through those characters. You could use iterators such as `chars().skip().take()` instead of direct indexing (which panics). This also includes modifying the editing features of the program to insert and delete characters without panicking, as well as fixing the cursor position in the presence of these multi-byte characters.