7 of my favourite simple Sublime Text features for finding and extracting text
Sublime Text is best known as a text editor for professional programmers. And no wonder: it has many features designed to speed up writing, analysing and editing code, and most plug-ins for it are created by developers for developers.
But in my opinion, those who work in Open Source Intelligence should take a closer look at this text editor. So should anyone who regularly handles CSV, JSON, XML and other data files.
It has a lot of features for professional work with text data. In many (though not all) cases it can replace command line utilities such as grep, csvgrep, jq, sed, awk, head, tail etc.
Besides, there is a huge number of user-created plugins, packages and macros for Sublime Text, which let you automate a wide range of tasks.
It also works very well with files that have several hundred thousand lines or more. For example, last week I tried to open a table of 1.3 million lines in Excel, but it opened incompletely and displayed a warning that the file was too big (an Excel worksheet holds at most 1,048,576 rows). Sublime Text handled it without any problems.
In this article I will show you the simplest Sublime Text tricks, which will help you save time and see how good this editor is.
But if for some reason you don’t like Sublime Text, you can find some of its features in alternatives. For example, Notepad++ also supports macros and regular expression searches.
In my own work I usually use Sublime Text build 3211 with custom appearance settings and plugins. But for this article I additionally installed build 4126 (unregistered) with standard settings to make it easier for the reader to replicate what is shown in the screenshots.
1. Find All
Press Ctrl (Command)+F.
Enter any word that appears more than once in the text into the search box.
DO NOT change any settings (always check before searching that you have not accidentally pressed any of the buttons next to the search field, this is important).
Click Find All
Now the most important part:
Press Ctrl (Command) + C
Create a new empty file (Ctrl (Command) + N)
Press Ctrl (Command) + V
We have just learned the most important and basic text extraction technique in Sublime Text. In the following sections we will repeat it and learn how to put it to better use than getting a list of 500 lines of “@hotmail.com”.
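For readers who want to check the result outside the editor, here is a minimal Python sketch of the same idea: collect every occurrence of a word and put each one on its own line, as happens when the copied selections are pasted into a new file. The function name find_all and the sample text are made up for illustration, and the case-insensitive matching is my assumption about the search settings, not something the editor guarantees.

```python
import re


def find_all(text: str, word: str) -> list[str]:
    """Return every occurrence of `word` in `text`.

    Assumption: the search ignores case. Adjust the flags if your
    search settings differ.
    """
    # re.escape makes the word a literal, so "." or "+" are not regex operators
    return re.findall(re.escape(word), text, flags=re.IGNORECASE)


# Hypothetical sample data
sample = "a@hotmail.com\nb@gmail.com\nC@HOTMAIL.COM\n"

matches = find_all(sample, "@hotmail.com")

# One match per line, like pasting the copied selections into a new file
extracted = "\n".join(matches)
print(extracted)
```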
2. Selection expanding
So, we’re doing it again.
Ctrl (Command) +F -> word in search box -> Click Find All.
And now for the most important part:
Press Ctrl (Command) + L
This expands every selection from the matched fragment to the whole line in which it occurs.
Then, as in the example above, simply copy and paste them into a new text file.
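The same "whole lines that contain the word" extraction can be sketched in a few lines of Python, roughly what grep would do. The function name lines_containing and the sample rows are hypothetical, and ignoring case is again my assumption.

```python
def lines_containing(text: str, word: str) -> list[str]:
    """Return the full lines of `text` that contain `word` (case ignored)."""
    needle = word.lower()
    return [line for line in text.splitlines() if needle in line.lower()]


# Hypothetical sample data
table = (
    "1;Alice;alice@hotmail.com\n"
    "2;Bob;bob@gmail.com\n"
    "3;Carol;carol@HOTMAIL.com\n"
)

selected = lines_containing(table, "@hotmail.com")
print("\n".join(selected))
```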
3. Searching using regular expressions
In this section we repeat the same steps as in the first two, but press the “.*” button (regular expression mode) and enter the following sequence of characters into the field:
[-_a-zA-Z0-9.+!%]*@[-_a-zA-Z0-9.]*
This should find all the email addresses in the document.
Similarly, you can search for hyperlinks, html tags, domain names, IP addresses, telephone numbers, postal codes and much more with regex.
If you don’t know how to use regular expressions, you can read this article and learn the basics in 15 minutes.
When you search for and compose regular expressions for Sublime Text, remember that the editor supports PCRE syntax (Perl Compatible Regular Expressions).
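To see what the email pattern from this section actually matches, it can be tried in Python. This is only a sketch: Python's re module is a different regex engine from the one in Sublime Text, although this particular pattern uses nothing engine-specific. The sample sentence is invented.

```python
import re

# The pattern from this section, with plain hyphens in the ranges
EMAIL_RE = re.compile(r"[-_a-zA-Z0-9.+!%]*@[-_a-zA-Z0-9.]*")

# Hypothetical sample text
text = "Contact: john.doe+news@example.com, admin@mail.example.org; no address here"

emails = EMAIL_RE.findall(text)
print(emails)

# Both sides of "@" use "*" (zero or more characters),
# so a lone "@" is also reported as a match.
lone_at = EMAIL_RE.findall("a @ b")
print(lone_at)
```

If the lone “@” matches are a problem in your data, replacing each “*” with “+” requires at least one character on both sides of the “@”.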
4. Removing duplicates
Press F9 (F5 on macOS) or click Edit -> Sort Lines
Click Edit -> Permute Lines -> Unique.
In my experience, CSV and JSON data files very often contain duplicates (especially leaked files). So it’s worth remembering to run this function before working with any table.
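The two steps can be reproduced in Python for files you process in scripts. This is a rough sketch with hypothetical function names and data. Note that Python's sorted() compares case-sensitively, which may order lines differently from the editor's Sort Lines command.

```python
def unique_lines(lines: list[str]) -> list[str]:
    """Drop repeated lines, keeping the first occurrence and the original order."""
    # dict preserves insertion order, so this is an order-preserving dedup
    return list(dict.fromkeys(lines))


def sorted_unique_lines(lines: list[str]) -> list[str]:
    """Sort the lines and drop duplicates (case-sensitive comparison)."""
    return sorted(set(lines))


# Hypothetical sample data
rows = ["bob@example.com", "alice@example.com", "bob@example.com"]

print(unique_lines(rows))
print(sorted_unique_lines(rows))
```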
5. Adding line breaks, tabs and other invisible characters
Sometimes it happens that you have to work with a file that is extremely difficult to understand. For example, all the lines are merged into one, or there is no tabulation in the table. The regular expressions mentioned in the third section can help.
Click Find -> Replace
Enter \| in the Find field (with the “.*” regular expression button switched on).
Enter |\n in the Replace field.
Click Replace All.
Each cell in the table now starts on a new line.
This example is not very useful, but I think it’s very clear. Other “invisible” characters can be added to a text document in the same way.
For example:
\t — tabulation
\r — carriage return
\v — vertical tab
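\A — start of the text (an anchor for the Find field, not a character to insert)
\Z — end of the text (also an anchor for the Find field)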
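The pipe replacement from this section looks like this in Python. It is a sketch on an invented one-line table. The pipe is escaped in the pattern because an unescaped “|” means “or” in a regular expression.

```python
import re

# Hypothetical table whose rows were merged into a single line
merged = "id|name|email"

# Find: \|   Replace: | followed by a line break
split_up = re.sub(r"\|", "|\n", merged)
print(split_up)

# The same approach works for other invisible characters, e.g. a tab
tabbed = re.sub(r"\|", "\t", merged)
```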
6. Find in files
Click Find -> Find in Files
Enter any word in Find: field
Select the files or folders you want to search in (by default, the search is in those currently open)
Click Find
This will open a new tab with the search results, showing in which files and on which lines the word or regular expression you were looking for occurs.
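A simplified version of this search can be sketched in Python for cases where the results need further processing. The function name find_in_files is hypothetical, the search is a plain case-sensitive substring test, and reading every file as UTF-8 is an assumption that will not suit all data.

```python
from pathlib import Path


def find_in_files(root: str, word: str) -> list[tuple[str, int, str]]:
    """Return (file path, line number, line) for every line containing `word`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Assumption: files are text; undecodable bytes are replaced
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # skip files that cannot be read
        for number, line in enumerate(text.splitlines(), start=1):
            if word in line:
                hits.append((str(path), number, line))
    return hits
```

Calling find_in_files("some_folder", "hotmail") returns a list that can be printed or written to a new file, one hit per line.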
7. Recording macros
Sublime Text has the ability to record user actions in the same way as MS Office and many other applications.
But unfortunately macros in Sublime Text don’t record Find/Replace actions (despite all user complaints), so this point doesn’t really fit in this article. Still, the feature itself is very interesting.
Click Tools -> Record Macro
Do anything (for example, simply insert a few blank lines somewhere)
Click Tools -> Stop recording macro
Click Tools -> Save macro
After saving the macro, you can open it as a text file, edit it as you wish or add additional commands.
The available commands are listed in the Commands Reference of the Sublime Text Unofficial Documentation.
You can run recorded macros by clicking on Tools -> Macros -> User -> macro name.
Before you create any complex macro, check if someone has already created a plugin or macro with similar functionality. You can check this at https://packagecontrol.io/ (+ additionally on StackOverflow).
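A saved macro is a JSON list of commands, so it can also be generated by a script instead of being edited by hand. The sketch below builds a macro that inserts a few blank lines, like the example recorded above. The function name build_macro is made up. The "insert" command with a "characters" argument is what the editor records for typed text, but compare the output with a macro you recorded yourself before relying on it.

```python
import json


def build_macro(blank_lines: int) -> str:
    """Return the text of a macro that inserts `blank_lines` line breaks."""
    commands = [
        {"command": "insert", "args": {"characters": "\n"}}
        for _ in range(blank_lines)
    ]
    return json.dumps(commands, indent=4)


macro_text = build_macro(3)
print(macro_text)
```

The printed text can be saved with the .sublime-macro extension in the User package folder, for example under a name such as insert_blank_lines.sublime-macro (a hypothetical name).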
That’s the end of the article. It only shows a few of the simplest tricks, but I hope you’ve realised the power of Sublime Text. This text editor has a huge number of features which, with persistence and imagination, can be used to do absolutely amazing things.
Thank you so much for visiting my blog and reading this article to the end!