General Bash Usage — 2 (Grep-Awk)

Cemre Acar
CARBON CONSULTING
5 min read · Jan 18, 2022


In the previous article, I covered basic shell programming as part of general bash usage. In this article, I will talk about regular expressions, which support that usage. Regular expressions play an important role in computer science applications. For example, in applications that work with text, users often want to search for lines that match a certain pattern. Regular expressions (regex) give us a powerful method for this, and utilities such as grep and awk, as well as many text editors, let us define patterns with them. In this article, I’ll show you how to use regular expressions with grep and awk.

What is Grep? (Global Regular Expression Print)

In short, grep searches a text for a pattern and prints the lines that match, optionally highlighting the matched part.

Basic Operations with Grep

Let’s say we have a .txt file and want to list the places where the word “linux” is mentioned in it. The following command is enough for this.

grep "linux" test.txt

This command prints the lines containing the word “linux” on the screen. So we have learned how to select a word. Now let’s look at excluding a given word.

grep -v "linux" test.txt

This command excludes the lines where the word “linux” is mentioned and lists all the remaining lines. There is a detail here: the match is case-sensitive, so lines where the word is written as “Linux” or “LINUX” are not treated as matches. The -i option makes the search ignore case, and it can be combined with -v when excluding.

grep -i "linux" test.txt
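The three options above can be tried side by side on a throwaway file. This is only a minimal sketch: the file contents and the variable names (match, exclude, nocase) are made up for illustration.

```shell
# Hypothetical sample file; the contents are invented for this demo.
tmp=$(mktemp)
printf '%s\n' 'I use linux daily' 'Windows is installed too' 'LINUX rocks' > "$tmp"

match=$(grep "linux" "$tmp")        # lines containing "linux" (case-sensitive)
exclude=$(grep -v "linux" "$tmp")   # lines that do NOT contain "linux"
nocase=$(grep -i "linux" "$tmp")    # lines containing "linux" in any letter case

printf 'match:\n%s\n\nexclude:\n%s\n\nnocase:\n%s\n' "$match" "$exclude" "$nocase"
rm -f "$tmp"
```

Notice that the “LINUX rocks” line lands in the exclude group, because without -i it does not count as a match.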

In the same way, we can list files instead of lines. Let’s do this with the following command.

grep -l "a" *.txt

This command lists the names of all .txt files whose contents include the letter “a”. Note that -l looks inside the files, not at their names.
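To see what -l really checks, here is a small sketch with two files whose names and contents are invented for the demo. One file has “a” only in its name, the other only in its contents.

```shell
# Hypothetical directory and file names, created only for this demo.
dir=$(mktemp -d)
printf 'banana\n' > "$dir/fruit.txt"   # contents include "a"
printf 'xyz\n'    > "$dir/apple.txt"   # name includes "a", contents do not

# -l prints a file name once if the file's CONTENTS have at least one match.
listed=$(cd "$dir" && grep -l "a" *.txt)
echo "$listed"

rm -rf "$dir"
```

Only fruit.txt is listed. To filter by file name instead, we would pipe ls into grep.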

grep has many more uses beyond the ones above.

With the command below, we search test.txt for a four-character sequence that starts with a lowercase letter, continues with any two characters, and ends with s, and we highlight each match with the --color option.

grep --color '[a-z]..s' test.txt

We can use the same highlighting when searching for a file in our directory. In the examples below, let’s list files by their first letter and by their last letter.

ls | grep --color '^[Aa]'

lists the files by initial letter, and

ls | grep --color '[Aa]$'

lists them by last letter. Here, we can also use [0-9] instead of [Aa] to list files whose names start or end with a digit.
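The anchors can be tested without touching a real directory. In the sketch below, a made-up list of names stands in for the output of ls; the names and variables are assumptions for illustration only.

```shell
# Hypothetical file names standing in for the output of ls.
names=$(printf '%s\n' 'Apple.txt' 'banana' 'data' 'report2')

starts_a=$(printf '%s\n' "$names" | grep '^[Aa]')     # names beginning with A or a
ends_a=$(printf '%s\n' "$names" | grep '[Aa]$')       # names ending with A or a
ends_digit=$(printf '%s\n' "$names" | grep '[0-9]$')  # names ending with a digit

echo "starts with a: $starts_a"
echo "ends with a:   $(echo $ends_a)"
echo "ends in digit: $ends_digit"
```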

We can also use the POSIX character classes. Let’s take an example to illustrate this. Using the ls command again, let’s list the files whose names contain at least one lowercase letter as follows.

ls | grep --color '[[:lower:]]'
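A character class on its own matches names that contain a lowercase letter anywhere. If we want names made up of lowercase letters only, we can anchor the class at both ends. The sketch below compares the two; the name list is invented for the demo.

```shell
# Hypothetical file names standing in for the output of ls.
names=$(printf '%s\n' 'notes' 'README' 'Makefile' 'file1')

# At least one lowercase letter somewhere in the name.
any_lower=$(printf '%s\n' "$names" | grep '[[:lower:]]')

# Nothing but lowercase letters from start (^) to end ($).
only_lower=$(printf '%s\n' "$names" | grep '^[[:lower:]][[:lower:]]*$')

echo "any:  $(echo $any_lower)"
echo "only: $only_lower"
```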

So far we’ve talked about basic regular expressions. Let’s talk a little bit about extended grep (egrep, equivalent to grep -E). Basic regular expressions, as we have seen, are built from the ^ $ . [ ] and * metacharacters. Extended grep adds the ( ) { } ? + and | metacharacters.

Let’s take an example of using extended grep. Let’s say we have a students.txt file that contains student numbers and names.

Text File Content:
20170602038 Cemre
20170602042 Canan
20160342467 Ali
20142425001 Ayşe

In such a file, we can highlight the numbers with the following command.

egrep --color '^0?[0-9]+' students.txt

With a different command on the same file, we can highlight only the first 8 digits of each number.
The additional expression we use here is the { } metacharacters. This expression, called a quantifier, lets us specify how many times the preceding element must repeat: {8} means exactly 8, while {2,5} sets a minimum of 2 and a maximum of 5.

egrep --color '^0?[0-9]{8}' students.txt
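Quantifiers are also handy for checking the shape of the data. As a rough sketch, assuming a valid student number has exactly 11 digits (as in the sample above), we can separate well-formed lines from short ones. The malformed line and the variable names are made up for the demo, and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical data: the middle line has a deliberately short number.
tmp=$(mktemp)
printf '%s\n' '20170602038 Cemre' '2016034 Ali' '20142425001 Ayse' > "$tmp"

exact=$(grep -E '^[0-9]{11} ' "$tmp")    # exactly 11 digits, then a space
short=$(grep -E '^[0-9]{1,8} ' "$tmp")   # between 1 and 8 digits, then a space

printf 'well-formed:\n%s\n\ntoo short:\n%s\n' "$exact" "$short"
rm -f "$tmp"
```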

What is awk?

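Awk is a text-processing tool similar to grep and sed. Besides the original awk, there are two more advanced versions named NAWK and GAWK. It takes its name from the initials of its developers: Aho, Weinberger, and Kernighan.

An awk command is written in the form awk '…' file, where the program goes between the single quotes.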

Basic Operations with Awk

We can see its various uses in detail through examples.

E.g.:
With the ls command, we can select and print only certain parts of the listing for any of our directories.

ls -l | awk '{print "File Name: "$9" File Permissions: "$1}'

Or suppose we have a file of names. The names are in a text file called file.txt, one per line.

awk '{print $1}' file.txt

This command prints the first field of every line, which gives me my names.

We can also use the -F option, which sets the field separator, as follows. Here the fields are separated by one or more tabs, and we print the second field.

awk -F"\t+" '{print $2}' file.txt
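Here is a small sketch of the -F option in action. The tab-separated file is invented for the demo, and the second line uses two tabs on purpose to show why the + in the separator helps.

```shell
# Hypothetical tab-separated data; the second line has two tabs in a row.
tmp=$(mktemp)
printf 'Cemre\t809\nCanan\t\t905\n' > "$tmp"

# One or more tabs count as a single separator, so $2 is the number on both lines.
second=$(awk -F'\t+' '{print $2}' "$tmp")
echo "$second"

rm -f "$tmp"
```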

Let’s also mention the NF and NR variables and the length function that we can use in commands:
NF : The number of fields in the current line (fields are separated by whitespace by default),
NR : The number of the current record (line); in an END block it holds the total number of lines,
length : Returns the number of characters in the line.

We can use them as follows.

awk '{print NF}' file.txt
awk '{print NR}' file.txt
awk '{print length}' file.txt
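All three can be printed together to see how they change from line to line. This is a minimal sketch on a two-line file made up for the demo.

```shell
# Hypothetical two-line file, invented for this demo.
tmp=$(mktemp)
printf '%s\n' 'Cemre 809' 'Erol 720 Additional Area' > "$tmp"

# For every line: line number, field count, character count.
report=$(awk '{print NR, NF, length($0)}' "$tmp")

# In the END block, NR still holds the number of the last line read.
total=$(awk 'END {print NR}' "$tmp")

printf '%s\ntotal lines: %s\n' "$report" "$total"
rm -f "$tmp"
```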

Additionally, we may want to specify a condition in a command. Below, we want to print only the lines that have exactly two fields (NF).

Text File Content:
Cemre 809
Canan 905
Mustafa 997
Erol 720 Additional Area
Huriye 800
Günnur 600 Additional Field
Azra 650
Burcu 307
Halim 345 Additional Domain

When we run the following command on this text file, only the two-field lines are printed.

awk '{if (NF==2) {print}}' file.txt
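As a side note, awk also lets us write the condition as a pattern in front of the action, and when the action is left out the matching line is printed by default. The sketch below, on a shortened made-up version of the file, suggests that both forms give the same result.

```shell
# Hypothetical shortened version of the sample file.
tmp=$(mktemp)
printf '%s\n' 'Cemre 809' 'Erol 720 Additional Area' 'Azra 650' > "$tmp"

with_if=$(awk '{if (NF==2) {print}}' "$tmp")   # condition inside the action
pattern_only=$(awk 'NF == 2' "$tmp")           # condition as a pattern, default print

echo "$pattern_only"
rm -f "$tmp"
```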

In yet another example, let’s create a structure that allows us to query the data in our file.

E.g.:
Let’s say we have a file like the one below. I named it file.txt.

Text File Content:
1200
Cemre
Izmir

Let’s query this file with an awk script. I created a file called myfile.awk with the following contents.

/^[0-9]+$/ {print "This line consists of a number."}
/^[A-Za-z]+$/ {print "This line consists of text."}
/^$/ {print "This line is a blank line."}

Next, let’s type the required command.

awk -f myfile.awk file.txt

After running this command, a message describing each line of our .txt file is printed.
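The whole flow can be reproduced in a temporary directory. The sketch below is a variation, not the exact script from the article: it uses shorter messages, prefixes each one with the line number (NR), and adds a blank line to the data so that all three rules fire.

```shell
# Hypothetical working directory; file names follow the article's example.
dir=$(mktemp -d)
printf '1200\nCemre\n\nIzmir\n' > "$dir/file.txt"

# Each rule is "pattern { action }"; the patterns are anchored with ^ and $
# so that the whole line has to match.
cat > "$dir/myfile.awk" <<'EOF'
/^[0-9]+$/    {print NR ": number"}
/^[A-Za-z]+$/ {print NR ": text"}
/^$/          {print NR ": blank"}
EOF

result=$(awk -f "$dir/myfile.awk" "$dir/file.txt")
echo "$result"
rm -rf "$dir"
```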

We can also go over the examples below for programming with Awk.

E.g.:
Let’s open a new blank .txt file and add 4-5 empty lines to it. Then the following command will count the blank lines for us. The counter x is increased for every blank line and printed once at the end.

awk '/^$/ { x++ } END { print x + 0 }' newfile.txt

As an example from a different angle, we can calculate grade averages with awk.
Let’s create a note.txt file with the following structure.

Text File Content:
Ahmet 56 67 83 90 75
Selim 75 88 92 100 60
Meric 45 38 90 72 81

And then let’s create a calculate.awk file and make its contents as follows.

{
total = $2 + $3 + $4 + $5 + $6
average = total / 5
print $1 , average
}

Now, finally, let’s give our command and print the average grade values of the people in the .txt file to the screen.

awk -f calculate.awk note.txt
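If the number of grades per student may change, one option is to loop over the fields instead of naming them one by one. This is only a sketch of that idea as a one-liner, using the same sample data; the loop variable i and the NF > 1 guard are my additions.

```shell
# Same sample data as note.txt, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Ahmet 56 67 83 90 75' 'Selim 75 88 92 100 60' 'Meric 45 38 90 72 81' > "$tmp"

# For lines with at least one grade: sum fields 2..NF, divide by the grade count.
averages=$(awk 'NF > 1 {
    total = 0
    for (i = 2; i <= NF; i++) total += $i
    print $1, total / (NF - 1)
}' "$tmp")

echo "$averages"
rm -f "$tmp"
```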

Finally, as a small example, here is how to find the largest and smallest values in the first column of a file with awk. I think it will be useful for you. Both values start from the first line of the file, so the result does not depend on the range of the numbers.

Finding the Biggest and Smallest Value:

awk 'NR == 1 {max = $1; min = $1} { if ($1 > max) {max = $1}; if ($1 < min) {min = $1} } END {print max, min}' file.txt
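One thing to watch in a max/min one-liner is how the starting values are chosen. The sketch below compares fixed starting values (-999 and 999) with starting from the first line, on made-up numbers that are all larger than 999.

```shell
# Hypothetical data: every value is above 999.
tmp=$(mktemp)
printf '%s\n' 1500 2300 1200 > "$tmp"

# Fixed starting values: min never drops below 999, so the minimum is wrong.
sentinel=$(awk 'BEGIN {max=-999; min=999} {if ($1 >= max) max=$1; if ($1 <= min) min=$1} END {print max, min}' "$tmp")

# Starting from the first line: works for any range of numbers.
seeded=$(awk 'NR==1 {max=$1; min=$1} {if ($1 > max) max=$1; if ($1 < min) min=$1} END {print max, min}' "$tmp")

echo "fixed start: $sentinel"
echo "first line:  $seeded"
rm -f "$tmp"
```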

Cemre Acar
CARBON CONSULTING

A computer engineer who likes to create design, has a high business awareness, working as a Front-End Developer, tries to do the job in the best way.