Learning while teaching — Class 3

Intro

Hi everyone :) Were you able to complete the challenge from last class? Yes, no? Did you understand what each of those lines of code does? Did you run into any problems? If so, feel free to reach me at me@lucavgobbi.com or join my Slack channel. If you don’t know how to join, just send me an email and I’ll send you an invite.

In today’s class we will continue to work on our application. In the theoretical part I’ll explain a little bit about non-blocking IO and callbacks, and in the practical part we are going to create a middleware. Don’t worry if you don’t know what a middleware is, we will learn that later today.

Also, I was rereading Class 1 and realized that maybe I wasn’t clear enough about Class and Prototype. So I’ve made some changes and highlighted them for you, go back to Class 1 and take a look ;).

The code, where is it?

The complete code will be available on my GitHub under https://github.com/lucavgobbi/lwt-class3

Oh, some of you might be wondering why I’m using a new repository for each class and not just tags or branches or whatever. The reason is that right now the classes are not “connected” in code, which means that the code from one class is not used in the next one. In later classes we will probably start something more “connected”, and then only one repository will be used :).

IO and Non blocking IO

You have probably already heard or read the term IO. If you haven’t, it stands for Input/Output, which means everything that your app “receives” from outside and “returns” to the outside. The most basic example is your keyboard and monitor: you can input data to your app using your keyboard and the app can output data back to you through your screen.

Network communications, DVDs (do they still exist?), hard drives, SSDs, mice, touchscreens and lots of others are IO devices, and when you need to access them from your code you are performing an IO operation.

Blocking IO

If you have used other programming languages before, you might have used commands like .readKey() or .print(). They of course vary from language to language, but these two, for example, read a key pressed on the keyboard and print something to the user’s screen. In most languages these are blocking operations, which means that your code will stop inside these functions until they are completed. The first waits for the user to press a key and the second waits for the operating system to finish drawing on the screen.

In our kind of application, which is a web application, the most common IO is network IO: every request received and every response sent is an IO operation that reads something from the network interface or writes something back to it. Of course a lot of the steps involved, such as the HTTP, TCP and IP protocols, hardware communication and others, are abstracted for you and you don’t have to think about them, although I highly recommend that you do some reading on these topics to have a better understanding of the whole process.

So, back to network IO. If the IO is blocking, once the app starts listening for requests no other code is executed, because the app is blocked waiting for IO (yes, you can work with threads, but remember that we are not using any threads in our code). When we receive a request, the app executes your code for that kind of request and writes the response. When we have finished writing the response we have to wait until it has been sent, and only after that can we accept a new request. In summary: when you call a blocking IO operation you need to wait for it to finish before the next line of code is executed.

Non blocking IO

So, what is the difference here? As you might imagine, with a non-blocking IO function my code continues to execute even if I still don’t have the result of the IO operation. This situation is easy to imagine and understand for operations where you don’t need to know the answer or the result. For example: I want to print something on the user’s screen or save a file to the disk, but I don’t need to know if the operation was completed (as a rule of thumb, you always want to know if an operation was completed without error).

But what happens with operations like reading a file, or reading the user input? These are the kinds of operations where you need the result. Well, in Node.js almost all of these operations are non-blocking by default, or asynchronous if you prefer (blocking variants exist, but we will not use them here). This means that after calling a function to read the keyboard the rest of your code continues to run, and in order to obtain the result you must provide a callback or use a promise (at this early stage we will only study callbacks, but later I’ll talk about promises).

When the non-blocking function finishes, it will execute your callback with some parameters. For instance, if you want to read a file you might use a function to do it. This function will receive some parameters, like the file path and the callback function (in JavaScript functions are objects too, so we can pass them as parameters). And guess what? After the file has been read (reading a file usually means having its content available to be used in the program), the callback you provided as a parameter will be executed! Awesome, right?

Callbacks!

But where is the content of the file? Well, this is what, in my experience, people take some time to understand: the file content is passed as a parameter to your callback. If you look at Node’s documentation for fs.readFile() you will see that the callback receives two parameters, err and data. The parameter err contains any error that occurred during the file read, such as “file not found”. If err is null, it means that everything is ok and you can access the content of the file through data. If you already program in another language you might be asking yourself: why not throw an error? The answer is related to Node’s nature: since the read is a non-blocking operation and we probably have more code running in the meantime, do we want the whole program to stop? Of course not! One thing that you will notice is that almost every callback has err as its first parameter. Although it’s not required, this convention is widely used and highly recommended for every callback (keep that in mind when writing your own functions), since most developers using Node expect this behavior.
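Here is a small sketch of that convention applied to a function of our own. The countLines helper and the file name are made up for this example; only fs.readFile() comes from Node.

```javascript
var fs = require('fs');

// Hypothetical helper: counts the lines of a file and reports the result
// through an error-first callback, the same shape fs.readFile uses.
function countLines(path, callback) {
  fs.readFile(path, 'utf8', function (err, data) {
    if (err) {
      // Pass the error along to the caller instead of throwing it.
      return callback(err);
    }
    // null in the first position means "no error".
    callback(null, data.split('\n').length);
  });
}

// Assumption: this file does not exist, so we take the error path.
countLines('this-file-does-not-exist.txt', function (err, count) {
  if (err) {
    console.log('Something went wrong:', err.code);
    return;
  }
  console.log('Lines:', count);
});
```

Whoever calls countLines checks err first and only then touches the result, exactly as you would with fs.readFile().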

So after this “short” explanation I hope that you now have a better idea of what a Non blocking language/operation is and what a callback is. Now, let’s go to the code ;).

Middleware

Do you remember that word from the beginning of the post? Now it’s time to dig into it. Middleware, like the name suggests, is something in the middle :). It’s used to glue code together, to add functionality between two other pieces of functionality.

In this class we will focus on Express.js middleware and we will apply the two concepts which you learned earlier today: callbacks and Non-blocking IO.

Setting up

Let’s start with our setup. We will need Node.js and NPM (if you are not sure what I’m talking about, read this) and we will run some commands in terminal.app:

npm install express-generator -g

This will install express-generator, a command-line tool that will help us create our Express applications faster. You might have noticed the “-g” flag, right? It tells npm to install this package globally so it can be accessed from everywhere.

After running this command, we can use express-generator to create the whole skeleton of our app. To do so, go to the directory where you would like to create the app and type:

express lwt-class3

You can always change lwt-class3 to whatever name you want.

You might see something like this:

express-generator output

And your folder structure might look like this:

Folder Structure for a new project.

Notice that express-generator created the package.json file for us, and if you look inside it you will see that a lot of dependencies are already defined inside:

{
  "name": "lwt-class3",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "start": "node ./bin/www"
  },
  "dependencies": {
    "body-parser": "~1.13.2",
    "cookie-parser": "~1.3.5",
    "debug": "~2.2.0",
    "express": "~4.13.1",
    "jade": "~1.11.0",
    "morgan": "~1.6.1",
    "serve-favicon": "~2.3.0"
  }
}

Express-generator saved us a lot of time, this is basically everything that you need to start your app!

Well, now that we have our package.json ready what should we do? You got it, just go to its directory and type:

npm install

And once again NPM will take care of it for you ;).

Ok, great, now our app has all its dependencies installed. What about trying to run it?

Remember that last time we just asked node to execute the index.js file? Well, this time express-generator created a better structure for us, much more organized than just one file.

To run this app, we have a file inside “bin” directory called “www”. Inside the root app directory, you can type:

node bin/www

But wait, nothing happened, right? Well, actually your server is already running :). Open your browser (IE is not a browser) and type http://localhost:3000 in the address bar. If you see something like this, it means that it worked!

Yessssss it is runninggggg :D

And in your terminal.app you might see this:

terminal.app of a running express application

Understanding Middleware

Ok, now let’s get back to middleware. With your app up and running we can start talking about it. In the root folder of your application there is a file called app.js. This is the main file of your application and it’s where the middlewares are defined and attached to the application. Inside this file there are some lines of code like these:

var logger = require('morgan');
var cookieParser = require('cookie-parser');
var bodyParser = require('body-parser');
app.use(logger('dev'));
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: false }));
app.use(cookieParser());
app.use(express.static(path.join(__dirname, 'public')));

We’ve already talked about require: it’s the way that you request other files, modules or packages to be included and made available in your code.

The tricky part here is app.use: this is how we attach a middleware so that it gets executed.

Let’s take logger as an example. It is the middleware responsible for writing those beautiful things in your terminal.app every time someone requests something from your server. By requiring it, we are asking node to search for this module and make it available in the logger variable. Notice that we use require('morgan'). Morgan is the package’s name and you can find out more about it on its website. Some lines below we do app.use(logger('dev')). Here we are telling our app instance (check class 2 if you have no idea what I’m talking about) that it should use logger('dev').

So logger('dev') will then return a Morgan logger configured with the 'dev' format, which is meant for the development environment. But wait, where is the new keyword? How can you create an instance without new? Well, this is what we call a Factory. It is a software design pattern and one of the most common ways to create instances in JavaScript. For instance, the Express app is created the same way: express() is also a factory function! There are several reasons why people use factories instead of constructors in JavaScript. We will not talk about them today, but what you need to know is: a factory returns an instance of something, given some parameters. No big deal, right? Great, now let’s try to understand what happens next.

Attaching or use(ing) a middleware in your Express app means that every request your app gets will be “handled” by that middleware. It’s important to understand that middlewares execute in order. Express ensures that if logger is the first middleware attached, it will be the first to handle your request. Keep that in mind, because a lot of problems can come from using middlewares in the wrong order. Now, you might be thinking: if Node uses non-blocking IO, how does it control the execution order of middleware? The answer is next().
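To make the factory idea concrete, here is a tiny sketch. createTagger and the tag property are names I invented for this example; this is not how Morgan is implemented, it only has the same shape as logger('dev').

```javascript
// Hypothetical factory: given a label, it builds and returns a new
// middleware-shaped function configured with that label.
function createTagger(label) {
  return function tagger(req, res, next) {
    req.tag = label; // remember the configuration on the request
    next();
  };
}

// No `new` keyword needed: calling the factory gives us a ready-to-use function.
var devTagger = createTagger('dev');
var prodTagger = createTagger('prod');
```

Each call to the factory returns an independent function that remembers its own label.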

The next, req and res parameters

Every time your server receives a request, Express goes through the attached middlewares and delivers the request to the first one. How does it do that? Simple: with a callback. Your middleware is basically a function which expects 3 parameters: req, res and next. Does that sound familiar? The req parameter contains all the request data, the res parameter contains the response that is being built, and next is another callback! Wait, my middleware, which is basically a callback, will receive another callback function? Yes, welcome to JavaScript! At the end of your function you MUST call next() (unless your middleware sends the response itself), otherwise the request will be stuck in your middleware. Calling next() makes Express execute the next middleware in the queue.
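If you are curious how a chain like this can be driven, here is a minimal sketch. runMiddlewares and the three middleware functions are made up for illustration, and Express’s real implementation is more elaborate, but the role of next() is the same.

```javascript
// Hypothetical mini runner: calls each middleware in order, but only
// moves forward when the current middleware calls next().
function runMiddlewares(middlewares, req, res) {
  var index = 0;
  function next() {
    var middleware = middlewares[index];
    index += 1;
    if (middleware) {
      middleware(req, res, next);
    }
  }
  next();
}

var order = [];
function first(req, res, next) { order.push('first'); next(); }
function stuck(req, res, next) { order.push('stuck'); /* next() is never called */ }
function never(req, res, next) { order.push('never'); next(); }

runMiddlewares([first, stuck, never], {}, {});
console.log(order); // [ 'first', 'stuck' ]
```

The third function never runs because the second one did not call next(), which is exactly the “stuck request” situation described above.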

As defined in Express.js website:

Express is a routing and middleware web framework with minimal functionality of its own: An Express application is essentially a series of middleware calls.

Your controllers (more on that in the next classes) are essentially middleware! If you keep that in mind you will understand the next classes better.

Coding all that

Now we are going to code and test so you can understand the different behaviors. Remember, to run the application type this in your terminal.app:

node bin/www

First we will create our middleware function and attach it to the code, you should insert this code right after app.set('view engine', 'jade'); probably in line 17.

Here we are creating a variable requestKeeperMiddleware, which is a function that receives req, res and next. This function only logs to the console (with console.log()) the .rawHeaders property, which is part of the req parameter and contains the request’s HTTP headers.

And we attach it as the first middleware of our app with app.use(requestKeeperMiddleware);
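In case the listing does not show up for you, here is a reconstruction based only on the description above; treat it as a sketch rather than the exact original code. The app.use line is commented out so the snippet also runs on its own, in app.js you would uncomment it.

```javascript
// Reconstructed sketch: a middleware that only prints the raw HTTP headers.
var requestKeeperMiddleware = function (req, res, next) {
  // rawHeaders is an array with the header names and values as received.
  console.log(req.rawHeaders);
  // Note: next() is NOT called yet.
};

// In app.js, right after app.set('view engine', 'jade'):
// app.use(requestKeeperMiddleware);
```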

Run the application, access http://localhost:3000 and take a look at your console. You should see new information like this:

Middleware console output example 1.

But wait, the page doesn’t load, nothing happens after that? What is wrong? Well, we forgot to call next().

This brings us to the next iteration of our code:

Now we added a call to next(), since it is a function we use () after the name.
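Again as a reconstruction from the description (not necessarily the exact original listing), this version might look like the sketch below. The app.use line stays commented out so the snippet runs on its own.

```javascript
// Reconstructed sketch: same middleware, now handing control over with next().
var requestKeeperMiddleware = function (req, res, next) {
  console.log(req.rawHeaders);
  // Tell Express we are done so the next middleware can run.
  next();
};

// In app.js:
// app.use(requestKeeperMiddleware);
```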

If you run and load the page you will see the same weird data on your terminal.app but now the app loads :) great, we are making progress right?

The next step is to understand why the order matters, so let’s create another middleware:

Now, when you run and request the page you might see this in your console:
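Here is a sketch of the two middlewares, reconstructed from the console output shown in this section; the original listing may differ in details. The app.use lines are commented out so the snippet runs standalone, and they are in the “wrong” order on purpose.

```javascript
// Reads a property that another middleware is supposed to set on req.
var checkKeeperMiddleware = function (req, res, next) {
  console.log('Was request kept? ' + req.requestKept);
  next();
};

// Sets the property on req so that later middlewares can read it.
var requestKeeperMiddleware = function (req, res, next) {
  console.log('Keeping request');
  req.requestKept = true;
  next();
};

// In app.js, attached in this order:
// app.use(checkKeeperMiddleware);
// app.use(requestKeeperMiddleware);
```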

Was request kept? undefined
Keeping request
GET / 200 432.399 ms - 170
Was request kept? undefined
Keeping request
GET /stylesheets/style.css 304 3.538 ms - -

What happened? Notice that the first middleware attached is checkKeeperMiddleware. This means that its code runs first, when the property requestKept has not yet been created on req, so it is undefined. Now try changing the order:

app.use(requestKeeperMiddleware);
app.use(checkKeeperMiddleware);

Run again and check your terminal.app:

Keeping request
Was request kept? true
GET / 200 432.399 ms - 170
Keeping request
Was request kept? true
GET /stylesheets/style.css 304 3.987 ms - -

Great, now it’s working. Did you see how we added a new value to the req variable and were able to read this value in the next middleware? Lots of middlewares do exactly that: they process some information and store it on req or res to make it more easily available for you. body-parser is a good example: it reads the request and adds the parsed body to req (as req.body) to help you work with the values!

Also notice that both middlewares execute twice. This happens because your webpage has an instruction to request a .css file from the server: when your browser receives the HTML page it interprets the HTML, and one of the instructions is to download a .css file. Remember: every request that your server receives will go through all the middlewares!

Right now these middlewares are kind of useless right? So let’s make something good from it!

Let’s write a file for every request with some data from it!

Please, keep in mind that this is just an example! Don’t really do it in a real application!

So first we need to comment out the code for the second middleware, it was there just to explain the concept to you.

And I’ve made some changes in the code, let’s take a look.

Let’s break down the first line:

var log = req.method + ' to ' + req.url + require('os').EOL + req.rawHeaders;

We are creating a variable called log and assigning a value to it. First, req.method returns which HTTP method was used in the request, like GET or POST. The + is how we concatenate strings in JavaScript. Then we have ' to ', which is a string, and req.url, which returns the requested URL.

Next we call require('os') inline to use .EOL. Notice that we can use require inline without assigning it to a variable. The os module is the operating system module, and .EOL is the end-of-line marker for the current operating system; we use it to break the line. Finally we have req.rawHeaders, which we have already talked about.

The second line is easier:

var filename = 'logs/request-' + (new Date()).getTime();

We create a variable called filename with the string 'logs/request-' and concatenate it with (new Date()).getTime(). This last part creates a new instance of Date and gets its timestamp in milliseconds since the Unix epoch (1 January 1970 UTC).

But why are we generating a new file name each time? Remember that this is a non-blocking IO function, which means that we don’t control the order in which writes happen or when they will be executed. If we used the same file without any further control we could have two “people” writing to the file at the same time, and this can create problems. You need to keep in mind that functions must be as isolated as possible, which means that they need to execute correctly independently of external conditions and variables.

Next we require('fs'), a Node module which gives us access to the file system.

Then we call .writeFile(…) using filename and log as arguments. The first is where the file should be saved and the second is the content to save inside the file, but there is a third parameter, which is the callback. Remember that the convention for most callbacks in Node is to have err as the first argument? .writeFile(…) is one of these functions :). The callback will be executed after the file has been written, so we never know when this will happen. In general terms, Node will ask the operating system to create the file, open it, write the content, close it, release the pointer to the file and … in summary, there is a lot going on, a lot of things can go wrong, and we never know how much time this will take or even when it will start. That is why we can’t rely on execution order for non-blocking functions!

So, after the whole file write is completed the callback will be called, and the err variable will contain any error that occurred, or null if nothing went wrong. This is our first check: the body of if (err) is skipped when err is null, because null is a falsy value in JavaScript (it is treated as false in a condition). When err contains an error the condition is true, so we write the error to the console and return. Returning (stopping the function) is a good practice when errors occur in async functions, since it prevents the rest of the function’s code from executing.
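A small precision about that check, as a sketch you can paste into a file and run with node: strictly speaking, null is “falsy” rather than equal to false, which is all that if (err) needs.

```javascript
var err = null;

if (err) {
  console.log('This line is never reached when err is null');
}

console.log(Boolean(null)); // false: null counts as false inside a condition
console.log(null == false); // false: null is not loosely equal to false
```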

If nothing was wrong, we write into the console a message.

The last line is next(). It’s important to understand where each function body begins and ends here: next() sits outside the callback, so it is executed right after we “dispatch” .writeFile(…), without waiting for the file to be written. Spend some time following the brackets and parentheses. We cannot be certain when the callback will be called or when the file will be written.

So now, finally, if we run our program we might see this output on terminal.app:
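Putting the pieces described above together, the middleware might look like the sketch below. It is reconstructed from the explanation and the console output in this class, so the original listing may differ in details. The app.use line is commented out so the snippet runs on its own.

```javascript
var requestKeeperMiddleware = function (req, res, next) {
  // Build the text to store: method, URL, a line break and the raw headers.
  var log = req.method + ' to ' + req.url + require('os').EOL + req.rawHeaders;
  // A different file name per request, based on the current time in milliseconds.
  var filename = 'logs/request-' + (new Date()).getTime();

  var fs = require('fs');
  // Non-blocking write: the callback runs later, when the write finishes or fails.
  fs.writeFile(filename, log, function (err) {
    if (err) {
      console.log(err);
      return; // stop here, there is nothing else to do for this file
    }
    console.log('The file was saved!');
  });

  // Outside the callback: runs right after the write is dispatched.
  next();
};

// In app.js:
// app.use(requestKeeperMiddleware);
```

Notice that the response does not wait for the file: next() runs immediately, and the “saved” message shows up whenever the write completes.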

{ [Error: ENOENT: no such file or directory, open 'logs/request-1447381531031']
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'logs/request-1447381531031' }
GET / 304 5562.566 ms - -
{ [Error: ENOENT: no such file or directory, open 'logs/request-1447381536650']
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'logs/request-1447381536650' }
GET /stylesheets/style.css 200 5.733 ms - 111

Oops, an error? That’s not good: something went wrong while writing the file and the callback was called with an error. Luckily we printed the error to the console, so we can analyse it. It looks like the logs directory was never created. Yeah, Node is right :( so let’s create the directory inside the root directory of your app and try again.

Here is the expected output:

GET /stylesheets/style.css 200 5.733 ms - 111
GET / 200 32.404 ms - 170
The file was saved!
GET /stylesheets/style.css 304 1.204 ms - -
The file was saved!

Better, huh? Now look at the files created inside the logs directory.

Now take some time to explore the req object and write some useful information to this file. Maybe you can try to create another middleware.

Keep in mind that a middleware that writes a file for each request is not a good idea. It was just a useful way to explain non-blocking functions, callbacks and middleware. YOU SHOULD NOT DO SOMETHING LIKE THIS IN REAL LIFE!!!

The End

Now I hope that you have a better understanding of all the concepts that we talked about today: non-blocking IO, callbacks and middleware. Again, if you have any questions or just want to give me some feedback: me@lucavgobbi.com or @lucavgobbi on Twitter. You can also join our Slack group, just email me asking for an invite!

If you liked this class, please share :) I would love to see you sharing it around Twitter.

See you next class! Bye ;)