Building a basic router

There is always value in learning about the internals of the frameworks and libraries we use. It allows for a deeper understanding of the problem being solved and appreciation of the work that has gone into these projects.

So today I will be building a basic router to explore this fundamental part of even the smallest framework. The idea is not to create something complete or production-ready but rather the minimum set of features needed to be considered a router.

To be called a router it must be able to handle static routes, deal with variables within a route, and output an error when no matching route is found.

Even with such a tiny set of functionality there are decisions to be made regarding the design and implementation.

URLs

The first decision to be made is the URL format. The web server's rewrite functionality will be used to allow for semantic URLs (e.g. http://example.com/this/is/the/route).

Query-string-based routing (e.g. http://example.com/?this+is+the+route) will not be implemented and therefore will not be available as a fallback.

This means that before the router can have even the most basic features the web server must be correctly configured.

The same entry point, index.php, will always be used. In addition the name of this file will not appear in the URL.

Both of these requirements can be handled with two simple lines in an Apache .htaccess file.
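A minimal sketch of such a file might look like the following. It assumes mod_rewrite is enabled and that the server's AllowOverride setting permits rewrite directives in .htaccess; the exact rule is an illustration rather than the only way to write it.

```apache
# .htaccess in the document root (sketch)
# Turn on the rewrite engine for this directory.
RewriteEngine On
# Send every request to index.php; the visible URL is left unchanged.
RewriteRule ^ index.php [L]
```

Because every request matches, this also captures requests for files that really exist, which is why extra rules are normally needed for static resources.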

For Nginx a similar rewrite may be used.
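As a rough equivalent, the fragment below could sit inside the server block. It is a sketch that assumes PHP requests are already handed to PHP-FPM by a separate "location ~ \.php$" block which includes the stock fastcgi_params file; that file sets REQUEST_URI from $request_uri, which holds the original, un-rewritten URI.

```nginx
# Inside the server { ... } block (sketch)
location / {
    # Internally rewrite every request to the single entry point.
    rewrite ^ /index.php last;
}
```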

Normally there would be more rules involved to allow other resources such as CSS, images and JavaScript to be served correctly.

In both cases one should ensure that the REQUEST_URI parameter is being set as this will be used to determine the requested route.
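One way to read the requested route is sketched below. The helper name requested_route() is hypothetical, and stripping the query string is an assumption added here so that "/hello?x=1" is treated the same as "/hello".

```php
<?php
// Hypothetical helper: returns the path part of the requested URI.
function requested_route(array $server): string
{
    // REQUEST_URI holds the URI as the client sent it, e.g. "/hello?x=1".
    $uri = $server['REQUEST_URI'] ?? '/';

    // Keep only the path so the query string does not affect matching.
    $path = parse_url($uri, PHP_URL_PATH);

    // parse_url() returns false or null when no path can be extracted.
    return is_string($path) ? $path : '/';
}
```

In index.php this would be called as requested_route($_SERVER).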

Routing table

The next decision to be made is how to structure and where to store the routing information. This entails making a list of all possible routes and, for each route, specifying the action to be taken.

Often this information would be stored in a separate file, in the database, or created by different modules as they are loaded.

To keep this design as simple as possible an array will be used to store the list of routes. Each entry in the array will have a key to specify the route and the value will contain the code to run for that route.

A route of “404” will be used when no other matching route can be found. Other routes will be specified by name, including the leading slash. This means that the default route will be “/”.

The code to be run for each route will be written as an anonymous function. These functions will return the data to be output for that route.

To start, let's add the default route, the “404” route, and a “/hello” route which will return a “Hello world” greeting.
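A routing table along these lines could look like the sketch below. Only the “Hello world” greeting comes from the description above; the text returned by the default and “404” routes is placeholder content chosen for illustration.

```php
<?php
// Routing table: the key is the route, the value is an anonymous
// function that returns the content for that route.
$routes = [
    // Default route (placeholder text).
    '/' => function () {
        return 'Home page';
    },
    '/hello' => function () {
        return 'Hello world';
    },
    // Used when no other route matches (placeholder text).
    // PHP stores the numeric-looking key '404' as the integer 404;
    // looking it up with $routes['404'] still works.
    '404' => function () {
        return 'Page not found';
    },
];
```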

Static routes

Keeping with the theme of simplicity the page header and footer will always be the same. The router will then be used to decide the content of the HTML body.

With a page structure in place, the next step is to implement the core of the router.

Each route is checked in turn against the REQUEST_URI. If a match is found then the associated anonymous function is stored in the $response variable.

If all routes have been exhausted, and the $response variable has not been set, we use the “404” route function.

Finally the chosen function is executed and the returned data is output to the body of the HTML.
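The steps above can be sketched as a single function. The name render_page(), the page markup, and the placeholder route content are all assumptions made for this example; the matching logic follows the description, storing the chosen function in $response and falling back to the “404” route.

```php
<?php
$routes = [
    '/'      => function () { return 'Home page'; },
    '/hello' => function () { return 'Hello world'; },
    '404'    => function () { return 'Page not found'; },
];

// Hypothetical function: builds the full page for the requested URI.
function render_page(array $routes, string $requestUri): string
{
    $response = null;

    // Check each route in turn against the requested URI.
    foreach ($routes as $route => $action) {
        // Cast the key: PHP turns the '404' key into an integer.
        if ((string) $route === $requestUri) {
            $response = $action;
            break;
        }
    }

    // No route matched, so fall back to the "404" route.
    if ($response === null) {
        $response = $routes['404'];
    }

    // Run the chosen function and place its output in the body.
    // The header and footer are the same for every page.
    return '<html><head><title>Basic router</title></head><body>'
        . $response()
        . '</body></html>';
}
```

In index.php the result would be sent to the browser with echo render_page($routes, $_SERVER['REQUEST_URI']).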

At this point one can define any number of static routes and write code to have them give us the results we want.

Parameters

Static routing is not very exciting. Other than cleaner-looking URLs it doesn’t give much functionality over using separate files for each page.

Adding parameters allows one to do dynamic routing. Regular expressions will be used as a quick way to do this. This implementation will not allow for named parameters and will require that any special characters in a route are escaped.

Subpatterns, defined with parentheses, will be used to specify where parameters are within the route.

The existing routes must be updated to add the required escaping. Then a new “/count” route containing a parameter can be added. This new route will use the range() function to return the whole numbers from 1 up to the number held by the parameter.
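The updated table might look like the sketch below. It assumes “/” is used as the regular expression delimiter, which is why every slash in a route has to be escaped; the (\d+) subpattern and the comma-separated output format are illustrative choices.

```php
<?php
// Each key is now a regular expression fragment. Slashes are escaped
// because "/" is assumed to be the pattern delimiter.
$routes = [
    '\/' => function () {
        return 'Home page';
    },
    '\/hello' => function () {
        return 'Hello world';
    },
    // The subpattern (\d+) captures the parameter as a string of digits.
    '\/count\/(\d+)' => function ($max) {
        return implode(', ', range(1, (int) $max));
    },
    '404' => function () {
        return 'Page not found';
    },
];
```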

The code is then changed to use preg_match() for route matching and call_user_func_array() to process the chosen $response function.
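A sketch of that change follows. The function name dispatch() and the placeholder route content are assumptions; anchoring each pattern with ^ and $ is also a choice made here so that “/hello” does not match “/hello/extra” as well.

```php
<?php
$routes = [
    '\/'      => function () { return 'Home page'; },
    '\/hello' => function () { return 'Hello world'; },
    '\/count\/(\d+)' => function ($max) {
        return implode(', ', range(1, (int) $max));
    },
    '404'     => function () { return 'Page not found'; },
];

// Hypothetical function: returns the content for the requested URI.
function dispatch(array $routes, string $requestUri): string
{
    $response = null;
    $params = [];

    foreach ($routes as $route => $action) {
        // Wrap the route in delimiters and anchors before matching.
        if (preg_match('/^' . $route . '$/', $requestUri, $matches) === 1) {
            $response = $action;
            // $matches[0] is the whole match; the remaining entries
            // are the captured parameters, in order.
            $params = array_slice($matches, 1);
            break;
        }
    }

    // No route matched, so fall back to the "404" route.
    if ($response === null) {
        $response = $routes['404'];
    }

    // Call the chosen function, passing captured values as arguments.
    return call_user_func_array($response, $params);
}
```

With this table, dispatch($routes, '/count/3') returns "1, 2, 3", and an unknown path such as '/missing' returns the “404” content.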

There is plenty more that could be done, and many more features that could be added, but I think this gives a good overview of what is needed to implement a router.

While writing the code above I found myself considering how it could be changed to be a full RESTful router, how optional or named parameters could be introduced, and how the router could be refactored into a class.

Going through this exercise was informative. It got me thinking about the design choices made and how these choices affect the functionality. For example, choosing to use regular expressions to add dynamic routing was easy to do, but introduced the requirement of escaping special characters.

I hope it gets you thinking too.