# CodeFights Solves It: hostnamesOrdering

The Challenge of the Week that ended on March 21 was hostnamesOrdering. Only 72 CodeFighters submitted solutions — this was a tricky one! In this breakdown, I’m going to walk you through my thought process in writing a solution.

In this database challenge, the task is to take a table of numeric `id`s and `hostname`s, and return a sorted version of the table. The challenge is that we have to order the hostnames by the reverse hostname. The hostname is a string of domains separated by periods, such as `codefights.com` or `www.github.com`. The domains don’t include any protocols (such as `http://`), nor do they include any paths (such as `codefights.com/forums`). The general form of a hostname is `domain1.domain2. … .domainN`. The reversed hostname is `domainN. … .domain2.domain1`. For example:

[table id=6 /]

Note that the list above is ordered by the reverse hostname. We are told that there are at most `3` domains in any given hostname that appears in this challenge.

#### Getting started

Let’s start with an example of the table hostnames:

[table id=7 /]

Let’s suppose for a moment that we have a way of generating the reversed hostnames, which will actually be the hardest part of the challenge.

[table id=8 /]

The query we want to return is:

`# Assuming we can make a ‘reversed’ columnSELECT id, hostname FROM hostnames ORDER BY reversed;`

which would return:

[table id=9 /]

#### Reversing the domain name

We won’t actually construct an explicit reversed column. Instead, we will generate the pieces of the domain we want on the fly. To do this, we will be using the SUBSTRING_INDEX function. From the documentation, we see that `SUBSTRING_INDEX( string, delim, count)` returns a substring.

If `count` is positive, `SUBSTRING_INDEX( string, delim, count)` returns the substring from the beginning to the `count`th occurrence of `delim`. For example:

`mysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.',1);          -> wwwmysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.', 2);          -> www.codefightsmysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.', 45);          -> www.codefights.com`

If `count` is negative, `SUBSTRING_INDEX` finds the `|count|`th occurrence of `delim` from the right, and returns the substring from there to the end of the string. For example:

`mysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.', -1);       -> commysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.', -2);       -> codefights.commysql> SELECT SUBSTRING_INDEX('www.codefights.com', '.', -100);       -> www.codefights.com`

Using the same example we’ve been working with, here is a query that splits the hostname into three pieces. Using `SUBSTRING_INDEX`, we can extract the top level domain, the midlevel domain, and the lowest level of domain just by changing the index value.

`SELECT id, hostname, SUBSTRING_INDEX(hostname,'.',-1) as top,                                      SUBSTRING_INDEX(hostname,'.',-2) as mid               FROM hostnames ORDER BY top, mid, hostname;`

This would work if every domain has exactly three pieces. Unfortunately, some domains only have one or two pieces. What is actually returned is:

[table id=10 /]

The first two rows are in the wrong order. They match at the top level (i.e., they are both `.com` domains). The `com` domain has no lower level, but the `SUBSTRING_INDEX` just returns the entire string `com` again. Since `codefights.com` is earlier in the alphabet than `com`, the table puts `codefights.com` first. What should happen is that `codefights` should be compared to the empty string, and `com` should appear first.

One trick to fix this problem is to prepend the hostnames with ‘…’, guaranteeing that there are three periods in each domain. So our query is now:

`SELECT id, hostname, SUBSTRING_INDEX(CONCAT('...', hostname),'.',-1) as top,                                      SUBSTRING_INDEX(CONCAT('...',hostname),'.',-2) as mid               FROM hostnames ORDER BY top, mid, hostname;`

Writing down just the first four rows, this query returns:

[table id=11 /]

This has everything in the right order!

#### A problem and a workaround

In principle, we just have to eliminate the top and middle column, so we are only reporting the `id` and `hostname`. Before making this change, we can try running this in CodeFights. When we do it, we run into a problem: the CodeFights MySQL engine returns `workbench.mysql.com` before `dev.mysql.com`!

This seems to be a bug; running MySQL on a local machine returns `dev.mysql.com` before `workbench.mysql.com`. However, ordering by the hostname on CodeFights returns the opposite. When using `ORDER BY top, mid, hostname`, `workbench.mysql.com` appears first. This seems wrong, since both these hostnames have identical top and mids, so the ordering should be determined by hostname.

Part of being a programmer is working around issues in any software! The problem we are encountering comes from doing multiple orderings. Our workaround will be to concatenate the top, middle, and hostname into one string. We will separate the top, middle, and hostname by a space, because spaces cannot occur in a domain and spaces appear earlier in ASCII than any letter.

So our workaround is:

`SELECT id, hostname, SUBSTRING_INDEX(CONCAT('...', hostname),'.',-1) as top,                                      SUBSTRING_INDEX(CONCAT('...',hostname),'.',-2) as mid                FROM hostnames ORDER BY CONCAT(top, ' ', mid, ' ', hostname);`

This orders `dev.mysql.com` and `workbench.mysql.com` correctly.

#### Eliminating columns

Now we need to eliminate the top and mid columns. We could do a subquery, but instead we will actually construct the top and the middle strings as part of the `ORDER BY` clause.

Our solution ends up looking like this:

`CREATE PROCEDURE hostnamesOrdering()    SELECT id, hostname FROM hostnames          ORDER BY CONCAT(SUBSTRING_INDEX(CONCAT('...',hostname),'.',-1),                                               ' ',                                               SUBSTRING_INDEX(CONCAT('...',hostname),'.',-2),                                               ' ',                                               hostname);`

This solution only selects the desired columns. Notice that in the `ORDER BY` we have a single string, made from the “top” level domain, a space, the “middle level” domain, a space, and then the hostname. This gets around the implementation bug, and gets us a solution with 188 characters.

On a local MySQL database, we don’t need the final concatenation, and we could use multiple ordering to get:

`CREATE PROCEDURE hostnamesOrdering()    SELECT id, hostname FROM hostnames          ORDER BY SUBSTRING_INDEX(CONCAT('...',hostname),'.',-1),                              SUBSTRING_INDEX(CONCAT('...',hostname),'.',-2),                              hostname;`

#### Going further

As a code golfing exercise, we have some compact solutions. As a coding exercise, you might be interested in trying to solve the general case where we don’t have a limit on the maximum number of domains in a hostname!

This takes us back to our original solution:

`SELECT id, hostname FROM hostnames ORDER BY reverse_hostname(hostnames);`

where the interesting exercise is to create a custom function reverse_hostname. If you get stuck, this StackOverflow post might be useful.

#### Tell us…

How did you tackle this challenge? What do you think about the approach I took in solving it? Let us hear from you in the CodeFights forum!

Originally published at CodeFight On!.