Leveraging the Power of Elasticsearch: Autocomplete and Fuzzy Search

Published in manifoldco. 10 min read. Sep 20, 2017

I’m going to show you how to build a simple search with autocomplete and fuzziness using Manifold and Express, and how to deploy it to Zeit. We will build a simple web page that lets users search for football players and displays their details.

This tutorial will showcase some of the awesome features of Elasticsearch, but before we continue, we need to understand what autocomplete, fuzziness and Elasticsearch are.

Introduction

Autocomplete means giving users the option of completing words or forms by a shorthand method on the basis of what has been typed before. Pretty straightforward!

Fuzziness, in the context of this article, means finding words that are similar to a given word, that is, words that can be reached by editing one or more characters. There are four kinds of edit: substitution, insertion, transposition and deletion. For example:

Substitute r for n: clea_r_ → clea_n_
Insert k after c: tic → tic_k_
Transpose a and e: b_ae_ver → b_ea_ver
Delete r: po_r_t → pot
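
To make the four edits concrete, here is a minimal sketch of an edit-distance function that counts them (the "optimal string alignment" variant of Damerau-Levenshtein distance). It only illustrates the idea; it is not how Elasticsearch implements fuzzy matching internally, and the function name editDistance is made up for this example.

```javascript
// Counts the minimum number of substitutions, insertions, deletions and
// adjacent transpositions needed to turn string `a` into string `b`.
// Illustrative sketch only; `editDistance` is a hypothetical helper name.
function editDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));

  for (let i = 0; i < rows; i++) d[i][0] = i; // delete every character of a
  for (let j = 0; j < cols; j++) d[0][j] = j; // insert every character of b

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + cost  // substitution (or a match when cost is 0)
      );
      // Transposition of two adjacent characters counts as a single edit.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

console.log(editDistance('clear', 'clean'));   // 1 (substitution)
console.log(editDistance('tic', 'tick'));      // 1 (insertion)
console.log(editDistance('baever', 'beaver')); // 1 (transposition)
console.log(editDistance('port', 'pot'));      // 1 (deletion)
```

Each of the four example pairs is exactly one edit apart, which is why a fuzzy search that tolerates a single edit can still match them.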

Elasticsearch is an open-source, broadly-distributable, readily-scalable, really fast RESTful search and analytics engine. Accessible through an extensive and elaborate API, Elasticsearch can power extremely fast searches that support your data discovery applications.

If you are new to Elasticsearch or Express, you should still be able to follow the tutorial; however, I recommend catching up on both.

We are good to go!

Set up

We will use Manifold to create our Elasticsearch instance and logging. If you don’t have a Manifold account, head over and create one; it is very easy to set up. Once you’ve logged in, the first thing we are going to do is create a Project for our app. To do that, click on the big grey block that has the plus sign, then name the project and give it a description (the description is optional).

You will then be able to create resources by clicking the Add a Resource button.

For this tutorial, we will be using Bonsai Elasticsearch and LogDNA, so go ahead and provision both services. This will create new Bonsai Elasticsearch and LogDNA accounts straight from your Manifold account. Awesome! To get your credentials, copy them by clicking on Copy to clipboard. You can also show your credentials by clicking Show credentials.

You can get the Bonsai Elasticsearch URL from your Bonsai dashboard. To get to the dashboard, click on the name of the Bonsai Elasticsearch resource you created.

Create a .env file in your project root folder and paste in your credentials for both Elasticsearch and LogDNA.

BONSAI_URL=URL FROM BONSAI DASHBOARD 
LOGDNA_KEY=LOGDNA KEY GENERATED

Installing Node modules

Before we proceed, ensure you have node and npm installed. Open your terminal and run npm init; this will create a package.json file in your project root folder. Next, run this:

npm install --save elasticsearch express logdna node-env-file pug nodemon

We are installing the packages for the Express framework, Elasticsearch and LogDNA. Pug will be used as our view template engine, node-env-file to load our .env file, and nodemon to restart our server automatically after we make changes.

Setting up LogDNA

Logging makes it easy to find where and when things went wrong in our application. This is where LogDNA comes in. Create a file named logdna.js in the project root folder and insert the following code:

In the code snippet above, we required logdna and node-env-file modules installed earlier. Next, we set our environment file, with the help of node-env-file and created our instance of LogDNA with our arguments in logArgs. Finally, we exported our LogDNA instance.

Basic concepts in Elasticsearch

Before we continue, let’s quickly go through some basic concepts in Elasticsearch.

A cluster is a collection of one or more nodes (servers) that together hold your entire dataset and provides federated indexing and search capabilities across all nodes.

A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.

An index is a collection of documents that have somewhat similar characteristics. For example, you can have an index for customer data, another index for a product catalog, and yet another index for order data.

A type is a logical category/partition of your index whose semantics is completely up to you. In general, a type is defined for documents that have a set of common fields.

A document is a basic unit of information that can be indexed. For example, you can have a document for a single customer, another document for a single product, and yet another for a single order.

Query DSL is a JSON-style domain-specific language that you can use to execute queries, provided by Elasticsearch. This is an example:

GET /bank/_search
{
  "query": { "match_all": {} }
}
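
In a Node.js application, a Query DSL body like the one above is just a plain JavaScript object. The sketch below builds a match query; the helper name buildMatchQuery and the address field are hypothetical, chosen only for illustration.

```javascript
// Build the body of a `match` query as a plain object.
// `buildMatchQuery` is a hypothetical helper; `field` and `text` come from
// the caller. With the legacy `elasticsearch` npm client, an object like
// this would be passed as `body` to `client.search({ index, body })`.
function buildMatchQuery(field, text) {
  return { query: { match: { [field]: text } } };
}

// The `address` field is an assumption made for this example.
console.log(JSON.stringify(buildMatchQuery('address', 'mill lane'), null, 2));
```

Keeping query construction in small functions like this makes the bodies easy to inspect and test without a running cluster.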

Setting up Elasticsearch

For this tutorial, I created a JSON file with details of 2500 dummy players. Download it here, create a folder named data in the project root folder and paste the file in there, i.e. data/player.json.

Now, create elasticsearch.js in the project root folder, then copy the code below:

In the code snippet above, we required node-env-file and added our .env file, fs to access the file system, and elasticsearch. Next, we created an instance of Elasticsearch using the Bonsai Elasticsearch URL from Manifold as the host. Then we defined INDEX_NAME and INDEX_TYPE, which will be used later.

Next, add the code snippet below:

This function allows us to modify how we require the dummy data. This is necessary because the JSON file containing the dummy data is not a valid JSON file and will fail when it is parsed.
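
The article does not say exactly why the file fails to parse. A minimal sketch, assuming the file is newline-delimited JSON (one JSON object per line, the shape the bulk API expects), shows why parsing it as a single document fails and how it could be read as text instead. The helper names parseNdjson and readRawFile, and the sample records, are hypothetical.

```javascript
const fs = require('fs');

// Parse newline-delimited JSON: one object per line, blank lines skipped.
// ASSUMPTION: the data file has this shape; the article does not confirm it.
function parseNdjson(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Read a file as raw text rather than require()-ing it, because require()
// on a .json file parses the whole file as one JSON document.
function readRawFile(filePath) {
  return fs.readFileSync(filePath, 'utf8');
}

// Hypothetical sample: an action line followed by a document line.
const sample = '{"index":{"_id":"1"}}\n{"firstName":"Ada","lastName":"Obi"}\n';

try {
  JSON.parse(sample); // throws: two top-level values in one string
} catch (err) {
  console.log('Parsing the whole file fails:', err.name);
}
console.log(parseNdjson(sample).length); // 2
```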

Next, let’s add the functions that check whether the index exists, create the index and delete the index. Copy the code snippet below:

The next step is to create the index mapping. Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. Mapping is used to define if a field can be used for suggestions or is a date, string, number etc. Add the code snippet below to the existing elasticsearch.js:

In our mapping function above, we defined both firstName and lastName as type completion. This type allows completion suggestions to be made using those two fields.
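
As a rough sketch of what such a mapping body can look like, the function below returns a plain object that declares both fields as completion fields. The function name buildPlayerMapping is hypothetical, and any other fields the real mapping may declare are left out.

```javascript
// Mapping body that marks firstName and lastName as completion fields.
// `buildPlayerMapping` is a hypothetical name used for this sketch. With the
// legacy `elasticsearch` npm client, the object would be passed as `body` to
// `client.indices.putMapping({ index, type, body })`.
function buildPlayerMapping() {
  return {
    properties: {
      firstName: { type: 'completion' },
      lastName: { type: 'completion' },
    },
  };
}

console.log(JSON.stringify(buildPlayerMapping(), null, 2));
```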

For the search to work, we need to add documents. Therefore, we need a function to bulk add our dummy data. Add the code snippet below to elasticsearch.js:

From the code above, readDataFile() fetches our dummy data from the JSON file, which becomes the body of the request.

Next, we want Elasticsearch to be able to suggest players based on their first name and last name. However, players’ names are sometimes difficult to spell, so we also want to suggest players whose first name or last name is fuzzily similar to the user’s search entry. Add the code below to the file:

From the code above, we have two suggesters: firstNameSuggester and lastNameSuggester. The first returns first names matching the search word, while the second returns matching last names.
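
A request body with two fuzzy completion suggesters might look like the sketch below. It assumes the Elasticsearch 5.x or later request format, where suggesters are sent in the suggest section of a search body. The function name buildSuggestBody and the default of 5 suggestions are assumptions made for this example.

```javascript
// Build a search body with one fuzzy completion suggester per name field.
// ASSUMPTIONS: Elasticsearch 5.x+ request format; `buildSuggestBody` and the
// fallback size of 5 are hypothetical choices for this sketch.
function buildSuggestBody(text, size) {
  // Route parameters arrive as strings, so convert the size to a number.
  const limit = Number.parseInt(size, 10) || 5;

  const suggester = (field) => ({
    prefix: text,
    completion: {
      field,
      size: limit,
      fuzzy: { fuzziness: 'AUTO' }, // tolerate small spelling mistakes
    },
  });

  return {
    suggest: {
      firstNameSuggester: suggester('firstName'),
      lastNameSuggester: suggester('lastName'),
    },
  };
}

console.log(JSON.stringify(buildSuggestBody('mesi', '3'), null, 2));
```

Because both suggesters share the same prefix, a single request returns candidates from first names and last names at once.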

We want to be able to fetch the details of a particular player when we search, so we need a search function. Place the code snippet below in your code:

The function above searches for and returns the details of a player, using the id passed.
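
One way to look a document up by its id is the ids query. The sketch below only builds the query body; whether the original function uses this query or another one is not stated, and the name buildStatQuery is hypothetical.

```javascript
// Build a search body that matches a single document by its id.
// ASSUMPTION: the `ids` query is one possible approach, not necessarily the
// one used in this tutorial. `buildStatQuery` is a hypothetical name.
function buildStatQuery(id) {
  // Document ids are strings in Elasticsearch, so normalise the input.
  return { query: { ids: { values: [String(id)] } } };
}

console.log(JSON.stringify(buildStatQuery(42)));
// {"query":{"ids":{"values":["42"]}}}
```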

Finally, we need to export the functions, so that they can be used in other files. Paste the code below at the end of the file:

Defining app routes

Routes are application endpoints that determine how responses are sent to a client’s request. In this application, we will need three routes: / to return the homepage, /suggest/:text/:size to return suggestions and /stat/:id to return the details of a player.
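
To illustrate how a path such as /suggest/:text/:size yields named parameters, here is a deliberately simplified matcher. It is not how Express implements routing; matchRoute is a hypothetical helper that only shows the idea behind the :name placeholders.

```javascript
// Match a URL path against a pattern with `:name` placeholders and return
// the captured parameters, or null when the path does not match.
// Simplified illustration only; not Express's real route matcher.
function matchRoute(pattern, path) {
  const patternParts = pattern.split('/').filter(Boolean);
  const pathParts = path.split('/').filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const params = {};
  for (let i = 0; i < patternParts.length; i++) {
    const part = patternParts[i];
    if (part.startsWith(':')) {
      // Placeholder segment: capture the value under the placeholder's name.
      params[part.slice(1)] = decodeURIComponent(pathParts[i]);
    } else if (part !== pathParts[i]) {
      return null; // literal segment differs
    }
  }
  return params;
}

console.log(matchRoute('/suggest/:text/:size', '/suggest/mesi/5'));
// { text: 'mesi', size: '5' }
console.log(matchRoute('/stat/:id', '/stat/42'));
// { id: '42' }
console.log(matchRoute('/stat/:id', '/suggest/mesi/5'));
// null
```

Note that captured values are always strings, so a numeric parameter such as size has to be converted before use.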

Create a folder named routes, create a file index.js in it and paste the code snippet below:

In the code above, we created a router as a module. This router renders the homepage and sends the title of the page to the page template named master. The page template will be discussed later.

Next, create a new file suggest.js in the routes folder and add the code snippet below:

The router module above requires elasticsearch.js and calls the function getSuggestions, which returns a client-specified number of suggestions for the client’s search word. We also log the action using LogDNA.

Finally for routes, create a file stats.js in the routes folder and copy the code snippet below:

This route returns the details of a player based on the id passed into the getStat function. The action is also logged using the instance of LogDNA exported as a module in logdna.js.

Setting up the server with Express

For the application to work, we need to run it on a server; this is where Express comes in. Express is a minimal and flexible Node.js web application framework that provides a robust set of features for web and mobile applications. In the project root folder, create an app.js file and copy the code below into it:

In the code snippet above, we required the express and path modules, as well as the LogDNA and Elasticsearch modules and the routes we created earlier. Then, we created an instance of Express named app.

Next, add the code snippet below to set the view engine, views folder and static files directory.

For this tutorial, we are using Pug (formerly known as Jade), a high-performance template engine.

Next, we need to check the setup of our Elasticsearch index and make sure that our data is stored correctly. Copy the code snippet below:

In the code snippet above, we did the following:

Check if the index exists
Delete the index if it exists and log the outcome
Create the index and log the outcome
Update the index with mapping and log the outcome
Bulk insert the dummy documents and log the outcome
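
The steps above can be sketched as one asynchronous function. The method names on the es object (indexExists, deleteIndex, createIndex, indexMapping, bulkAddDocument) are hypothetical stand-ins for the functions exported from elasticsearch.js, and the stub below replaces a real cluster so the sketch can run on its own.

```javascript
// Run the index setup steps in order, logging each outcome.
// ASSUMPTION: `es` exposes promise-returning functions with these
// hypothetical names; `log` is any function that records a message.
async function resetIndex(es, log) {
  if (await es.indexExists()) {
    await es.deleteIndex();
    log('index deleted');
  }
  await es.createIndex();
  log('index created');
  await es.indexMapping();
  log('mapping updated');
  await es.bulkAddDocument();
  log('documents inserted');
}

// In-memory stub standing in for the real Elasticsearch module.
const demoEs = {
  indexExists: async () => true,
  deleteIndex: async () => {},
  createIndex: async () => {},
  indexMapping: async () => {},
  bulkAddDocument: async () => {},
};

resetIndex(demoEs, (message) => console.log(message))
  .catch((err) => console.error('setup failed:', err.message));
```

Awaiting each step before starting the next matters here: the mapping cannot be applied before the index exists, and the documents should only be inserted once the mapping is in place.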

Finally, we created our app endpoints and assigned the router modules accordingly. After that, we started the server on the port specified and logged it.

Building the Frontend

Now, we will build a simple frontend web page to consume the backend API, using the Pug template engine to generate the HTML and jQuery to call the API.

Create a folder named views and create a file index.pug in it (the path should be views/index.pug), then copy the code snippet below:

In the above code snippet, we created the HTML structure. Pug relies on indentation to create parent and child elements, so we have to pay attention to it.

Next, we need to style the web page using CSS. Create a stylesheet file public/css/style.css and paste the code below:

The above style declarations will make the page look clean and simple.

Finally, we need to be able to show suggestions as users search, and let them click on a result and get the player’s details. In this tutorial, we are going to make that possible by using jQuery, a JavaScript library; however, if you want to use another framework or library, feel free.

Now, let’s create a new JavaScript file at the path public/js/script.js, then copy the code snippet below:

In the script above, we have two functions, suggestion and stat, as well as five event listeners: change, paste, keyup and two click listeners.

The event listener $('.searchbar').on("change paste keyup", function() {...}); checks for when a user changes the value of the search box and sends a request using the suggestion function, which returns suggestions.

The event listener $(document).on('click', '.option', function(el){...}); is triggered when a user clicks on any of the suggested results. It sets the value of a hidden field to the ID of the player and also sets the value in the search box to the player’s name.

Finally, the event listener $(document).on('click', '.submit', function(el){...}); is triggered when the user clicks on the Get Details button. This fetches the ID stored in the hidden input field, sends a request to the /stat/:id route using the stat function and displays the player’s details below the search box.

The app is ready!

Deployment

Before we deploy, let’s quickly edit package.json in the root folder so that the app starts automatically when we deploy. Add this to the file:

"scripts" : {
 "start": "nodemon ./app.js"
}

Finally, time to deploy! We will use a simple tool called now by Zeit. Let’s quickly install it by running this in your terminal:

npm install -g now

Once the installation is done, navigate to the project root folder in your terminal and run the now command. If it is your first time, you will be prompted to create an account. Once you are done, run the now command again; a URL will be generated and your project files uploaded.

You can now access the simple application via the URL generated. Pretty simple!

Also, we can view logs on the LogDNA dashboard to make sure everything is working as it should be. The dashboard should show the logs from our application.

Conclusion

We have been able to focus more on building our simple app, without worrying about infrastructure. The ability to provision resources very fast, all from a single dashboard, means we get to focus more on writing code. Don’t stop here: get creative and improve on the code in this tutorial. You can find the source code for this tutorial on GitHub.