Building a real-time elastic search engine using Python
Build and Deploy a Website using Flask and DigitalOcean
In this article, we are going to build a course finder search engine using Python, Elasticsearch, and Flask. We will then containerize our application and push it to Docker Hub using Travis CI, so that every time you push changes to GitHub, Travis CI builds a new container image for your application and pushes it to Docker Hub. We will use Nginx as a reverse proxy for our application. Once the website is ready, we will deploy it on a DigitalOcean web server so that everyone on the internet can see what you’ve created, and we will create a free SSL certificate for the web application using Let’s Encrypt.
A simple tutorial on how to build a course finder search engine using Elasticsearch …
Step 1 — Create Docker Compose File
Docker Compose is a tool for defining and running multi-container Docker applications.
We will use a docker-compose.yml configuration file to define our application’s services. It’s a simple YAML file. After creating the Compose configuration file, we can start all the services it defines with a single command.
Now let’s write a docker-compose file to run an Elasticsearch container.
```yaml
    environment:
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
```
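Beyond the environment entries above, a complete docker-compose.yml might look like the following sketch (the image tag and service name are assumptions; the port, network, and volume match the description below):

```yaml
version: "3"

services:
  elasticsearch:
    # Image tag is an assumption -- pin whichever Elasticsearch version you use.
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.0
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      # Recommended alongside bootstrap.memory_lock=true
      memlock:
        soft: -1
        hard: -1
    ports:
      - "9200:9200"                              # host:container
    volumes:
      - ./ES_DATA:/usr/share/elasticsearch/data  # persist indexed data
    networks:
      - frontend

networks:
  frontend:
    driver: bridge
```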
This Compose file defines a single Elasticsearch service, which:
- Uses an official Elasticsearch image from Docker Hub.
- Forwards the exposed port 9200 on the container to port 9200 on the host machine.
- Uses the networks key to set up a network that we can share across different containers and projects.
- Attaches the container to the frontend network, so external containers can also join frontend to reach the services on it.
- Stores the data indexed in the Elasticsearch container in an ES_DATA folder inside our project folder. The basic syntax for mounting volumes is `host_path:container_path`.
We will start the Elasticsearch service by running `docker-compose up` in our project directory.
Now open localhost:9200 in your web browser (or run `curl http://localhost:9200`) to check whether the Elasticsearch container is working.
Step 2 — Create new Elasticsearch indices
We will create two indices, autocomplete and hacker. The hacker index will use a template called search_engine_template.
Index templates allow you to define templates that are automatically applied when new indices are created. A template includes settings, mappings, and a simple pattern that controls whether the template should be applied to a new index.
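As a sketch of what create_new_index.py might do, here is a version that talks to Elasticsearch’s REST API with only the standard library (the original likely uses the elasticsearch Python client; the field names, analyzer settings, and ES 7 style typeless mappings are assumptions):

```python
# Sketch of create_new_index.py using Elasticsearch's REST API via the stdlib.
import json
import urllib.request

ES = "http://localhost:9200"

def search_engine_template():
    """Template applied automatically to any new index matching 'hacker*'."""
    return {
        "index_patterns": ["hacker*"],
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            "properties": {
                # Field names are assumptions based on the scraped course data.
                "title": {"type": "text"},
                "url": {"type": "keyword"},
                "tags": {"type": "keyword"},
            }
        },
    }

def autocomplete_index():
    """Index whose analyzer produces edge n-grams for search-as-you-type."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "autocomplete_filter": {
                        "type": "edge_ngram", "min_gram": 1, "max_gram": 20,
                    }
                },
                "analyzer": {
                    "autocomplete": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "autocomplete_filter"],
                    }
                },
            }
        }
    }

def put(path, body):
    """PUT a JSON body to Elasticsearch (requires the container to be running)."""
    req = urllib.request.Request(
        "%s/%s" % (ES, path),
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    put("_template/search_engine_template", search_engine_template())
    put("hacker", {})  # the template above is applied automatically
    put("autocomplete", autocomplete_index())
```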
When you run this create_new_index.py you should see output like the following:

```
2. Created a new template: search_engine_template
4. Created an index: hacker
6. Created a new index: autocomplete
```
Let’s check whether the new indices were created, for example with `curl localhost:9200/_cat/indices?v`.
Step 3 — Scraping websites with Python and Beautiful Soup, and ingesting data into Elasticsearch
We will scrape the tag pages of the hackr.io website, then ingest the scraped data into Elasticsearch.
Here is our Python scraper, which scrapes the data from hackr.io and ingests it into Elasticsearch:
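The scraper itself uses Beautiful Soup against the live site; as a dependency-free illustration of the parsing step, here is a sketch using the stdlib html.parser on invented sample markup (hackr.io’s real HTML structure and class names will differ, and the Elasticsearch ingestion is only indicated by a comment):

```python
# A dependency-free sketch of the scraping step. The class name "course-name"
# and the sample markup are invented for illustration only.
from html.parser import HTMLParser

class CourseParser(HTMLParser):
    """Collects the text of elements whose class attribute is 'course-name'."""

    def __init__(self):
        super().__init__()
        self.in_course = False
        self.courses = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("class") == "course-name":
            self.in_course = True

    def handle_data(self, data):
        if self.in_course and data.strip():
            self.courses.append(data.strip())
            self.in_course = False

SAMPLE = """
<ul>
  <li><span class="course-name">Learn Python the Hard Way</span></li>
  <li><span class="course-name">Flask Mega-Tutorial</span></li>
</ul>
"""

def scrape(html):
    parser = CourseParser()
    parser.feed(html)
    # In the real scraper each course would be indexed into Elasticsearch here,
    # e.g. one POST to http://localhost:9200/hacker/_doc per course document.
    return parser.courses
```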
When you run this scraper you will be able to see the data being scraped and ingested into Elasticsearch.
Step 4 — Building a Course Finder Search Engine from our Scraped Data
The directory structure of our Flask application looks roughly like the below:

```
.
├── app.py
├── requirements.txt
├── routes
│   ├── __init__.py
│   ├── __pycache__
│   │   └── search.cpython-36.pyc
│   └── search.py
├── static
│   ├── css
│   ├── fonts
│   ├── images
│   ├── js
│   └── scss
└── templates
```
First, install the dependencies needed to run our Flask application, which are listed in the requirements.txt file.
Install requirements using pip command:
pip install -r requirements.txt
Our Python Flask application will render our HTML files using Jinja templates.
In my case the main application file is named app.py. We set threaded=True to support multithreading in our Flask application, and we register a blueprint in app.py.
A blueprint defines a collection of views, templates, static files and other elements that can be applied to an application. For example, let’s imagine that we have a blueprint for an admin panel. This blueprint would define the views for routes like /admin/login and /admin/dashboard.
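As a sketch of this wiring (the blueprint and route names are assumptions, and the real view lives in routes/search.py rather than one file), a minimal app.py might look like:

```python
# app.py -- a minimal sketch; blueprint and route names are assumptions.
from flask import Blueprint, Flask, jsonify

search = Blueprint("search", __name__)

@search.route("/search")
def search_courses():
    # The real view queries Elasticsearch and renders a Jinja template;
    # a stub JSON response stands in here.
    return jsonify(results=[])

app = Flask(__name__)
app.register_blueprint(search)

if __name__ == "__main__":
    # threaded=True lets the development server handle concurrent requests.
    app.run(host="0.0.0.0", port=8005, threaded=True)
```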
Our route lives in routes/search.py.
Our index page is the HTML template this route renders.
If everything goes according to plan, you should be able to run your application, and it will listen on port 8005.
When you access your Endpoint on Port 8005, you should be able to see the main screen, which should look like this:
Step 5 — Creating a Dockerfile for your Flask Application
We will use the Gunicorn web server to deploy our application, and we can set the number of workers in the Gunicorn configuration file.
First, let us create a configuration for our Gunicorn web server called gunicorn_config.py
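A minimal gunicorn_config.py sketch (binding to port 8005 to match the app; the worker formula is the common Gunicorn rule of thumb, assumed here):

```python
# gunicorn_config.py -- picked up via `gunicorn -c gunicorn_config.py app:app`.
import multiprocessing

bind = "0.0.0.0:8005"                          # listen on the app's port
workers = multiprocessing.cpu_count() * 2 + 1  # common rule of thumb, an assumption
```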
Let’s create our Dockerfile:
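A Dockerfile sketch consistent with the Gunicorn setup (the base image tag and the app:app module path are assumptions):

```dockerfile
# Base image tag is an assumption -- match the Python version you develop on.
FROM python:3.6-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

EXPOSE 8005

# app:app assumes the Flask object is named `app` inside app.py.
CMD ["gunicorn", "-c", "gunicorn_config.py", "app:app"]
```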
Step 6 — Create a new repository on Docker Hub
Go to https://hub.docker.com/ and create a new repository where we will be pushing our docker image using Travis CI.
Step 7 — Using Travis CI to containerize our Flask application and push it to Docker Hub
First, create a public repository on GitHub. Then sign up at https://travis-ci.org/.
We have to activate our repository on the Travis CI site so that we can make use of the Travis CI/CD pipeline.
Go to https://travis-ci.org/account/repositories and activate your repository.
We then need to write a .travis.yml file in our project directory, containing the instructions to build our application image and push it to Docker Hub.
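A sketch of what that .travis.yml might contain (the image name hacker is an assumption; use the Docker Hub repository you created):

```yaml
language: python
services:
  - docker

script:
  # Image name "hacker" is an assumption -- use your Docker Hub repository name.
  - docker build -t $DOCKER_ID/hacker:latest .

after_success:
  - echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_ID" --password-stdin
  - docker push $DOCKER_ID/hacker:latest
```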
I have used the environment variables $DOCKER_ID and $DOCKER_PASSWORD which can be set on your Travis repository page.
Choose More options → Settings to set environment variables in Travis CI.
Now push your application to GitHub; Travis CI will build a Docker container for your application and push it to Docker Hub.
Step 8 — Adding our Flask application service to our docker-compose file
Here is our updated docker-compose file
```yaml
    environment:
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
```
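A sketch of the updated file, with the Flask service added alongside Elasticsearch (image names and tags are assumptions):

```yaml
version: "3"

services:
  elasticsearch:
    # Image tag is an assumption -- pin whichever Elasticsearch version you use.
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.0
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    volumes:
      - ./ES_DATA:/usr/share/elasticsearch/data
    networks:
      - frontend

  hacker:
    # Image name is an assumption -- the repository you created on Docker Hub.
    image: <DOCKER_ID>/hacker:latest
    ports:
      - "8005:8005"
    depends_on:
      - elasticsearch
    networks:
      - frontend

networks:
  frontend:
    driver: bridge
```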
Here the hacker service uses our Flask application image, which is pulled from Docker Hub and runs on port 8005.
We can run both the Elasticsearch and hacker services with `docker-compose up`, which will start the application.
Step 10 — Deploy our application to DigitalOcean
Step 10.1 — Creating a DigitalOcean Account
Create a DigitalOcean account by following this link.
Step 10.2 — Create a Droplet in DigitalOcean
Create a Droplet in DigitalOcean by following this link.
Step 10.3 — Connect to the remote server using SSH
Now connect to your remote server using the following command.
ssh <<USERNAME>>@<<IP Address>>
Now let us clone my application repository.
git clone https://github.com/dineshsonachalam/Building-a-search-engine-using-Elasticsearch
cd Building-a-search-engine-using-Elasticsearch
After installing Docker and Docker Compose on the server, run docker-compose up to start your container services.
Now hit <<IP>>:8005 in your browser to see your application running.
Step 11 — Buying a domain name from Freenom
After buying a domain name from Freenom, go to My Domains → Management Tools → Nameserver → Change Nameserver → Use custom Name server.
Add your domain name in DigitalOcean, copy the DNS record values of type NS (Name Server) from DigitalOcean, and paste them into the Freenom custom nameserver fields.
I have bought a new domain name, contentsea.tk, and mapped it to DigitalOcean.
Now hitting contentsea.tk:8005 reaches my container services; in your case, <YOUR_DOMAIN_NAME>:8005 will do the same.
Step 12 — Using an Nginx reverse proxy to run our application on our domain name
By default, our Nginx server listens on port 80. In the Nginx configuration we specify that requests to the contentsea.tk domain name should be forwarded to our hacker container service, which runs on port 8005.
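A reverse-proxy configuration along those lines might look like this sketch (the upstream name hacker assumes the Compose service name, so it only resolves when Nginx runs on the same Docker network):

```nginx
server {
    listen 80;
    server_name contentsea.tk;

    location / {
        # "hacker" is the Compose service name; Docker's DNS resolves it.
        proxy_pass http://hacker:8005;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```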
Update your docker-compose.yml file to run an Nginx container service.
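The added service might look like this fragment under the services block (the config file path is an assumption):

```yaml
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
    volumes:
      # Mount our reverse-proxy config; the host path is an assumption.
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
    depends_on:
      - hacker
    networks:
      - frontend
```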
Then run your application using sudo docker-compose up. You can see your application when you hit your domain name.
Step 13 — Creating a free SSL certificate for your application using Let’s Encrypt
Go to https://www.sslforfree.com, choose Manual Verification (DNS), and follow the instructions specified. Then download the SSL certificate files: ca_bundle.crt, certificate.crt, and private.key.
Now copy the contents of ca_bundle.crt and paste them at the bottom of certificate.crt, so the file contains the certificate followed by the CA bundle.
On your server, create a directory named certs in the root directory. Inside it, create a file called certificate.crt and paste in the combined certificate contents, then create a file called private.key and paste in the private key contents.
root@ubuntu-s-1vcpu-1gb-blr1-01:~# mkdir certs
root@ubuntu-s-1vcpu-1gb-blr1-01:~# cd certs/
root@ubuntu-s-1vcpu-1gb-blr1-01:~/certs# touch certificate.crt
root@ubuntu-s-1vcpu-1gb-blr1-01:~/certs# vi certificate.crt
root@ubuntu-s-1vcpu-1gb-blr1-01:~/certs# vi private.key
Nginx is an extremely efficient and quite flexible web server. When you want to do a redirect in Nginx, you have a few options to select from, so you can choose the one that suits you best to do a Nginx redirect.
Whenever a request arrives on port 80, it will be redirected to port 443, where HTTPS listens.
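A sketch of such a redirect-plus-TLS configuration, assuming the certs directory is mounted at /etc/nginx/certs inside the Nginx container:

```nginx
# Redirect all HTTP traffic to HTTPS.
server {
    listen 80;
    server_name contentsea.tk;
    return 301 https://$host$request_uri;
}

# Terminate TLS and proxy to the Flask container.
server {
    listen 443 ssl;
    server_name contentsea.tk;

    # Paths assume the certs directory is mounted into the container here.
    ssl_certificate     /etc/nginx/certs/certificate.crt;
    ssl_certificate_key /etc/nginx/certs/private.key;

    location / {
        proxy_pass http://hacker:8005;
        proxy_set_header Host $host;
    }
}
```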
Our updated docker-compose file now also maps port 443 and mounts the certs directory into the Nginx container.
Now run docker-compose up again; you should see HTTPS active when you hit your domain name.
Finally, check whether your HTTPS certificate is valid by inspecting the status code returned when you hit your domain name. If the certificate is valid, the request should return status code 200.
Here is a simple Python snippet to check the status code of your application.
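A minimal stdlib version of such a check might look like this (the original may differ, e.g. by using the requests library; substitute your own domain for mine):

```python
# Check that the site is reachable over HTTPS and returns status code 200.
# An invalid certificate would make urlopen raise an error instead.
import urllib.request

def status_code(url):
    """Return the HTTP status code for a GET request to the given URL."""
    with urllib.request.urlopen(url) as resp:
        return resp.status

if __name__ == "__main__":
    print(status_code("https://contentsea.tk"))
```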
Thanks for reading and good luck!