How to Make a Basic Search Engine for a Django Project -Haystack+ElasticSearch-

Although many Koreans write and speak English very well, I am not good at writing English, so there may be incorrect sentences and grammar. If there are serious mistakes, please give me feedback in the comments.

In this post, my goal is to build a basic search engine using django-haystack and ElasticSearch. I referred to the Haystack documentation a lot, so I link it below.

https://django-haystack.readthedocs.io/en/master/tutorial.html

Let’s start.

1) Install

Because ElasticSearch releases new versions so quickly, you have to check the versions carefully. Although the current ElasticSearch version is 6.x, the current Haystack release, 2.8.1, supports ElasticSearch only up to 2.x.

However, there is unreleased code in the Haystack GitHub repository that is compatible with ElasticSearch 5.x, so I will use that code.

Enter the command below in Git Bash.

git clone https://github.com/django-haystack/django-haystack.git

If you have not installed Git yet, please install it first. It’s not that difficult.

After cloning django-haystack, move the cloned folder into the folder where your other Python packages are installed. You can find that folder with the following steps.

import sys
sys.path
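
If it is hard to tell which entry of sys.path is the package folder, the standard-library sysconfig module can print it directly. This is only a small sketch; the variable name site_packages is my own choice.

```python
import sysconfig

# "purelib" is the directory where pure-Python packages are installed
# for the interpreter that runs this script (usually site-packages).
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

Run it with the same Python interpreter that your Django project uses, because each interpreter or virtual environment has its own package folder.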

You also have to install the Python ElasticSearch client library. Because you are going to use ElasticSearch 5.x, install it with a command like this.

$ pip install "elasticsearch>=5,<6"

The 5.x version of the Python ElasticSearch client will be installed. Now, it’s time to change settings.py.

2) settings.py

INSTALLED_APPS = [
    # add
    'haystack',
    'elasticsearch',
]
# Add this code
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch5_backend.Elasticsearch5SearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}

It’s not that difficult.
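
If ElasticSearch does not always run on the same host, the URL in the connection settings can be read from an environment variable. The following is a minimal sketch for settings.py; the variable name ELASTICSEARCH_URL is my own assumption, not something Haystack defines.

```python
import os

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch5_backend.Elasticsearch5SearchEngine',
        # ELASTICSEARCH_URL is a hypothetical variable name.
        # If it is not set, fall back to the local server used in this post.
        'URL': os.environ.get('ELASTICSEARCH_URL', 'http://127.0.0.1:9200/'),
        'INDEX_NAME': 'haystack',
    },
}
```

With this version, nothing changes on your own computer, and only the environment variable has to be set on another machine.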

3) ElasticSearch Install.

I will install ElasticSearch 5.x because the Haystack code is not compatible with newer versions.

If your OS is Linux, enter this command.

$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.4.tar.gz

Are you a Windows user? Click this link.

https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.4.tar.gz

ElasticSearch 5.6.4 will be downloaded. Extract the archive to finish the installation.

Now, add the path of your ElasticSearch bin folder to the PATH system environment variable.

[Image: adding the path in Environment Variables]

Run the elasticsearch command.

$ elasticsearch

Can you see “started” in the last line? Then ElasticSearch is running!
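
You can also check from Python that the server answers on the URL you wrote in settings.py. The sketch below uses only the standard library; the function names are my own, and it assumes the server listens on http://127.0.0.1:9200/ as configured above.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def parse_version(body):
    """Read the version number from the JSON that the ElasticSearch root URL returns."""
    return json.loads(body)["version"]["number"]


def elasticsearch_version(url="http://127.0.0.1:9200/", timeout=3):
    """Return the server version as a string, or None if the server cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return parse_version(resp.read().decode("utf-8"))
    except (URLError, OSError, ValueError, KeyError):
        return None
```

Calling elasticsearch_version() should return a version string that starts with 5 if the installation above worked, and None if the server is not running.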

3) Modify models.py

This part is not essential. I’m just adding one model to help you understand. If you have another model you want to use, use that instead.

from django.conf import settings
from django.db import models

class Note(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    pub_date = models.DateTimeField(auto_now_add=True)
    title = models.CharField(max_length=200)
    body = models.TextField()

    def __str__(self):
        return self.title

4) Create search_indexes.py

In your app directory, create a search_indexes.py file and write this code.

from haystack import indexes
from myapp.models import Note

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    # The document=True field is the main field that is searched.
    text = indexes.CharField(document=True, use_template=True, template_name='search/note_text.txt')
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

    def get_model(self):
        return Note

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.all()

This is the code that converts RDB data into index documents.

See ‘document=True’ in the text field. It means that the text field is the main field used when you search. Only one field in an index can have “document=True”.

get_model chooses the model. Because I want to index the Note model, I write return Note. So, this code converts rows of the Note model into index documents.

In the author and pub_date fields, you can see model_attr, which must be the name of a column in the Note model.

index_queryset is the function for filtering the objects to index. I write return objects.all() because I will index all objects.

In the text field above, I passed the template file path as template_name. You can freely choose the folder name and file name, but if possible, create a search folder in the templates folder and manage it separately.

5) Create Template file

The template file here is not an html file; it is a txt file. Its role is different from the role of the html template files. This file defines what content goes into the text field (document=True) of the search_indexes.py file above. If you do not understand it yet, let’s move on.

{{ object.title }}
{{ object.user.get_full_name }}
{{ object.body }}

In the file, write the code as above. Remember that title, user, and body are all columns in the Note model. When the indexing completes successfully, the text field will contain the title, the user’s full name, and the body contents.
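
To picture what ends up in the text field, here is a plain-Python sketch of the same idea. Haystack actually renders the txt file with the Django template engine; the function build_text and the sample values are hypothetical and only mimic the three lines of the template.

```python
def build_text(title, author_full_name, body):
    """Join the three values in the same order as the three lines of the template."""
    return "\n".join([title, author_full_name, body])


# Sample values, made up for illustration only.
text = build_text("Hot day", "Gildong Hong", "It is very hot today.")
print(text)
```

Because all three values are stored together in one field, a search can match a note by its title, by its author’s name, or by its body.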

6) Modify urls.py

You also need to add this code to urls.py.

url(r'^search/', include('haystack.urls')),

7) Create Search Template

Now, let’s make the html file. I just copied the code from the Haystack documentation. The file path and name is search/search.html.

<h2>Search</h2>
<form method="get" action=".">
    <table>
        {{ form.as_table }}
        <tr>
            <td>&nbsp;</td>
            <td>
                <input type="submit" value="Search">
            </td>
        </tr>
    </table>
    {% if query %}
        <h3>Results</h3>
        {% for result in page.object_list %}
            <p>{{ result.object.title }}</p>
        {% empty %}
            <p>No results found.</p>
        {% endfor %}
        {% if page.has_previous or page.has_next %}
            <div>
                {% if page.has_previous %}<a href="?q={{ query }}&amp;page={{ page.previous_page_number }}">{% endif %}&laquo; Previous{% if page.has_previous %}</a>{% endif %}
                |
                {% if page.has_next %}<a href="?q={{ query }}&amp;page={{ page.next_page_number }}">{% endif %}Next &raquo;{% if page.has_next %}</a>{% endif %}
            </div>
        {% endif %}
    {% else %}
        {# Show some example queries to run, maybe query syntax, something else? #}
    {% endif %}
</form>

If you have followed along, you will have the following file structure.

[Image: folder tree example]

If there is no problem, run the ‘rebuild_index’ command.

python manage.py rebuild_index

[Image: indexing command]

We are almost there. Now let’s make sure that the search function works. My current Note model contains the following data.

[Image: Note DB]

I’ll show you the search results. Go to the page 127.0.0.1:8000/search/ and search for ‘hot’.

[Image: search page]

The results matching the DB contents were returned. However, the search itself does not query the DB. It finds the index documents that contain “hot” and returns them.
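
You can confirm this by asking ElasticSearch directly, without Django. The sketch below uses only the Python standard library. It assumes the URL and index name from settings.py (http://127.0.0.1:9200/ and haystack) and the text field from search_indexes.py; the function names are my own.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_search_url(term, base_url="http://127.0.0.1:9200/", index="haystack"):
    """Build a URI search request that looks for the term in the text field."""
    return "{}{}/_search?q={}".format(base_url, index, quote("text:" + term))


def extract_sources(response_body):
    """Return the stored document (_source) of every hit in a search response."""
    data = json.loads(response_body)
    return [hit["_source"] for hit in data["hits"]["hits"]]


def search(term):
    """Send the request. This needs a running ElasticSearch server."""
    with urlopen(build_search_url(term), timeout=3) as resp:
        return extract_sources(resp.read().decode("utf-8"))
```

With the server running and the index built, search("hot") should return a list of dictionaries. Each dictionary is an index document, not a row from the DB.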


Because I am not good at writing English and this is my first post in English, there may be many errors in the sentences and contents. If there is any problem, please give me feedback in the comments.

Next time, I will write about how to make an autocomplete function with Haystack and ElasticSearch. Thank you.