How to Use Elasticsearch With Django
What is Elasticsearch?
Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.
What is Elasticsearch used for?
Elasticsearch lets you store, search, and analyze large volumes of data in near real time, typically returning answers in milliseconds. It achieves fast search responses because, instead of scanning the text directly, it searches an index built from that text.
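To make the idea of searching an index rather than the text itself more concrete, here is a minimal sketch of an inverted index in plain Python. It is a toy illustration only: the function names build_inverted_index and search are made up for this example, and a real Lucene index adds text analysis, scoring, and compression on top of this basic idea.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, term):
    """Look the term up in the index instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


# Made-up sample documents, keyed by id.
docs = {
    1: "Computer and Accessories",
    2: "Books and Magazines",
    3: "Computer Games",
}
index = build_inverted_index(docs)
print(search(index, "computer"))
print(search(index, "and"))
```

The lookup cost depends on the number of distinct terms, not on the total amount of text, which is why searching the index stays fast as the data grows.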
Elasticsearch — some basic concepts
- Index: a collection of different types of documents and document properties. For example, one index might hold the data of a social networking application.
- Type/Mapping: a collection of documents sharing a set of common fields present in the same index. For example, an index contains data of a social networking application; there can be a specific type for user profile data, another type for messaging data, and yet another one for comments data. Note that mapping types were phased out starting with Elasticsearch 6 and removed in later major versions, so this concept mainly applies to older clusters.
- Document: a collection of fields defined in the JSON format in a specific manner. Every document belongs to a type and resides inside an index. Every document is associated with a unique identifier, stored in the _id field.
- Field: Elasticsearch fields can include multiple values of the same type (essentially a list). In SQL, on the other hand, a column can contain exactly one value of the said type.
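As a rough illustration of these concepts, the sketch below builds one document as a Python dictionary and serializes it to JSON. The index name social_app, the document id, and the field names are all hypothetical; the point is only that a document is a JSON object and that a field such as interests can hold several values of the same type.

```python
import json

# Hypothetical index name and document id, used only for illustration.
index_name = "social_app"
doc_id = "1"

# A document is a JSON object; each key is a field.
document = {
    "username": "alice",
    "age": 30,
    # A single field may hold several values of the same type.
    "interests": ["django", "search", "python"],
}

body = json.dumps(document, sort_keys=True)
print(f"index={index_name} id={doc_id} body={body}")
```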
Using Elasticsearch with Django
Install and configure:
- Install Django Elasticsearch DSL: $ pip install django-elasticsearch-dsl
- Then add django_elasticsearch_dsl to INSTALLED_APPS.
- You must define ELASTICSEARCH_DSL in your Django settings. For example:
ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'localhost:9200'
    },
}
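If the host differs between environments, one option is to read it from an environment variable. This is only a sketch: the variable name ELASTICSEARCH_HOST is an assumption made for this example, not something the library defines.

```python
import os

# settings.py (sketch)
# ELASTICSEARCH_HOST is a hypothetical environment variable name;
# fall back to the local default when it is not set.
ELASTICSEARCH_DSL = {
    "default": {
        "hosts": os.environ.get("ELASTICSEARCH_HOST", "localhost:9200"),
    },
}
```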
Declare data to index:
- Then for a model:
# models.py
from django.db import models

class Category(models.Model):
    name = models.CharField(max_length=30)
    desc = models.CharField(max_length=100, blank=True)

    def __str__(self):
        return self.name
- To make this model work with Elasticsearch, create a subclass of django_elasticsearch_dsl.Document, create a class Index inside the Document class to define your Elasticsearch index name, settings, etc., and finally register the class using the registry.register_document decorator. The Document class must be defined in documents.py in your app directory.
# documents.py
from django_elasticsearch_dsl import Document
from django_elasticsearch_dsl.registries import registry
from .models import Category

@registry.register_document
class CategoryDocument(Document):
    class Index:
        # Name of the Elasticsearch index
        name = 'category'
        settings = {
            'number_of_shards': 1,
            'number_of_replicas': 0
        }

    class Django:
        # The model associated with this Document
        model = Category
        # The model fields to index
        fields = [
            'name',
            'desc',
        ]
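For readers unfamiliar with class decorators, the toy sketch below shows the general registration pattern in plain Python. ToyRegistry and the stand-in classes are invented for this example and say nothing about how django-elasticsearch-dsl implements its registry internally.

```python
class ToyRegistry:
    """Keeps track of document classes, keyed by the model they index."""

    def __init__(self):
        self.documents = {}

    def register_document(self, doc_cls):
        # Read the model from the nested Django class, then return the
        # class unchanged so it can still be used normally.
        self.documents[doc_cls.Django.model] = doc_cls
        return doc_cls


toy_registry = ToyRegistry()


class Category:
    """Stand-in for the Django model (no database involved)."""


@toy_registry.register_document
class CategoryDocument:
    class Index:
        name = "category"

    class Django:
        model = Category
        fields = ["name", "desc"]


print(toy_registry.documents[Category].Index.name)
```

Because the decorator runs when the module is imported, keeping document classes in a predictable module makes it easier for them to be discovered and registered.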
Populate:
- To create and populate the Elasticsearch index and mapping, use the search_index command: $ python manage.py search_index --rebuild
- For more help, run $ python manage.py search_index --help
- Now, when you do something like:
category = Category(
    name="Computer and Accessories",
    desc="abc desc"
)
category.save()
- The object will be saved in Elasticsearch too (using a signal handler).
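The signal-handler idea can be illustrated without Django or a running cluster. The following toy sketch is not the library's actual implementation: Signal, FakeCategory, and the in-memory fake_index dictionary are all stand-ins invented for this example. It only shows the pattern of a save hook that mirrors the saved object into a search index.

```python
class Signal:
    """Tiny stand-in for a Django-style signal (illustration only)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, instance):
        for receiver in self._receivers:
            receiver(instance)


post_save = Signal()
fake_index = {}  # stands in for the Elasticsearch index


class FakeCategory:
    """Stand-in for the Category model; save() fires the signal."""

    def __init__(self, pk, name, desc=""):
        self.pk, self.name, self.desc = pk, name, desc

    def save(self):
        # A real model would write to the database here.
        post_save.send(self)


def update_index(instance):
    """Receiver: copy the indexed fields into the fake index."""
    fake_index[instance.pk] = {"name": instance.name, "desc": instance.desc}


post_save.connect(update_index)

FakeCategory(1, "Computer and Accessories", "abc desc").save()
print(fake_index)
```

One consequence of this pattern is that writes which bypass save(), such as bulk updates, do not fire the hook, so a full rebuild of the index may still be needed from time to time.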
Search:
- To get an elasticsearch-dsl-py Search instance, use:
s = CategoryDocument.search().filter("term", name="computer")
# or
s = CategoryDocument.search().query("match", desc="abc")
for hit in s:
    print(
        "Category name : {}, description {}".format(hit.name, hit.desc)
    )
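It can help to see the JSON request bodies that a term filter and a match query correspond to. The sketch below builds them as plain dictionaries; the helper names term_filter_body and match_query_body are hypothetical and exist only in this example. With elasticsearch-dsl installed, calling to_dict() on a Search object should give a comparable structure, though that is not checked here.

```python
import json


def term_filter_body(field, value):
    """Request body for an exact-term filter inside a bool query."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


def match_query_body(field, text):
    """Request body for a full-text match query."""
    return {"query": {"match": {field: text}}}


print(json.dumps(term_filter_body("name", "computer")))
print(json.dumps(match_query_body("desc", "abc")))
```

A term filter compares the value as given, without analysis, whereas a match query runs the input through the field's analyzer first. That is why the term example uses the lowercase token "computer" rather than the original capitalized text.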
- To convert the Elasticsearch result into a real Django queryset, call to_queryset(). Be aware that this costs a SQL query to retrieve the model instances with the ids returned by the Elasticsearch query.
s = CategoryDocument.search().filter("term", name="computer")[:30]
qs = s.to_queryset()
# qs is just a Django queryset; it is called with order_by to keep
# the same order as the Elasticsearch result.
for cat in qs:
    print(cat.name)
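The point about ordering is worth a small illustration. A database lookup by a list of ids does not promise to return rows in that order, so the rows have to be re-sorted to match the search ranking. The sketch below shows the idea in plain Python; reorder_like_hits and the sample data are hypothetical, and the library applies the ordering on the queryset itself rather than in Python like this.

```python
def reorder_like_hits(hit_ids, rows):
    """Return rows sorted to follow the order of hit_ids."""
    position = {pk: pos for pos, pk in enumerate(hit_ids)}
    return sorted(rows, key=lambda row: position[row["id"]])


# Ids in relevance order, as a search might return them (made-up data).
hit_ids = [3, 1, 2]

# Rows as a database might return them: ordered by primary key.
rows = [
    {"id": 1, "name": "Computer and Accessories"},
    {"id": 2, "name": "Computer Books"},
    {"id": 3, "name": "Computer Games"},
]

for row in reorder_like_hits(hit_ids, rows):
    print(row["id"], row["name"])
```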
Who uses Elasticsearch?
- eBay: with countless business-critical text search and analytics use cases that rely on Elasticsearch as the backbone, eBay has created a custom ‘Elasticsearch as a Service’ platform to allow easy Elasticsearch cluster provisioning on its internal OpenStack-based cloud platform.
- Facebook: has been using Elasticsearch for 3+ years, having gone from a simple enterprise search to over 40 tools across multiple clusters with 60+ million queries a day and growing.
- Uber: Elasticsearch plays a key role in Uber’s Marketplace Dynamics core data system, aggregating business metrics to control critical marketplace behaviors like dynamic (surge) pricing and supply positioning, and to assess overall marketplace diagnostics, all in real time.
- GitHub: uses Elasticsearch to index over 8 million code repositories, as well as critical event data.
- Microsoft: uses Elasticsearch to power search and analytics across various products, including MSN, Microsoft Social Listening, and Azure Search.
- Just Eat: Elasticsearch increases delivery radius accuracy, as it can be used to define more complex delivery routes, and provides real-time updates whenever a restaurant makes a change.