Autocomplete search is a very common feature in search engines and in learning apps such as dictionaries. I recently built a prefix search API, and there are two common ways to implement autocomplete:
- Elasticsearch: prefix search
- Redis: sorted set
Elasticsearch
Elasticsearch is a powerful tool for building a search engine, but you usually need to tune its configuration for your application. This article only builds a very simple prefix search engine on the default configuration.
1. Installation:
- Mac
brew cask install adoptopenjdk
brew install elasticsearch
- Linux
sudo apt-get install elasticsearch
2. Elasticsearch properties:
- RESTful API
- JSON in / JSON out
- Multiple query approaches: prefix search, full-text search
- Distributed cluster
3. Prefix search steps (demonstrated with Postman; shown here as equivalent curl commands):
- Customize the settings and mapping if you need to, or start with the default configuration. It is enough for simple searches.
- Put index: create a new index, for instance movies
curl -X PUT "http://localhost:9200/movies/"
- Post data: save a document to that index (on Elasticsearch 7 and later, use _doc in place of the custom type movie)
curl -X POST "http://localhost:9200/movies/movie/" -H 'Content-Type: application/json' -d '
{
  "name": "Christopher Robin",
  "genre": "Comedy",
  "summary": "A working-class family man, Christopher Robin, encounters his childhood friend Winnie-the-Pooh, who helps him to rediscover the joys of life",
  "yearofrelease": 2018,
  "metascore": 60,
  "votes": 9648,
  "rating": 7.9
}'
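- Search data by prefix query (name.keyword is the exact-value sub-field that the default dynamic mapping creates for name, so the match is case-sensitive)
curl -X GET "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d '
{
  "query": {
    "prefix": { "name.keyword": "Th" }
  }
}'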
- Search data in a particular field
curl -X GET "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d '
{
  "query": {
    "multi_match": {
      "query": "planet",
      "fields": ["summary"]
    }
  }
}'
- You can also search by regular expression and with other query types.
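The same requests can also be sent from code instead of curl or Postman. Below is a minimal Python sketch that uses only the standard library. The helper names (prefix_query, multi_match_query, regexp_query, search) and the ES_URL constant are made up for this illustration, and the sketch assumes a local node on the default port with the movies index from the steps above.

```python
import json
import urllib.request

# Assumption: a local node on the default port, as in the curl commands.
ES_URL = "http://localhost:9200"


def prefix_query(field, prefix):
    """Build the body of a prefix query on one field."""
    return {"query": {"prefix": {field: prefix}}}


def multi_match_query(text, fields):
    """Build the body of a full-text match over the given fields."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


def regexp_query(field, pattern):
    """Build the body of a regular-expression query on one field."""
    return {"query": {"regexp": {field: pattern}}}


def search(index, body):
    """Send the body to /<index>/_search and return the matching documents."""
    request = urllib.request.Request(
        f"{ES_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Each hit carries the stored document under "_source".
    return [hit["_source"] for hit in result["hits"]["hits"]]
```

With a node running, search("movies", prefix_query("name.keyword", "Th")) should return the stored movies whose name starts with "Th". Keeping the query builders separate from the HTTP call makes it easy to swap in another query type without touching the request code.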
Redis: sorted set
Redis is mostly used as an in-memory cache or as a queue broker, but it is more powerful than you might imagine. Redis offers several data structures, and a sorted set can also be used to implement prefix search: members that share the same score are ordered lexicographically, so all words with a common prefix sit next to each other.
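To see why a sorted collection suits prefix search, here is a small sketch that needs no Redis server. A sorted Python list stands in for a sorted set whose members all have the same score; the word list and the function name words_with_prefix are invented for this example.

```python
import bisect

# A sorted list stands in for a sorted set whose members all share one score.
words = sorted(["apple", "apply", "apt", "banana", "band", "bandit", "can"])


def words_with_prefix(sorted_words, prefix, limit=10):
    """Return up to `limit` words that start with `prefix` from a sorted list."""
    # Position of the first word that is >= prefix.
    low = bisect.bisect_left(sorted_words, prefix)
    # "\uffff" sorts after any ordinary character, so this marks the end of the run.
    high = bisect.bisect_right(sorted_words, prefix + "\uffff")
    return sorted_words[low:high][:limit]


print(words_with_prefix(words, "ban"))          # ['banana', 'band', 'bandit']
print(words_with_prefix(words, "ap", limit=2))  # ['apple', 'apply']
```

Because matching words form one contiguous run, finding them only takes locating the two ends of that run. The Redis version uses the same idea, with marker members playing the role of the two ends.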
1. Prefix search steps
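- Add words to your sorted set
import bisect
import uuid

import redis

# Assumes a Redis server on localhost:6379; adjust for your setup.
redis_client = redis.Redis()

DICT_KEY = "complete"

def add_word(word):
    # Every word gets score 0, so Redis orders the members lexicographically.
    # redis-py 3.0+ takes a {member: score} mapping.
    redis_client.zadd(DICT_KEY, {word: 0})

- Insert pre-order and post-order keys
# Printable ASCII characters in ascending order.
PREFIX_SEARCH_CHARACTERS = '!"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'

def insert_order_keys(prefix):
    # Find the character that sorts just before the last character of the prefix.
    order_key = bisect.bisect_left(PREFIX_SEARCH_CHARACTERS, prefix[-1:])
    suffix = PREFIX_SEARCH_CHARACTERS[(order_key or 1) - 1]
    # A unique id keeps concurrent searches from clashing on marker names.
    _id = str(uuid.uuid4())
    start = prefix[:-1] + suffix + '{' + _id
    end = prefix + '{' + _id
    redis_client.zadd(DICT_KEY, {start: 0, end: 0})
    return start, end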
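The marker calculation is easier to follow with a concrete prefix. The sketch below repeats it without Redis and without the uuid suffix, so the result is deterministic. The names CHARACTERS and range_markers are hypothetical, and the character table is assumed to be the same printable ASCII range as in the code above.

```python
import bisect

# Assumed to match the table above: printable ASCII in ascending order.
CHARACTERS = "".join(chr(code) for code in range(33, 127))


def range_markers(prefix):
    """Return the (start, end) markers for a prefix, without the uuid suffix."""
    position = bisect.bisect_left(CHARACTERS, prefix[-1:])
    predecessor = CHARACTERS[(position or 1) - 1]
    return prefix[:-1] + predecessor + "{", prefix + "{"


start, end = range_markers("ab")
print(start, end)  # aa{ ab{

# Sorting the markers together with some words shows what they enclose.
members = sorted(["aa", "aaz", "ab", "abc", "abz", "ac", start, end])
inside = members[members.index(start) + 1 : members.index(end)]
print(inside)  # ['ab', 'abc', 'abz']
```

For the prefix "ab", the start marker "aa{" sorts after every word beginning with "aa" and the end marker "ab{" sorts after every word beginning with "ab", so exactly the "ab" words fall between them. This holds as long as the words only contain characters that sort before "{", such as letters and digits.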
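- Get prefix search results and remove redundant keys
def prefix_search(prefix, start, end):
    # Make sure both markers are in the set (a no-op if insert_order_keys already added them).
    redis_client.zadd(DICT_KEY, {start: 0, end: 0})
    pipeline = redis_client.pipeline(True)
    while True:
        try:
            # Retry if another client changes the sorted set in the meantime.
            pipeline.watch(DICT_KEY)
            start_index = pipeline.zrank(DICT_KEY, start)
            end_index = pipeline.zrank(DICT_KEY, end)
            # Return at most 10 words. Once both markers are removed, the words
            # occupy the indexes start_index .. end_index - 2.
            last_index = min(start_index + 10 - 1, end_index - 2)
            pipeline.multi()
            pipeline.zrem(DICT_KEY, start, end)
            pipeline.zrange(DICT_KEY, start_index, last_index)
            items = pipeline.execute()[-1]
            if last_index < start_index:
                items = []  # no word lies between the markers
            break
        except redis.exceptions.WatchError:
            continue
    # Filter out markers left by other searches running at the same time.
    return [item for item in items if b'{' not in item]

Conclusion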
This article presents only one approach; there are surely other ways to build autocomplete with different tools and programming languages. Still, there is no doubt that autocomplete is a hot topic in the web search world. It makes typing more convenient and stretches our imagination.
