How Redis helped us scale our product

Pickyourtrail | Published in Pickyourtrail Tech | Jun 13, 2019

Are you using applications in your product which are built for web-scale? Do you witness terabytes of data flow with a mild lag on various devices day in and day out? We feel you!

Retrieving large volumes of data from the data store across devices is an expensive process. Every retrieval involves network I/O, CPU work, disk I/O and database round trips, all of which add to the response time. Data that has been retrieved once should therefore stay available to other services and access patterns without repeating that work. This is where a cache comes into the picture.

In simple words, a cache is a software/hardware component that stores data so that future requests for that data can be served faster; this data could be the result of an earlier computation or just data. It’s a hit when the requested data is found and a miss when it is not found in the cache.
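To make hits and misses concrete, here is a minimal sketch of a dict-backed cache that counts both. The class name `CountingCache` and the loader `load_city` are hypothetical names chosen for illustration; the loader simply stands in for a slow database query.

```python
class CountingCache:
    """A dict-backed cache that records hits and misses."""

    def __init__(self, loader):
        self._loader = loader   # called only when the key is not cached
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1      # hit: served straight from memory
            return self._store[key]
        self.misses += 1        # miss: fall back to the slow source
        value = self._loader(key)
        self._store[key] = value
        return value


def load_city(city_id):
    """Hypothetical stand-in for a database query."""
    return {"id": city_id, "name": f"city-{city_id}"}


city_cache = CountingCache(load_city)
city_cache.get(1)   # miss, loads from the "database"
city_cache.get(1)   # hit, served from the cache
city_cache.get(2)   # miss
print(city_cache.hits, city_cache.misses)  # 1 2
```

The second request for city 1 never reaches the loader, which is the whole point of caching.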

A local cache helps reuse previously accessed data to an extent, but it is local to the device it is stored on, is non-persistent, and cannot be used by other devices.

To solve this problem, we need systems that are distributed in nature, where components are located on various networked computers, which communicate and coordinate their actions by passing messages to one another. Basically, for a Distributed Architecture, we need a Distributed Cache.

What’s the effect of Distributed Architecture on Pickyourtrail?

The database stores all the static data entered through our internal Content Management System (CMS). This includes basic information that never changes, such as Regions, Countries, Cities, Activities, Hotels, etc.

When a Pickyourtrail page renders, the frontend displays data received as API responses from the backend. The response time of each of these APIs makes a big difference to how quickly every page renders. The backend team works day in and day out to write APIs with minimal response times. Even so, fetching the same resource from the database for many users can become an overhead, even when the requests execute in parallel (say, 1000 threads running at the same time).

Caching has become the de facto technology to boost application performance as well as reduce costs. The primary goal of caching is to alleviate bottlenecks that come with traditional databases. By caching frequently used data in memory — rather than making database round trips — application response times can be dramatically improved. The data stored in a distributed cache is quite simply whatever is accessed the most and can change over time if a piece of data hasn’t been requested in a while.
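The idea of keeping whatever is accessed most, and letting go of data that has not been requested in a while, can be sketched as a cache-aside lookup over a small least-recently-used (LRU) cache. This is an in-process illustration only, not how any particular caching server implements eviction; `LRUCache`, `fetch_hotel_from_db` and `get_hotel` are hypothetical names.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._items)


db_calls = []  # records every simulated database round trip


def fetch_hotel_from_db(hotel_id):
    """Hypothetical stand-in for a database query."""
    db_calls.append(hotel_id)
    return f"hotel-{hotel_id}"


def get_hotel(cache, hotel_id):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    value = cache.get(hotel_id)
    if value is None:
        value = fetch_hotel_from_db(hotel_id)
        cache.put(hotel_id, value)
    return value


hotel_cache = LRUCache(capacity=2)
for hotel_id in [1, 2, 1, 3, 1, 2]:
    get_hotel(hotel_cache, hotel_id)

print(db_calls)            # [1, 2, 3, 2]
print(hotel_cache.keys())  # [1, 2]
```

Six requests cost four database calls here: hotel 1 stays cached because it keeps being requested, while hotel 2 is evicted and has to be fetched again.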

Distributed caching helps lower capital/operating costs. You can run caching software like Memcached/Redis/Hazelcast on low-cost commodity servers. Caching servers also help reduce network I/O and server workload.

Before we go deeper into distributed caching, here is a brief primer on the caching techniques we already leverage.

On the Front End Web application we already have:

Browser Caching through Headers

Using Webpack to Optimize CSS & JS

Optimize Images without loss in quality

Local Storage of browser to store frequently used data

The various caching frameworks we’ve explored are:

Memcached: an open source, high-performance, distributed memory object caching system. It is generic in nature and intended for speeding up web applications by reducing database calls. Memcached is a simple, volatile cache server that lets you store key/value pairs where the value is a string of up to 1 MB (the default limit).

Redis: an open source, in-memory data structure store that is commonly used as a cache; some teams also use it as a database or as a message broker. It covers the capabilities of Memcached and outperforms it in some cases. Keys and string values can each be up to 512 MB in size. It is very fast, although in practice it is often limited by network or memory constraints. What sets Redis apart is its support for data structures such as sorted sets and hashes, along with a pub/sub mechanism. Redis can also be extended with Lua scripting to build custom functionality on the server, ranging from simple things like reading JSON values directly to URL shorteners.

Hazelcast: a clustered, in-memory data grid that manages data and distributes processing using in-memory storage and parallel execution, aimed at application speed and scale. Hazelcast can also be embedded in a Java host process, which makes it stand out: it is useful for building stateful microservices without an external database dependency.

What was the solution we came up with?

At Pickyourtrail, we chose Redis to meet our application’s caching and data structure needs.

Redis is not just a plain key-value store; it is a data structures server that supports different kinds of values. Conventional key-value stores let you associate string keys with string values, whereas in Redis the value is not limited to a simple string and can hold more complex data structures. On top of that, Redis has first-class support for these data structures out of the box.

In addition to plain strings, Redis supports the following data structures:

Lists: collections of string elements kept in insertion order; essentially linked lists.

Sets: collections of unique, unsorted string elements.

Sorted sets: similar to Sets, but every string element is associated with a floating-point value called its score. Elements are kept sorted by score.

Hashes: maps of fields to values. Both fields and values are strings.

Bit arrays (or simply bitmaps): using special commands, a String value can be handled as an array of bits.

HyperLogLogs: a probabilistic data structure used to estimate the cardinality of a set.

Streams: append-only collections of map-like entries.
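As an illustration of the sorted-set idea from the list above, the sketch below keeps one score per member and returns members ordered by score, with ties broken by member name. It is only a behavioural stand-in: it re-sorts on every read, which a real server would avoid, and the helper names `zadd` and `zrange_all` are hypothetical functions that merely echo the Redis commands ZADD and ZRANGE.

```python
scores = {}  # member -> score


def zadd(member, score):
    """Add a member, or update its score if it is already present."""
    scores[member] = float(score)


def zrange_all():
    """Return all members in ascending score order, ties broken by name."""
    ordered = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    return [member for member, _score in ordered]


zadd("paris", 3)
zadd("bali", 1)
zadd("rome", 2)
zadd("paris", 0)   # same member again: the score is replaced, not duplicated

ranking = zrange_all()
print(ranking)  # ['paris', 'bali', 'rome']
```

Because members are unique, re-adding "paris" moves it within the ordering instead of creating a second entry.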

Redis Strings

The String type is the simplest value you can link with a key.

Example 1: (SET/GET)

> set mykey somevalue
OK
> get mykey
"somevalue"

SET and GET are used to set and retrieve a string value.

Example 2: (MSET/MGET)

> mset a 10 b 20 c 30
OK
> mget a b c
1) "10"
2) "20"
3) "30"

The ability to set or retrieve the value of multiple keys in a single command is also useful for reduced latency.
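The latency benefit comes from the number of network round trips, which a small simulation can make visible. `CountingClient` below is a hypothetical in-process stand-in, not a real Redis client: it only counts how many requests would have crossed the network.

```python
class CountingClient:
    """In-process stand-in for a cache client that counts simulated round trips."""

    def __init__(self):
        self._data = {}
        self.round_trips = 0

    def set(self, key, value):
        self.round_trips += 1   # one request per key
        self._data[key] = str(value)

    def get(self, key):
        self.round_trips += 1   # one request per key
        return self._data.get(key)

    def mset(self, mapping):
        self.round_trips += 1   # one request carries every pair
        self._data.update({k: str(v) for k, v in mapping.items()})

    def mget(self, *keys):
        self.round_trips += 1   # one request, one reply with all values
        return [self._data.get(k) for k in keys]


one_by_one = CountingClient()
for key, value in {"a": 10, "b": 20, "c": 30}.items():
    one_by_one.set(key, value)
values_single = [one_by_one.get(k) for k in ("a", "b", "c")]

batched = CountingClient()
batched.mset({"a": 10, "b": 20, "c": 30})
values_batched = batched.mget("a", "b", "c")

print(one_by_one.round_trips, batched.round_trips)  # 6 2
```

Both approaches return the same values; the batched version simply pays the network cost twice instead of six times.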

Example 3: (EXISTS/DEL)

> set mykey hello
OK
> exists mykey
(integer) 1
> del mykey
(integer) 1
> exists mykey
(integer) 0

The EXISTS command returns 1 or 0 to signal whether a given key exists in the database, while the DEL command deletes a key and its associated value, whatever that value is.

Redis Expiry

With Redis expiry, you can set a timeout on a key, giving it a limited time to live (TTL). Once the TTL elapses, the key is purged automatically, just as if DEL had been called on it.

Example 1: (EXPIRE)

> set key some-value
OK
> expire key 20
(integer) 1
> get key (immediately)
"some-value"
> get key (after some time)
(nil)

The key disappeared between the two GET calls because the second call was made more than 20 seconds after the EXPIRE.
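The same behaviour can be sketched in a few lines of Python. The store below checks a key's deadline whenever the key is read, which is only one possible strategy and is not claimed to be how Redis does it internally. `ExpiringStore` and `FakeClock` are hypothetical names; the fake clock is there so the example does not have to sleep for 20 seconds.

```python
import time


class ExpiringStore:
    """Key-value store with per-key time to live, checked lazily on access."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}       # key -> value
        self._deadline = {}   # key -> absolute expiry time

    def _purge_if_expired(self, key):
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._deadline.pop(key, None)

    def set(self, key, value):
        self._data[key] = value
        self._deadline.pop(key, None)   # a plain set clears any earlier TTL

    def expire(self, key, seconds):
        """Return 1 if the timeout was set, 0 if the key does not exist."""
        self._purge_if_expired(key)
        if key not in self._data:
            return 0
        self._deadline[key] = self._clock() + seconds
        return 1

    def get(self, key):
        self._purge_if_expired(key)
        return self._data.get(key)


class FakeClock:
    """Manually advanced clock so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
store = ExpiringStore(clock)
store.set("key", "some-value")
store.expire("key", 20)
before = store.get("key")   # still within the 20 second window
clock.advance(21)
after = store.get("key")    # the TTL has elapsed, so the key is gone
print(before, after)  # some-value None
```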

Redis Lists

Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element in the head or in the tail of the list is performed in constant time.

Example 1: (LPUSH/ RPUSH/ LRANGE)

> rpush mylist A
(integer) 1
> rpush mylist B
(integer) 2
> lpush mylist first
(integer) 3
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"

The LPUSH command adds an element at the head (left) of the list, while the RPUSH command adds an element at the tail (right). The LRANGE command extracts a range of elements from a list; the range 0 -1 returns every element, since -1 refers to the last one.

Example 2: (RPOP / LPOP)

> rpush mylist a b c
(integer) 3
> rpop mylist
"c"
> rpop mylist
"b"
> rpop mylist
"a"

Popping an element both retrieves it from the list and removes it from the list in a single operation. You can pop elements from the left and from the right, just as you can push elements onto both ends of the list.
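For readers more at home in Python, `collections.deque` offers the same constant-time pushes and pops at both ends, so it makes a handy mental model for the commands above. This is an analogy only, with the corresponding Redis command noted in each comment.

```python
from collections import deque

mylist = deque()
mylist.append("A")            # RPUSH mylist A
mylist.append("B")            # RPUSH mylist B
mylist.appendleft("first")    # LPUSH mylist first

snapshot = list(mylist)       # LRANGE mylist 0 -1
print(snapshot)               # ['first', 'A', 'B']

tail = mylist.pop()           # RPOP mylist
head = mylist.popleft()       # LPOP mylist
print(tail, head)             # B first
print(list(mylist))           # ['A']
```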

Redis Hashes

Hashes are useful for representing objects, and there is no practical limit on the number of fields you can store in a hash.

Example 1: (HGET/ HMSET/ HGETALL)

> hmset user:1000 username antirez birthyear 1977 verified 1
OK
> hget user:1000 username
"antirez"
> hget user:1000 birthyear
"1977"
> hgetall user:1000
1) "username"
2) "antirez"
3) "birthyear"
4) "1977"
5) "verified"
6) "1"

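As seen above, HMSET sets multiple fields of the hash, while HGET retrieves a single field. HGETALL returns all fields and their values as a flat array of alternating field names and values. (Since Redis 4.0, HSET also accepts multiple field-value pairs, and HMSET is considered deprecated.)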
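The shape of these replies is easy to mirror with nested dictionaries. The sketch below is an in-process illustration, with hypothetical helper names `hset`, `hget` and `hgetall_flat`; it shows that values come back as strings and that an HGETALL-style reply alternates field names and values. The field order shown is simply Python's insertion order and should not be relied on with a real server.

```python
hashes = {}  # key -> {field: value}, with every value stored as a string


def hset(key, mapping):
    """Set several fields on the hash stored at `key`."""
    fields = hashes.setdefault(key, {})
    fields.update({field: str(value) for field, value in mapping.items()})


def hget(key, field):
    """Return one field, or None if the key or field is missing."""
    return hashes.get(key, {}).get(field)


def hgetall_flat(key):
    """Return [field1, value1, field2, value2, ...] like the reply above."""
    flat = []
    for field, value in hashes.get(key, {}).items():
        flat.extend([field, value])
    return flat


hset("user:1000", {"username": "antirez", "birthyear": 1977, "verified": 1})
username = hget("user:1000", "username")
everything = hgetall_flat("user:1000")
print(username)    # antirez
print(everything)  # ['username', 'antirez', 'birthyear', '1977', 'verified', '1']
```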

Redis Sets

Redis Sets are unordered collections of strings.

Example 1: (SADD/ SMEMBERS)

> sadd myset 1 2 3
(integer) 3
> smembers myset
1. 3
2. 1
3. 2

The SADD command adds new elements to a set. SMEMBERS returns all of its members in no particular order, as the output above shows.

Example 2: (SISMEMBER)

> sismember myset 3
(integer) 1
> sismember myset 30
(integer) 0

SISMEMBER checks whether an element is a member of the set, returning 1 if it is and 0 if it is not.

Redis Pub/Sub

SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe messaging paradigm, in which senders publish messages to channels without knowing who, if anyone, will receive them. Subscribers, in turn, receive messages from the channels they are interested in without knowing who the publishers are. This decoupling of sender and receiver allows for greater scalability and a more dynamic network topology.
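The decoupling can be sketched with a tiny in-process broker. `Broker` and the channel name "deals" are hypothetical, and the sketch ignores everything a networked server has to handle; it only shows that the publisher talks to a channel, never to a specific subscriber.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe hub keyed by channel name."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def unsubscribe(self, channel, callback):
        if callback in self._subscribers[channel]:
            self._subscribers[channel].remove(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers and return how many received it."""
        receivers = list(self._subscribers[channel])
        for callback in receivers:
            callback(message)
        return len(receivers)


broker = Broker()
inbox_a, inbox_b = [], []
on_a, on_b = inbox_a.append, inbox_b.append

broker.subscribe("deals", on_a)
broker.subscribe("deals", on_b)
first = broker.publish("deals", "Bali 20% off")    # reaches both inboxes

broker.unsubscribe("deals", on_b)
second = broker.publish("deals", "Maldives")       # reaches inbox_a only
nobody = broker.publish("news", "no one listens")  # delivered to no one

print(first, second, nobody)  # 2 1 0
print(inbox_a)                # ['Bali 20% off', 'Maldives']
print(inbox_b)                # ['Bali 20% off']
```

Messages published while nobody is subscribed are simply dropped in this sketch, which is why `inbox_b` never sees the second message.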

This was a very brief introduction to caching and the caching technologies we explored at Pickyourtrail. Keen to know more, or have thoughts on how caching can be improved? Please drop us a mail at vinayak@pickyourtrail.com and we’ll kickstart the discussion :)

Pickyourtrail is India’s leading online travel company that delivers tailor-made international holidays. Drop us a line at planners@pickyourtrail.com and get packing.