How to make up-selling recommendations based on cart using Redis
Being able to suggest additional products on an eCommerce is powerful feature. On a typical product page, you’d like to show others products directly related or similar, and usually that’s not so hard (it could be though). But when user is finally on the cart’s summary page, we’d like to show (if possible) products based on the whole cart’s content: something inspired by amazon’s famous section “people who bought those items also bought”.
When current user did add multiple specific items to his cart, we suggest him with other items included in previous orders ordered by popularity. Of course recommending is far from exact science, but matching other carts with same items could help us find really relevant items especially with large catalog.
Here is an example that we will use in this tutorial. It’s basically just a small catalog, a user’s current cart and a small order history:
We will split the problem in two distinct parts:
- finding all previous orders matching the current cart’s product list
- getting other products from those carts aggregated, counted and ordered by frequency
As typical shopping cart system is generally implemented with a relational database, extracting recommendations directly from it is perfectly feasible. But as you will need for each recommendation the complete orders history, and depending on its volume, you could quickly face performance issues.
As an alternative, let’s see how Redis can help us out writing a fast recommendation engine without impacting our relational database.
Finding all matching previous orders
As previously said, getting results from your existing database could work. For our example, data model could be as simple as two linked tables:
So finding those orders with SQL could look like this:
select id
from orders
where exists ( select 1
from order_items
where order_id = orders.id
and order_items.id = 'itemA')
and exists ( select 1
from order_items
where order_id = orders.id
and order_items.id = 'itemB' )
To help us solve this problem with Redis, we will focus on one its data structure: Redis sets. Redis Set is a really simple unordered collection of strings. As Redis is a totally external system, before any operation you will need to duplicate (and keep up-to-date) some data outside your transactional database straight into Redis. For every item of your catalog, you will create one dedicated “redis set”, and associate all related orders to it. Redis only works with string so you’ll need to use a unique reference/id for each item and order.
Considering the previous example with 4 orders, this is how we would initialize all our sets:
127.0.0.1:6379> SADD itemA order1 order2 order3
(integer) 3
127.0.0.1:6379> SADD itemB order1 order2 order4
(integer) 3
127.0.0.1:6379> SADD itemC order1 order2
(integer) 2
127.0.0.1:6379> SADD itemD order2
(integer) 1
127.0.0.1:6379> SADD itemE order3 order4
(integer) 2
After each new order, you’ll need to update some of these sets depending of the order’s content. For example, if someone make a 5th order with itemA and itemD, you’ll need to update these two sets:
127.0.0.1:6379> SADD itemA order5
(integer) 1
127.0.0.1:6379> SADD itemD order5
(integer) 1
Now, redis have this really great feature that let you find the intersection of a list of multiple sets: SINTER. So for our example, we can find all existing orders which have both itemA and itemB (current cart’s content):
127.0.0.1:6379> SINTER itemA itemB
1) "order1"
2) "order2"
Here’s a visual representation of this operation. You can notice that for the current cart in our example, sets for itemC, itemD and itemE are not used at all.
We now know that there are two existing orders matching our current user cart’s content. In the next step we’ll use this result as input to search for possible recommendations.
Extracting most frequent items to recommend
Now that we know the short list of orders we need let’s how to get items we want to recommend. With relational database, getting those items could look like this:
select count(*), item_id
from order_items
where order_id in ('order1', 'order2')
and item_id not in ('itemA', 'itemB') //remove current cart’s items
group by item_id
order by count(*) desc
Again, to solve this problem with Redis, we’ll need another piece of data outside your relational database: for each order we’ll store a key with every ordered items. We will use another data structure with just a little more features than before: Redis Sorted Sets.
Let’s initialize sets with complete order history:
127.0.0.1:6379> ZADD order1 1 itemA 1 itemB 1 itemC
(integer) 3
127.0.0.1:6379> ZADD order2 1 itemA 1 itemB 1 itemC 1 itemD
(integer) 4
127.0.0.1:6379> ZADD order3 1 itemA 1 itemE
(integer) 2
127.0.0.1:6379> ZADD order4 1 itemB 1 itemE
(integer) 2
Of course, for any new orders, you will need to create new sets. For example, if someone make a 5th order with itemA and itemD:
127.0.0.1:6379> ZADD order5 1 itemA 1 itemD
(integer) 2
Sorted sets are pretty similar to Sets, but each element inside a set has a score (we won’t really use it). Using the short list of orders we got from previous step, we are able to group, aggregate and count all items in those different orders. Redis just won’t let us do this in a unique step. We need 3 successive operations.
First, we will group all orders from previous operation (only order1 & order2 in our case) with: ZUNIONSTORE. It will produce a list of items with aggregated count. As said before we don’t need scores, so we will use all default parameters for both WEIGHTS option (1) and AGGREGATE (Sum):
127.0.0.1:6379> ZUNIONSTORE cart17 2 order1 order2
(integer) 4
As the name of the previous function might suggest, we did store the result of our aggregation in a new destination key “cart17”. Even if we won’t need to store this result for a long time, we still need a destination to perform other operations right after. As we don’t want current cart’s items in recommended results, it seems a good option to remove them early on:
127.0.0.1:6379> ZREM cart17 itemA itemB
(integer) 2
Now we can extract the final result with ZREVRANGEBYSCORE. We need reverse order because the higher score are for items with higher frequency, and we want those items first. You won’t need the WITHSCORES option, but it is useful to check while investigating results. ItemC wins with score of 2 because it was counted twice: in order1 and order2.
127.0.0.1:6379> ZREVRANGEBYSCORE cart17 +inf 1 LIMIT 0 10 WITHSCORES
1) "itemC"
2) "2"
3) "itemD"
4) "1"
Bingo ! With the final help of the transactional database because Redis only gives you an identifier for the items, we can now show some nice recommendations on the cart page.
Maybe the destination key “cart17” can be used as a cache or something, but generally we won’t need it anymore so it is possible to free it entirely from Redis:
127.0.0.1:6379> DEL cart17
Bonus: Some more advanced techniques exist. It is also possible to use transactions with MULTI operations to perform all the operations on sorted sets as a whole and without keeping a temporary key for storing result.
Conclusion
As said, getting recommendations directly from your relational system is pretty easy to setup but could not fit very well with your existing database system. Depending on volumes, cons could be:
- unsatisfying execution times
- execution cost adds overhead on your transactional database
On the other side, Redis, as an external system has serious advantages:
- get quick results even at scale
- no complex queries
But with obvious cons:
- add redis to the stack (if not existing yet)
- load data inside redis and keep it updated
That’s all folks !