Adding product personalization to our Magento eCommerce
After migrating our eCommerce platform to the cloud with near zero downtime in our previous post, we started thinking about how we could take advantage of our new architecture to create a more customer-centric experience.
In this blog, we will cover how we utilised machine learning (with almost no knowledge of it) to add personalized product recommendations to our eCommerce in order to meet the individual needs of our customers.
I Love My Local Farmer is a fictional company inspired by customer interactions with AWS Solutions Architects. Any stories told in this blog are not related to a specific customer. Similarities with any real companies, people, or situations are purely coincidental. Stories in this blog represent the views of the authors and are not endorsed by AWS.
Going down the rabbit hole
Product personalization is a popular feature used within eCommerce applications, but we’re a small team that has never had the in-house resources to implement it ourselves. Adding personalized recommendations for every individual user would take days, and we’d have to do it over and over again as user habits change. And let’s not forget that none of us really has a clue about machine learning. (Apart from seeing all the Matrix movies, but I think that may be a bit beyond what we’re trying to achieve here!) As a result of these constraints, we needed a managed solution we could ideally forget about once implemented.
One of the great things about Magento is that when we don’t want to do everything ourselves, we can always look in the Magento Marketplace. A quick search for ‘personalized product recommendations’ unearthed a number of solutions, with one in particular standing out as we’d already met the single prerequisite of having an AWS account.
The extension we found uses a service called Amazon Personalize. We did some further research into this to see if it would be enough to make a noticeable difference to our eCommerce (and to see if we could actually use it without a great deal of training) and were pleasantly surprised to find we didn’t need any machine learning (ML) expertise to get started. Additionally, we’d not only be able to use existing past data to set up recommendations, but Personalize could also make recommendations in real-time as events occur.
Given the nature of our business, we will sometimes lose products when no one has purchased them before the sell-by date. Whilst Amazon Personalize pushes products based off the user interactions by default, there is also an option that allows us to add our own objective. By choosing sell-by date as an additional objective, we could prioritise produce that is closer to going off in the recommender, encouraging users to buy it before we lose that revenue. The only problem with this is we’d have to calculate whether or not the lost revenue from an expired product is greater than the potential lost revenue from not recommending the most relevant products to a user.
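To make that trade-off a little more concrete, here is a rough back-of-the-envelope sketch. Every name and number in it is a made-up assumption for illustration (we don't have real figures yet), and it deliberately ignores effects such as customer satisfaction.

```python
def net_benefit_of_expiry_objective(
    units_expiring: int,
    unit_price: float,
    extra_sell_through: float,
    baseline_rec_revenue: float,
    relevance_drop: float,
) -> float:
    """Estimate whether prioritising near-expiry produce pays off.

    All inputs are hypothetical estimates we would have to measure:
    - extra_sell_through: share of expiring units the recommender helps sell
    - relevance_drop: share of recommendation revenue lost because the
      recommendations become less relevant to the user
    """
    # Revenue we would otherwise have written off as expired stock.
    recovered = units_expiring * unit_price * extra_sell_through
    # Revenue we give up by not showing the most relevant products.
    lost = baseline_rec_revenue * relevance_drop
    return recovered - lost


# Illustrative numbers only: a positive result favours the extra objective.
benefit = net_benefit_of_expiry_objective(
    units_expiring=200,
    unit_price=2.50,
    extra_sell_through=0.25,
    baseline_rec_revenue=4000.0,
    relevance_drop=0.05,
)
print(f"Estimated net benefit: {benefit:.2f}")
```

With these invented inputs the result is negative, which is exactly the kind of outcome we would want to know about before switching the objective on.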
There are also the customer experience improvements to consider. Personalize is capable of generating recommendations even if it’s a customer’s first time using our website. And they do say first impressions count.
Excited by all the potential improvements Amazon Personalize could bring, we decided to install the extension to see if we thought the recommendations would actually bring value.
A Glitch in the Matrix (or AWS Configuration)?
The installation and configuration instructions provided were pretty straightforward. We were able to add it to the composer.json file before entering the licence key we were given with the extension and a few details about our AWS account. And then we had to wait a few hours for all the data we’d collected in the past 6 months to train a model for the first time.
Below is the complete configuration form (and yes, that really was everything):
Whilst this process was simpler than expected, it didn’t exactly go to plan the first time around.
The training process started out by exporting a bunch of interactions from the database into CSV files. Then, nothing happened. My initial thought was that I had entered some of the AWS account details wrong on the configuration page, but the process was still getting stuck after I had double-checked everything. As this was being done on our development site, I started to compare it with our production environment and realised that cron was set to only run manually in development. After manually running cron, the CSV files were finally uploaded to S3 so the data could be accessed by Amazon Personalize.
Building the Recommender
While waiting for the training process to complete, I took a deeper dive into what these generated CSV files were doing. Each CSV file is simply a dataset formatted in a way Amazon Personalize can understand. The extension defines a schema describing the structure of each dataset, which cannot be altered once the dataset has been created. So if we wanted to add any fields, we’d have to delete everything and start over.
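For a feel of what such a schema looks like, here is a minimal sketch of an interactions schema in the Avro-style JSON format that Amazon Personalize expects. This is not the extension's actual schema (we haven't reproduced it here); the field list is an assumption based on the fields Personalize commonly uses for interactions data.

```python
import json

# Assumed, minimal interactions schema. The extension's real schema may
# contain more fields; this only illustrates the general shape.
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},    # Unix epoch seconds
        {"name": "EVENT_TYPE", "type": "string"},  # e.g. a view or purchase
    ],
    "version": "1.0",
}

# The header row of the matching CSV file uses the same field names.
csv_header = ",".join(field["name"] for field in interactions_schema["fields"])

print(json.dumps(interactions_schema, indent=2))
print(csv_header)
```

Because the schema is fixed once the dataset exists, it is worth deciding up front which fields the CSV export needs to carry.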
The data used to train the model can be split into two sections:
- Historical Data. The extension pulls the product catalog, along with what products users are viewing and buying, straight from the database. Only data from the past 6 months is used to keep it relevant. The data is split into 3 different datasets: items, users and interactions (with interactions being the most important here). The interactions dataset doesn’t just use purchase history; it also contains data on what gets added to users’ carts, or even wish lists. This allows the recommender to essentially remind a user (and those with similar interactions) of products they may have forgotten they even wanted. The interactions dataset also includes anonymous user interactions, so we aren’t missing out on potential new customer conversions.
- Real Time Tracking. Once the initial model training is complete and we have our recommender, Amazon Personalize is able to continuously accept event data on product views and purchases as they happen. This means the model automatically stays up-to-date, but only for the product catalog initially used to train.
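The extension handles the real-time tracking for us, so we never had to write this ourselves. Purely as an illustration, a single product view could be sent to Amazon Personalize roughly like the sketch below. It uses the `put_events` call of the boto3 `personalize-events` client; the tracking ID, user, session and item values are placeholders, and the event type name is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_put_events_request(
    tracking_id: str,
    user_id: str,
    session_id: str,
    item_id: str,
    event_type: str,
    sent_at: Optional[datetime] = None,
) -> dict[str, Any]:
    """Build the keyword arguments for a single-event put_events call."""
    return {
        "trackingId": tracking_id,
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [
            {
                "eventType": event_type,
                "itemId": item_id,
                "sentAt": sent_at or datetime.now(timezone.utc),
            }
        ],
    }


def send_event(request: dict[str, Any]) -> None:
    """Send the event. Needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("personalize-events")
    client.put_events(**request)


# Placeholder values for illustration only.
request = build_put_events_request(
    tracking_id="example-tracking-id",
    user_id="customer-42",
    session_id="session-abc",
    item_id="sku-apples-1kg",
    event_type="view",
)
print(request["eventList"][0]["eventType"], request["eventList"][0]["itemId"])
```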
Interestingly, Amazon Personalize doesn’t support field types such as arrays and maps. Multi-value fields in a dataset such as categories become long string values separated by the pipe character.
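As a small sketch of what that flattening looks like, the snippet below joins a product's categories with the pipe character and writes them as one CSV column. The `CATEGORIES` column name and the sample products are hypothetical; the extension's real export may differ.

```python
import csv
import io
from typing import Iterable


def encode_multi_value(values: Iterable[str]) -> str:
    """Join several values into one pipe-separated string."""
    cleaned = [value.strip() for value in values if value.strip()]
    for value in cleaned:
        if "|" in value:
            # A pipe inside a value would be misread as a separator.
            raise ValueError(f"value contains the separator: {value!r}")
    return "|".join(cleaned)


def items_csv(rows: Iterable[tuple[str, list[str]]]) -> str:
    """Render (item_id, categories) pairs as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ITEM_ID", "CATEGORIES"])  # hypothetical column names
    for item_id, categories in rows:
        writer.writerow([item_id, encode_multi_value(categories)])
    return buffer.getvalue()


print(items_csv([
    ("sku-apples-1kg", ["Fruit", "Organic"]),
    ("sku-carrots-500g", ["Vegetables"]),
]))
```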
The following high-level diagram shows how we obtain interactions data to create a recommender in Amazon Personalize:
After the model had been trained, recommendations were already being displayed on product pages and a new widget type called ‘Personalize Display’ became available for insertion on CMS pages.
I interacted with some products to try it out, adding a few pieces of fruit to my basket and could see my recommendations change.
Having seen the recommender in action within my development environment, I knew this was something we would want our customers to experience. But would it actually influence their purchases? Only time would tell. But we have to bear in mind that deploying Amazon Personalize fully would be a really expensive risk. We don’t know how our customers are going to respond to it and the pay-as-you-go pricing model could result in us suddenly accumulating huge costs that aren’t reflected in our revenue.
Luckily, we’re not thrown in at the deep end here. In my opinion, one of the best features of this extension is its A/B split testing options. These allow us to show the recommendations to a certain percentage of customers and compare the results. The available splits are fairly limited (50/50, 25/75), but I’d rather have limited options than none any day. To ease ourselves in, we plan to deploy this to our production environment with Personalize enabled for 25% of users, and to see how our revenue for that 25% compares to the rest. Depending on how everything goes, we can staircase up to eventually using Personalize for 100% of our customers (fingers crossed)!
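We don't know how the extension decides which customers land in which group. One common way to do a stable split, sketched here as an assumption rather than a description of the extension, is to hash the customer ID so the same customer always sees the same variant. The function name and salt are hypothetical.

```python
import hashlib


def in_personalize_group(
    customer_id: str, percent: int = 25, salt: str = "personalize-ab"
) -> bool:
    """Return True if this customer falls into the Personalize test group.

    Hashing (rather than random choice per request) keeps the assignment
    stable across visits, which makes revenue comparisons meaningful.
    """
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < percent


# Rough check of the split over some made-up customer IDs.
customers = [f"customer-{n}" for n in range(10_000)]
share = sum(in_personalize_group(c) for c in customers) / len(customers)
print(f"Share in Personalize group: {share:.1%}")
```

Staircasing up later would then only mean raising `percent`, and customers already in the test group would stay in it.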
It’s worth bearing in mind that having a recommender isn’t the cheapest thing in the world (especially with a real-time element), so we need to be sure that it brings enough value. Everything looks great in our development environment, so hopefully the A/B split testing reveals that our customers are being influenced positively.
We also have to consider that our product catalog is changing all the time, so interactions data can get outdated very quickly if we don’t stay on top of it. We can import new items to our dataset as our catalog grows, but this is limited to 10 at a time. And we need to make sure that new products don’t get buried if they’re perceived as less popular by Personalize.
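If we ever had to push new products ourselves, the 10-item limit means splitting the catalog changes into batches. The sketch below shows one way to do that with the `put_items` call of the boto3 `personalize-events` client. The dataset ARN, the property names and the helper functions are placeholders of our own, and the client is passed in so the batching can be tried without AWS access.

```python
import json
from typing import Any, Iterator

MAX_ITEMS_PER_CALL = 10  # the per-request limit mentioned above


def to_put_items_entry(item_id: str, properties: dict[str, Any]) -> dict[str, str]:
    """Shape one product the way put_items expects (properties as a JSON string)."""
    return {"itemId": item_id, "properties": json.dumps(properties)}


def batched(
    items: list[dict[str, str]], size: int = MAX_ITEMS_PER_CALL
) -> Iterator[list[dict[str, str]]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_new_items(dataset_arn: str, items: list[dict[str, str]], client: Any) -> int:
    """Send items in batches and return the number of calls made.

    `client` is expected to be boto3.client("personalize-events").
    """
    calls = 0
    for batch in batched(items):
        client.put_items(datasetArn=dataset_arn, items=batch)
        calls += 1
    return calls


# Made-up products: 23 new items need three calls (10 + 10 + 3).
new_items = [
    to_put_items_entry(f"sku-{n}", {"categories": "Fruit|Organic"})
    for n in range(23)
]
print([len(batch) for batch in batched(new_items)])
```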
Additionally, we’ve chosen a solution that, despite being easier to implement, allows less flexibility. So if we did want to start adding custom fields to the recommender, we’d have to extend the plugin ourselves, which isn’t always maintainable.
I have to say I was a bit terrified going into this as it was all unknown territory. But it’s been pretty easy to get everything set up, apart from a couple of small snags along the way. And even once everything’s up and running, the real-time element allows us to be pretty hands off. There’s also the huge potential for us to increase revenue depending on the A/B testing results (another thing we only had to click a couple of buttons for), and hopefully it improves the shopping experience for our customers by letting them think a bit less.
Personalized recommendations are just one of the many ways we can take our eCommerce application to the next level. Keep an eye out for our next blog that will cover more ideas on what it takes to build a next-generation solution.