How Moovup recommends jobs with Amazon Personalize

Published in Moovup · 6 min read · Mar 11, 2023

Moovup is an online platform for front-line jobs, such as retail, restaurant, and logistics jobs. A personalized job search experience is key to showing users jobs that match their interests. For example, if a part-time waiter is searching for jobs on Moovup, we recommend food and beverage jobs to them. These personalized jobs are shown in:

The “Recommended jobs” (推介好工) section on the homepage
Job searching page

Personalized job ads in the job searching page

To implement this feature, we integrated Amazon Personalize into our backend. Amazon Personalize is a machine learning service that helps developers add personalized recommendations to a website without requiring machine learning expertise.

In this article, we share how we developed a personalized job recommendation feature with Amazon Personalize. Showing job recommendations to our users takes two main steps:

Create a training model in Amazon Personalize.
Build a data pipeline to connect with Amazon Personalize.

The data pipeline needs to continuously:

Send users’ interaction data to our training model in Amazon Personalize.
Fetch each user’s personalized job recommendations and show them on the front end.

Part 1: Create a training model in Amazon Personalize

Workflow of creating a training model in Amazon Personalize

Let’s begin with creating a training model. The diagram above shows the steps of creating a training model in Amazon Personalize. In Amazon Personalize, a training model is called a “solution version”. Here are the details for each step:

1. Create a dataset group. It is a container for all of our resources, such as datasets, solutions and campaigns.

2. Create three datasets: items, users and interactions. They are containers for our item, user and interaction data, which are used for training our model.

*Three datasets we created in Amazon Personalize*

3. Upload our item, user and interaction data in CSV format to an Amazon S3 bucket. In our case, these are job data, user data and users’ click data. Amazon Personalize later fetches the files from Amazon S3 to train the model.

4. Create dataset import jobs to import those CSV files. Specify the Amazon S3 location of the files in the console. Meanwhile, create schemas that fit our datasets.

The fields in a dataset’s schema should match the column headers in our CSV file. Taking the items dataset as an example, we have fields such as JOB_NAME and JOB_TYPE.

Additionally, we mark these fields as categorical or textual by setting the corresponding property to true, so that Amazon Personalize uses them when training the model. JOB_NAME is unstructured free text with an unbounded set of values, so we mark it as textual. JOB_TYPE has a fixed set of values, part-time or full-time, so we mark it as categorical.

{
 "fields": [
  {
   "name": "JOB_TYPE",
   "type": [
    "null",
    "string"
   ],
   "categorical": true
  },
  {
   "name": "JOB_NAME",
   "type": [
    "null",
    "string"
   ],
   "textual": true
  },
        ...
 ]
}
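As a minimal sketch of how such a schema could be registered programmatically, the snippet below builds a complete items schema and checks that a CSV header covers every schema field. The record wrapper (`type`, `name`, `namespace`, `version`) and the `ITEM_ID` field are what Amazon Personalize expects of an items schema; only the two metadata fields named above are included because the rest of our field list is elided. The helper names are hypothetical, and `personalize` is assumed to be a client such as `boto3.client("personalize")`.

```python
import csv
import io
import json


def build_items_schema():
    """Return an Avro schema dict for the items dataset (illustrative fields only)."""
    return {
        "type": "record",
        "name": "Items",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
            {"name": "ITEM_ID", "type": "string"},
            {"name": "JOB_TYPE", "type": ["null", "string"], "categorical": True},
            {"name": "JOB_NAME", "type": ["null", "string"], "textual": True},
        ],
        "version": "1.0",
    }


def missing_columns(csv_text, schema):
    """List schema fields that do not appear in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [f["name"] for f in schema["fields"] if f["name"] not in header]


def register_schema(personalize, name, schema):
    """Create the schema in Amazon Personalize and return its ARN.

    `personalize` is assumed to expose boto3's create_schema(name=..., schema=...).
    """
    response = personalize.create_schema(name=name, schema=json.dumps(schema))
    return response["schemaArn"]
```

Running `missing_columns` on the first line of each CSV before starting an import job is a cheap way to catch a header that has drifted from the schema.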

5. Create a solution and a solution version, which means training a model. We use the User-Personalization recipe for the training. To optimize the training model, we can enable HPO (hyperparameter optimization) by adding hpoConfig to the solution config. Amazon Personalize then runs multiple training jobs with different hyperparameter values within the ranges we specify. The trade-off is a longer training time, which means a higher cost.
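The HPO settings described above could be expressed roughly as follows with boto3. This is a sketch under assumptions: the job counts and the `hidden_dimension` range are illustrative values rather than the ones we used, the function names are hypothetical, and `personalize` is assumed to be `boto3.client("personalize")`.

```python
USER_PERSONALIZATION_RECIPE = "arn:aws:personalize:::recipe/aws-user-personalization"


def build_solution_request(name, dataset_group_arn, max_training_jobs=10):
    """Build keyword arguments for create_solution with HPO enabled."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": USER_PERSONALIZATION_RECIPE,
        "performHPO": True,
        "solutionConfig": {
            "hpoConfig": {
                # The API expects these two limits as strings.
                "hpoResourceConfig": {
                    "maxNumberOfTrainingJobs": str(max_training_jobs),
                    "maxParallelTrainingJobs": "2",
                },
                # Illustrative search range for one tunable hyperparameter.
                "algorithmHyperParameterRanges": {
                    "integerHyperParameterRanges": [
                        {"name": "hidden_dimension", "minValue": 32, "maxValue": 256},
                    ],
                },
            },
        },
    }


def create_hpo_solution(personalize, request):
    """Create the solution and return its ARN."""
    return personalize.create_solution(**request)["solutionArn"]
```

A solution version would then be trained from the returned ARN with `create_solution_version`. Lowering `max_training_jobs` is the most direct way to cap the extra training cost.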

After the training is finished, the performance of the training model is shown in solution version metrics. A higher score (closer to 1) means better performance.

6. Create a campaign to deploy the training model, by selecting the solution version created in the previous step.

7. Apply a filter to remove unwanted items from the recommendations. For instance, we can request only promoted jobs instead of normal jobs by including the condition Items.JOB_PROMOTION_STATE IN ("Posted"), as in the expression below:

Include ItemId WHERE Items.STATE IN ("Posted") AND
Items.IS_JOB_SUSPENDED IN ("false") AND
Items.IS_COMPANY_SUSPENDED IN ("false") AND
Items.JOB_PROMOTION_STATE IN ("Posted") |
Exclude ItemId WHERE Interactions.event_type IN ("apply_job_complete")
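A filter expression like the one above can also be assembled and registered from code. The sketch below is an illustration, not our actual implementation: the helper names and the `promoted_only` switch are hypothetical, and `personalize` is assumed to be `boto3.client("personalize")`, whose `create_filter` call takes the expression as a string.

```python
def in_clause(field, values):
    """Render one condition, e.g. Items.STATE IN ("Posted")."""
    quoted = ", ".join('"{}"'.format(value) for value in values)
    return "{} IN ({})".format(field, quoted)


def build_job_filter(promoted_only=True):
    """Compose the include/exclude expression for job recommendations."""
    conditions = [
        in_clause("Items.STATE", ["Posted"]),
        in_clause("Items.IS_JOB_SUSPENDED", ["false"]),
        in_clause("Items.IS_COMPANY_SUSPENDED", ["false"]),
    ]
    if promoted_only:
        conditions.append(in_clause("Items.JOB_PROMOTION_STATE", ["Posted"]))
    include = "Include ItemId WHERE " + " AND ".join(conditions)
    # Hide jobs the user has already applied for.
    exclude = "Exclude ItemId WHERE " + in_clause(
        "Interactions.event_type", ["apply_job_complete"]
    )
    return include + " | " + exclude


def register_filter(personalize, name, dataset_group_arn, expression):
    """Create the filter in Amazon Personalize and return its ARN."""
    response = personalize.create_filter(
        name=name,
        datasetGroupArn=dataset_group_arn,
        filterExpression=expression,
    )
    return response["filterArn"]
```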

Now our training model is ready to give job recommendations for different users. When we call Amazon Personalize’s GetRecommendations API with a user ID, it returns a list of job IDs. We can test the campaign by entering an arbitrary user ID, such as 123, in the console:

It returns a recommendation ID and a list of item ids with their scores.
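The same test can be run from code. A minimal sketch, assuming `runtime` is `boto3.client("personalize-runtime")` and that the campaign and filter ARNs come from the earlier steps; the function name and the default of 25 results are illustrative choices.

```python
def recommend_job_ids(runtime, campaign_arn, user_id, filter_arn=None, num_results=25):
    """Return (recommendation_id, [(job_id, score), ...]) for one user."""
    kwargs = {
        "campaignArn": campaign_arn,
        "userId": str(user_id),  # the API expects the user ID as a string
        "numResults": num_results,
    }
    if filter_arn:
        kwargs["filterArn"] = filter_arn
    response = runtime.get_recommendations(**kwargs)
    items = [(item["itemId"], item.get("score")) for item in response["itemList"]]
    return response.get("recommendationId"), items
```

The items come back in ranked order, so the list can be passed on as is.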

Part 2: Build a data pipeline to connect with Amazon Personalize

Next, we build a data pipeline that fetches users’ job recommendations and sends their interaction data to our model. As mentioned at the beginning of the article, our goal is to recommend jobs that fit users’ interests in near real time. Hence, we need two features:

Collect users’ interaction data and send it to Amazon Personalize to update our training model. This data tells the training model what kinds of jobs each user is interested in.
Fetch a user’s job recommendations from Amazon Personalize and show them to the user when they are searching for jobs in Moovup.

Moovup’s data pipeline of Amazon Personalize

Here are the steps we have for the data pipeline:

When a user clicks or applies for a job, our website or mobile application sends an event to Firebase, which is linked with BigQuery.
Create a scheduled job with Amazon EventBridge. Every 10 minutes, it fetches user, job, and interaction data from PostgreSQL and BigQuery. Then, it sends the data to Amazon Personalize in batches of 10 records by calling the PutEvents, PutItems and PutUsers APIs.
Every two hours, Amazon Personalize automatically updates the latest model to include the new data we have sent.
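The batching in the scheduled job can be sketched as follows for PutEvents, which accepts events for one user and session per call. The row keys (`user_id`, `session_id`, `job_id`, `event_type`, `sent_at`) are hypothetical column names, not our real table layout, and `events_client` is assumed to be `boto3.client("personalize-events")` with a tracking ID from an event tracker.

```python
from collections import defaultdict

MAX_EVENTS_PER_CALL = 10  # batch size used in the pipeline above


def chunked(records, size):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send_interactions(events_client, tracking_id, interactions):
    """Send interaction rows via PutEvents and return the number of API calls."""
    # Group rows by user and session, since each call carries a single pair.
    by_session = defaultdict(list)
    for row in interactions:
        by_session[(row["user_id"], row["session_id"])].append({
            "eventType": row["event_type"],
            "itemId": row["job_id"],
            "sentAt": row["sent_at"],  # a datetime for when the event happened
        })

    calls = 0
    for (user_id, session_id), events in by_session.items():
        for batch in chunked(events, MAX_EVENTS_PER_CALL):
            events_client.put_events(
                trackingId=tracking_id,
                userId=user_id,
                sessionId=session_id,
                eventList=batch,
            )
            calls += 1
    return calls
```

PutItems and PutUsers could reuse `chunked` in the same way, with their own payload shapes.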

After a while, the user’s interaction data is added to our model in Amazon Personalize. When they search for jobs in Moovup, we go through the steps below:

Our server sends a request to fetch a list of job recommendations with their user id, through the GetRecommendations API. Then, Amazon Personalize returns a list of job ids to our server.
Our server loops through the IDs and gets the job details for each ID.
Our server sends the list of jobs with their details to the client. Now the user can see jobs that are recommended by Amazon Personalize.
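The lookup step on the server can be sketched as below. Note that this variant does one batched lookup instead of one query per ID, which is a design alternative rather than a description of our server. `fetch_jobs_by_ids` is a hypothetical data-access function that returns job dicts with an `id` key in any order.

```python
def hydrate_recommendations(job_ids, fetch_jobs_by_ids):
    """Return job details in the ranked order given by Amazon Personalize.

    IDs with no matching job (for example, a job deleted since the last
    model update) are skipped rather than raising an error.
    """
    jobs_by_id = {job["id"]: job for job in fetch_jobs_by_ids(job_ids)}
    return [jobs_by_id[job_id] for job_id in job_ids if job_id in jobs_by_id]
```

Keeping the order of `job_ids` matters because the position in the list reflects the ranking of each recommendation.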

What if a new user accesses our application for the first time? They have not clicked or applied for any job in the application yet, which means the training model does not have their interaction data. In this case, Amazon Personalize will return a list of popular jobs instead of personalized job recommendations, as mentioned in the documentation:

For new users without interactions data, recommendations are initially for only popular items

Summary

The aim of integrating Amazon Personalize into our backend is to recommend jobs that match users’ interests in near real time. We achieved this by creating a training model in Amazon Personalize and building a data pipeline that gets job recommendations from Amazon Personalize for different users. As a result, we improved the job search experience for our users and the recruitment efficiency for employers on Moovup.


Written by Alysa Chan