How We Moved Thumbtack to an Instant Matching Marketplace
By: Xing Chen
On September 26, Thumbtack announced a new feature called Instant Match. We’ve been working hard on this major shift in our marketplace, and I’m excited to discuss some of the technical challenges behind it.
The Bidding Marketplace
When you send a customer request on Thumbtack, we aim to match you with a professional (or “pro”) who is willing and able to do the job. The key to making a great match is not just finding a pro whose profile looks good, but rather finding a pro who will complete the task. The pro must be willing to respond, be available when you need them, and actually get the job done well, whether that is cleaning your house, helping you train for a marathon, or DJ’ing your wedding.
To solve this problem, Thumbtack originally structured the marketplace to encourage matches with high-intent pros. Pros are sent customer requests, and pay to bid on the jobs that they want to do. Once a pro bids, they are highly motivated to close the job and complete the transaction. Today, the majority of our marketplace uses this structure, and it works: millions of jobs are completed through Thumbtack every year.
The challenge with today’s bidding marketplace is that it is high effort for pros: it takes a lot of time, energy, and money to constantly scan through jobs, respond if they look good, and pay to try to get hired. As a result, this model fundamentally limits our supply of pros, since we only show a pro in the results if they bid on the job. So recently we asked ourselves: is there a better marketplace structure, one that removes the time and effort needed for pros to get hired on our platform and could thereby improve our marketplace supply?
Instant Match Challenges
We began to build a new marketplace with the goal of drastically shifting our supply/demand balance by removing friction for pros. We hypothesized that if we collected enough data on pro interests upfront, we might be able to select jobs for them. If we did this, it would unlock a huge benefit for customers: if we knew what jobs could be done in a market, we could match pros with customers instantly, instead of waiting for a pro to bid.
To build a system for instant matching, we first built tools to let pros tell us which customer requests they want: geographic areas, job details (e.g. what type of food they cater), calendar availability, and budget (weekly spend). For instance, we would have DJs tell us what kinds of events they do (e.g. weddings), the type of music they play (e.g. EDM), and where they work. By collecting these preferences, we were able to create an index of possible local services jobs. Then we could begin matching customer requests. We immediately ran into several challenges.
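To make the idea of a preference index concrete, here is a minimal sketch in Python. It is not Thumbtack's implementation: the `ProPreferences` and `Request` types, their field names, and the `match_cost` value are all hypothetical, chosen only to mirror the four preference kinds listed above (geo, job details, availability, budget).

```python
from dataclasses import dataclass


@dataclass
class ProPreferences:
    """Hypothetical record of what a pro told us up front."""
    pro_id: str
    category: str
    zip_codes: set               # geo areas the pro serves
    job_types: set               # e.g. {"wedding", "edm"}
    available_days: set          # ISO dates the pro is free
    weekly_budget_left: float    # remaining weekly spend


@dataclass
class Request:
    """Hypothetical customer request."""
    category: str
    zip_code: str
    job_types: set
    day: str
    match_cost: float            # assumed cost to the pro of being matched


def matches(pro, req):
    """A pro matches only if every stated preference covers the request."""
    return (
        pro.category == req.category
        and req.zip_code in pro.zip_codes
        and req.job_types <= pro.job_types      # all requested details are offered
        and req.day in pro.available_days
        and pro.weekly_budget_left >= req.match_cost
    )


def instant_match(pros, req):
    """Return the ids of all pros whose preferences cover the request."""
    return [p.pro_id for p in pros if matches(p, req)]


wedding_dj = ProPreferences("dj_1", "dj", {"94103"}, {"wedding", "edm"}, {"2017-10-07"}, 50.0)
corporate_dj = ProPreferences("dj_2", "dj", {"94103"}, {"corporate"}, {"2017-10-07"}, 50.0)
request = Request("dj", "94103", {"wedding"}, "2017-10-07", 20.0)
matched = instant_match([wedding_dj, corporate_dj], request)
```

Because no pro has to respond before a match is made, a check like this can run at request time rather than after a bid arrives.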
The “laundry problem”
Fun fact: it turns out that on average, house cleaning professionals are not really fans of doing laundry. But many house cleaning customers would in fact like their laundry done! Likewise, we see customer vs. pro preference mismatches across our categories: dog walkers don’t like to offer grooming; IKEA builders prefer not to pick up and deliver furniture; math tutors don’t like calculus, etc. Internally, we began calling this mismatch of preferences the “laundry problem” (side note: this problem is not unique to Thumbtack!).
One metric that initially made us suspect some sort of “laundry problem” was our pro capacity surplus: by August, we were using only ~25% of pros’ capacity on average, even as we left customer requests unfilled. Charts of quotes per request (QPR) showed that we were leaving plenty of matches on the table.
This problem can manifest in many different ways. The first and most obvious way is when pros and customers just want different things: for instance, imagine that DJs in San Francisco don’t supply fog machines, but customers in San Francisco want them. This is fundamentally an exploration problem: once customers and pros define their job preferences, they only see jobs that fit those preferences, and no longer have visibility into the market outside of them. Our approach is to increase market visibility, so that pros and customers can adjust their preferences, and to guide them to where there is availability in the market. For instance, in our first test, we suggested nearby results based on geographic distance: results just outside a pro’s or a customer’s geo preferences.
If a pro realizes that they are interested in these requests, they can easily add the job to their preferences and permanently expand their market coverage. Even this simple geo-based nearby results experiment resulted in a significant increase in the number of instant matches we were able to make.
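The nearby-results idea can be sketched as a distance filter. The post does not say how geo preferences are represented, so this example assumes a pro's service area is a center point plus a radius, and that "just outside" means within a fixed margin beyond that radius. The function names and the 10 km margin are illustrative assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearby_requests(pro_center, pro_radius_km, requests, margin_km=10.0):
    """Return ids of requests just outside the pro's radius, nearest first.

    `requests` is an iterable of (request_id, lat, lon) tuples.
    Requests inside the radius are skipped: those already match normally.
    """
    suggestions = []
    for request_id, lat, lon in requests:
        distance = haversine_km(pro_center[0], pro_center[1], lat, lon)
        if pro_radius_km < distance <= pro_radius_km + margin_km:
            suggestions.append((distance, request_id))
    return [request_id for _, request_id in sorted(suggestions)]


san_francisco = (37.7749, -122.4194)
open_requests = [
    ("mission", 37.7599, -122.4148),    # inside the 10 km radius
    ("berkeley", 37.8715, -122.2730),   # outside the radius, within the margin
    ("oakland", 37.8044, -122.2712),    # outside the radius, within the margin
    ("san_jose", 37.3382, -121.8863),   # too far to suggest
]
suggested = nearby_requests(san_francisco, 10.0, open_requests)
```

The same filter could be run from the customer's side, suggesting pros whose service area ends just short of the customer's location.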
Even as we make progress on the exploration problem, the larger “laundry problem” presents bigger challenges that we are just starting to dig into. For instance, consider this optimization problem: imagine that there are two pros in San Francisco who can do house cleaning, “Ben’s Cleaners” and “Chris’s Cleaners”. Ben is happy to do laundry for customers while Chris is not, and each pro can only do one cleaning job per week. Now imagine that we see two customers in a week: the first customer does not need laundry done, while the second does. If we match the first customer, who does not need laundry, with Ben, then we are out of luck: Chris cannot take the second customer’s job, since that customer needs laundry. To tackle this problem, we need to optimize our utilization of pros’ capacity against a forecast of future customer requests, which is a topic for a future blog post!
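The Ben and Chris example can be worked through in code. The sketch below compares a first-come greedy assignment with a maximum bipartite matching computed by the standard augmenting-path method (Kuhn's algorithm). It assumes the whole week's requests are known in advance, standing in for the forecast mentioned above; it is an illustration of the problem, not the approach Thumbtack uses.

```python
def greedy_assign(requests, pros, can_do):
    """Give each request, in arrival order, to the first free eligible pro."""
    taken, assignment = set(), {}
    for request in requests:
        for pro in pros:
            if pro not in taken and can_do(pro, request):
                taken.add(pro)
                assignment[request] = pro
                break
    return assignment


def max_matching(requests, pros, can_do):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm)."""
    pro_to_request = {}

    def try_assign(request, seen):
        for pro in pros:
            if can_do(pro, request) and pro not in seen:
                seen.add(pro)
                # Take a free pro, or move that pro's current request elsewhere.
                if pro not in pro_to_request or try_assign(pro_to_request[pro], seen):
                    pro_to_request[pro] = request
                    return True
        return False

    for request in requests:
        try_assign(request, set())
    return {request: pro for pro, request in pro_to_request.items()}


# One job per pro per week; only Ben does laundry.
pros = ["Ben", "Chris"]
does_laundry = {"Ben": True, "Chris": False}
requests = ["customer_1", "customer_2"]
needs_laundry = {"customer_1": False, "customer_2": True}


def can_do(pro, request):
    return does_laundry[pro] or not needs_laundry[request]


greedy = greedy_assign(requests, pros, can_do)
optimal = max_matching(requests, pros, can_do)
# greedy  == {"customer_1": "Ben"}  (customer_2 goes unmatched)
# optimal == {"customer_1": "Chris", "customer_2": "Ben"}
```

The greedy pass spends Ben on a job Chris could have done. The matching pass reassigns the first customer to Chris once it sees that the second customer needs Ben.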
In the bidding world, our matching infrastructure was built to process requests in the background and send them to a number of pros via notifications. We optimized for perceived customer latency: since pros sent bids later on, we could return a 200 to the client as soon as we had safely stored the request. Hence we built an async architecture that used Amazon’s Simple Queue Service (SQS) to hold requests until our matching service processed them. At a high level, the website stored each request and queued it in SQS; matching service workers later picked it up, selected pros, and notified them.
This gave us maximum flexibility to apply algorithms to select pros using a worker tier matching service that could take plenty of time to match requests. There were benefits: it was simple, and since we de-coupled matching from the client request, failures in matching would not affect the user experience. And since we had already stored the request, we could fix problems and retry as many times as we needed to process a request.
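The shape of that async flow can be sketched with the standard library. Here `queue.Queue` stands in for SQS and plain Python containers stand in for the request store and the notifications service. All names are hypothetical, and a real SQS consumer would handle visibility timeouts and redelivery differently.

```python
import queue

request_store = {}             # stand-in for durable request storage
match_queue = queue.Queue()    # stand-in for SQS
notifications = []             # stand-in for the notifications service


def handle_customer_request(request_id, payload):
    """Web tier: store the request, enqueue it, and return immediately."""
    request_store[request_id] = payload
    match_queue.put(request_id)
    return 200


def run_worker(match_fn, max_attempts=3):
    """Worker tier: drain the queue, retrying failed matches.

    Matching failures never reach the customer, because the request was
    already stored and acknowledged before matching started.
    """
    while not match_queue.empty():
        request_id = match_queue.get()
        for _ in range(max_attempts):
            try:
                pros = match_fn(request_store[request_id])
            except Exception:
                continue  # transient failure: try this request again
            notifications.extend((pro, request_id) for pro in pros)
            break
```

The handler never calls `match_fn`, so however slow or flaky matching is, the customer only waits for the store and enqueue steps.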
In the new Instant Match world, we want to immediately match a customer with pros who can do their job. This means that we now need to select, rank, and return pros for a given client request synchronously, instead of processing requests using async workers. Moving to this search-based system was not a trivial change. In our async workers, we had built algorithms that depended on many backend data stores and services, and did quite a bit of inefficient computation. To create a low-latency search service, we would need to change the entire architecture.
Hacking our infrastructure
Before we changed everything, we wanted to quickly test an early version of Instant Match to see if it even made sense. We hacked the following on top of the old matching architecture: instead of having the matching service workers send our matches to a notifications service to notify pros, we would just send them back to the website service to store (step 1 below). Then, the website service would serve results back to polling clients (2). This created a pretty terrible, slow search experience — but we were able to ship a test very quickly, and it gave us early data that showed that the Instant Match experience could work. That gave us confidence to really invest in our infrastructure.
Instant latency matters
Needless to say, once we realized that Instant Match could work, one of the first things we did was actually change the architecture to become synchronous and remove SQS from the picture. Rather than workers doing processing, we wanted our architecture to look more like a search system. In this picture, a customer request (1) is served by a search index (2) which returns pro candidates to rank/filter and return back to the customer synchronously (3).
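A minimal sketch of that synchronous path follows: look up candidates in an index, filter them, rank them, and return the top results within the same call. The index layout (a dict keyed by category and zip code), the field names, and the `score` callback are assumptions made for illustration.

```python
def search_pros(request, index, score, limit=3):
    """Retrieve, filter, rank, and return pros within a single request."""
    # 1. Candidate retrieval from the index by category and location.
    candidates = index.get((request["category"], request["zip_code"]), [])
    # 2. Filter out pros who do not offer the requested job type.
    eligible = [pro for pro in candidates if request["job_type"] in pro["job_types"]]
    # 3. Rank by a caller-supplied score and return the best matches.
    ranked = sorted(eligible, key=lambda pro: score(pro, request), reverse=True)
    return [pro["pro_id"] for pro in ranked[:limit]]


pro_index = {
    ("house_cleaning", "94110"): [
        {"pro_id": "ben", "job_types": {"standard", "laundry"}, "rating": 4.6},
        {"pro_id": "chris", "job_types": {"standard"}, "rating": 4.9},
    ]
}


def by_rating(pro, request):
    return pro["rating"]


standard = {"category": "house_cleaning", "zip_code": "94110", "job_type": "standard"}
laundry = {"category": "house_cleaning", "zip_code": "94110", "job_type": "laundry"}
standard_results = search_pros(standard, pro_index, by_rating)   # ["chris", "ben"]
laundry_results = search_pros(laundry, pro_index, by_rating)     # ["ben"]
```

Unlike the worker model, every step here sits on the customer's critical path, so the cost of each lookup matters.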
The first major infrastructure change was to consolidate our backend databases and services into a pro index backed by Elasticsearch (ES). This allowed us to greatly simplify our initial pro candidate retrieval based on geo and category. Previously, we had built candidate retrieval on top of DynamoDB and PostgreSQL. We would first look up pros in a DynamoDB index that mapped S2 cells to pros, then join the results with fresh data from PostgreSQL to pick up recent pro updates, and finally filter in memory. All of this joining and filtering had been built in an async world where latency didn’t matter, but in the new world, it suddenly looked really slow.
Moving to Elasticsearch gave us a major latency improvement, and also gave us more flexibility: it became much easier to change how we indexed and queried candidates (for instance, if we wanted to match based on city instead of zip code). It also involved building a system to update our index data in near real-time, which is again a subject for a future blog post!
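As an illustration of candidate retrieval by geo and category, here is a sketch that builds an Elasticsearch query body using the standard `bool`, `term`, and `geo_distance` clauses. The field names `category` and `location` (assumed to be mapped as a `geo_point`) are hypothetical. The post does not describe Thumbtack's actual index schema or queries.

```python
import json


def build_pro_query(category, lat, lon, radius_km, size=20):
    """Build a query body that filters pros by category and distance.

    Filter clauses do not contribute to relevance scoring, which suits
    yes/no eligibility checks like these.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"category": category}},
                    {
                        "geo_distance": {
                            "distance": f"{radius_km}km",
                            "location": {"lat": lat, "lon": lon},
                        }
                    },
                ]
            }
        },
    }


query = build_pro_query("dj", 37.7749, -122.4194, 15)
body = json.dumps(query)  # JSON body to send to the index's _search endpoint
```

Changing how candidates are matched, for example by city rather than by distance, would then mean swapping one filter clause rather than rewriting a multi-store join.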
After a few months of building out and tuning our search infrastructure, we started to see some major improvements in our results latency.
As we worked through the challenges above, we began to expand our new Instant Match marketplace into more categories and markets as an opt-in feature. As adoption of Instant Match grew, we started to see evidence that this was going to have a major impact on the supply/demand dynamics in our marketplace. One of the metrics we have always tracked in a market is the percentage of requests on which we can deliver at least three matching pros with quotes, since we know that having three good choices is correlated with a good customer experience. By September, the rate of 3+ quotes per request had more than doubled in markets where we had fully rolled out Instant Match.
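The metric described above is straightforward to compute from per-request quote counts. The sketch below uses made-up numbers purely to show the calculation; they are not Thumbtack data, and the function name is hypothetical.

```python
def rate_with_min_quotes(quotes_per_request, min_quotes=3):
    """Fraction of requests that received at least `min_quotes` quotes."""
    if not quotes_per_request:
        return 0.0
    hits = sum(1 for quotes in quotes_per_request if quotes >= min_quotes)
    return hits / len(quotes_per_request)


# Illustrative quote counts for four requests in one market.
example_rate = rate_with_min_quotes([0, 1, 3, 5])  # 0.5
```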
This is just the beginning of moving Thumbtack from a marketplace based on bidding to a marketplace based on Instant Matching. We’ve just started optimizing our matching algorithms and building out new customer experiences that will leverage instant matching. Join us at Thumbtack and help us build the future of local services.
Originally published at https://engineering.thumbtack.com on September 26, 2017.