Using Machine Learning to Understand How Branding in Photos Affects the Car Shopping Experience
By: Samad Patel
This blog post delves into how we answered a challenging business question using pre-trained AWS models. Our question required us to parse text from photos, then analyze the contents of that text. We used Amazon Rekognition and Amazon Comprehend to extract and classify text from photos, followed by a few highly interpretable statistical methods to analyze the data.
TrueCar is a digital automotive marketplace that makes car buying simple, fair, and fun. Shoppers can click through live inventory, see custom-tailored photos of the cars they’re interested in, and contact dealers with a few clicks. The custom-tailored photos are the shopper’s window into what might be their next car, so first impressions matter. Some dealerships choose to brand their image by adding a personal touch to the photograph, such as their dealership name and contact information.
If a shopper sees branding, they may choose to contact the dealership on their own some other time, or they may choose to unlock more information about the vehicle through TrueCar and then choose how they want to proceed with the dealer. Therefore, we’re interested in analyzing how shoppers respond to branding in photos; specifically, we’re looking to see if there are any inherent differences between how shoppers behave with dealerships who brand vs those who don’t. If we understand these differences, we can enhance the shopping experience by working with dealerships to provide shoppers with more of what they want.
Our problem statement is to understand whether shoppers shop differently with dealers that brand and dealers that do not. To see if there is any difference, we decided to measure the difference in sales between these two groups. We’ll specify precisely how we used sales metrics in the Measuring Sales Impact section. But in order to tackle this problem, we needed to figure out which dealers brand their inventory images in the first place.
Data Collection with AWS Data Lab
We have terabytes of dealer inventory images in S3, so if there was a service that could take dealer images as input and then determine if there’s branding as output, then we could aggregate branding metrics at the dealer-level. This is where the AWS Data Lab came in. We approached them with this problem, and with their consultation settled upon a pipeline that could do just that.
Below are the steps we took to process 50 million photos for this analysis:
- A photo is read from S3 to be passed to Rekognition.
- Rekognition parses any text that exists in the photo.
- If Rekognition finds any text, that text is passed to Comprehend.
- Comprehend classifies the dealer information (such as name, phone number, etc.) the text contains, if any.
- The response data from both services is saved to a DynamoDB table for every photo.
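As a rough sketch, the per-photo steps above might look like the following in Python with boto3. The function names, the branding heuristic, and the entity types it checks are illustrative assumptions, not the production pipeline:

```python
def detect_photo_text(bucket, key):
    """Steps 1-2: ask Rekognition for any text in an S3 photo."""
    import boto3  # AWS SDK; imported lazily so the pure helper below needs no AWS setup
    rekognition = boto3.client("rekognition")
    resp = rekognition.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Rekognition returns both per-line and per-word detections; keep whole lines.
    return [d["DetectedText"] for d in resp["TextDetections"] if d["Type"] == "LINE"]


def classify_text(text):
    """Steps 3-4: ask Comprehend which entities the detected text contains."""
    import boto3
    comprehend = boto3.client("comprehend")
    resp = comprehend.detect_entities(Text=text, LanguageCode="en")
    return [(e["Type"], e["Text"]) for e in resp["Entities"]]


def looks_like_branding(entities):
    """A hypothetical downstream heuristic: do the entities suggest dealer
    branding, i.e. a business name or contact information?"""
    return any(etype in ("ORGANIZATION", "OTHER") for etype, _ in entities)
```

In production, the raw responses from both services would also be written to DynamoDB for every photo (e.g. via a table's `put_item`), which is omitted here.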
After processing all 50 million photos, we aggregated the data to a usable format for our analysis. We were primarily interested in two statistics: how many photos each dealer branded, and how many of their vehicles had at least one branded image.
Why did we care about how many vehicles had at least one branded image? When a shopper looks at a specific vehicle, they can see all the photos of the vehicle the dealer has uploaded. Our hypothesis is that if dealer branding drives shoppers to contact the dealer, the number of branded photos per vehicle may not matter: as long as a dealer brands at least one image per vehicle, their contact information appears on every vehicle page we display. A shopper is guaranteed to see it and can decide how to proceed from there.
| Dealer | Total Photos | Total Vehicles | Branded Photos | Branded Vehicles Count |
|--------|--------------|----------------|----------------|------------------------|
| Dealer A | 100 | 15 | 15 | 15 |
| Dealer B | 100 | 15 | 15 | 5 |

Consider the example data above. Both dealers have 100 photos covering an inventory of 15 cars, and both have a total of 15 images branded, meaning they brand the same proportion of their photos. However, their Branded Vehicles Count values differ. Every single vehicle for Dealer A has a branded photo, so any shopper who views that dealer's inventory is bound to see a branded image. Dealer B, on the other hand, only branded images for 5 of its 15 vehicles, so shoppers are less likely to see a branded image when browsing their inventory.
In other words, though both dealers branded the same number of photos, the patterns behind which photos they chose to brand tell a very different story.
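Starting from one record per photo, both statistics can be computed with a small aggregation. Here is a minimal sketch; the `(dealer, vehicle, is_branded)` record layout is an assumption for illustration, not the actual DynamoDB schema:

```python
from collections import defaultdict

def aggregate_branding(photo_records):
    """Aggregate per-photo results to the dealer level.

    photo_records: iterable of (dealer_id, vehicle_id, is_branded) tuples,
    one per photo. Returns {dealer_id: (branded_photo_count,
    branded_vehicle_count)} for dealers with at least one branded photo.
    """
    branded_photos = defaultdict(int)
    branded_vehicles = defaultdict(set)
    for dealer, vehicle, branded in photo_records:
        if branded:
            branded_photos[dealer] += 1
            branded_vehicles[dealer].add(vehicle)  # a vehicle counts once
    return {d: (branded_photos[d], len(branded_vehicles[d]))
            for d in branded_photos}
```

Counting vehicles with a set (rather than incrementing per photo) is what distinguishes the two statistics: a dealer that brands every photo of one vehicle still has a branded vehicle count of one.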
With the data properly aggregated, we were poised to measure the impact of branding on sales.
Measuring Sales Impact
Instead of using sales directly, we chose a metric that contains more information. We have a model (called the Franchise Pricing Tool, or FPT) that predicts how many sales dealerships will have. Naturally each prediction has some error, expressed as prediction error = actual sales − predicted sales. This metric encodes both the sales themselves and our expectation of what those sales would be, and those expectations guide many business decisions. Therefore, we pulled FPT's prediction errors for the same dealerships we had branding data for.
We looked to see if there was any relationship between branding and our prediction errors with two basic approaches: a regression analysis, and a t-test.
Linear regression is an approach to modeling the relationship between a response variable and independent variables. In this case, we used linear regression to measure how the prediction error changes relative to the branding features we generated from the Data Lab. For example, if the prediction error increases as the number of branded photos increases, then linear regression can help us quantify precisely how the branded photos affect the response. Overall, regression could inform us about which branding features, if any, are predictive of the response.
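To illustrate the approach (on synthetic data, not our real prediction errors), an ordinary least squares fit of prediction error against two branding features can be sketched with NumPy. The feature ranges and noise level are made-up assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical per-dealer branding features.
branded_photos = rng.integers(0, 50, n).astype(float)
branded_vehicles = rng.integers(0, 15, n).astype(float)

# Synthetic response: prediction error that is pure noise, i.e. a world
# where branding explains none of the variance.
pred_error = rng.normal(0.0, 3.0, n)

# Ordinary least squares: intercept plus the two branding features.
X = np.column_stack([np.ones(n), branded_photos, branded_vehicles])
coef, *_ = np.linalg.lstsq(X, pred_error, rcond=None)
# With no real relationship, both fitted slopes land near zero.
```

In practice a statistics package that also reports p-values per coefficient (e.g. statsmodels' OLS summary) is what tells you whether a slope is significant, which is the question we cared about.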
Additionally, if branding features show a strong relationship with sales, that would inform us that we could use this pipeline with AWS to engineer features to improve FPT.
However, in this case, we found that none of the branding features were statistically significant in explaining the variance we saw in the prediction error. This was an indication that there was no notable difference between the two groups.
T-tests can be used to check whether there is a significant difference in the means of two groups. In order to run this test, we had to explicitly split our dealers into those who brand their images versus those who don't. We noted that dealers can brand their photos to different extents, so it's not a strictly Boolean category. We called dealerships "High Branders" if they had 2 or more vehicles branded; a dealership was a "Low Brander" otherwise.
In the image below, we can see the distributions of prediction errors (labeled "Sales Delta" on the X-axis). As you'd expect, they're centered around 0, and the two distributions look roughly the same; from the eye test alone, there's no substantial difference. When you calculate the mean of each group, however, the mean for "High Branders" is 2.7 sales, and for "Low Branders" it's 0.2.
The t-test results suggested that this difference in means of 2.7 − 0.2 = 2.5 sales actually was statistically significant! It's a relatively small difference, but it's still evidence that we're underpredicting sales for High Branders. Our new hypothesis is that dealers who brand their photos put more time and care into how they present their inventory to customers, which likely carries some signal about the overall experience they offer and, therefore, their likelihood of outselling less meticulous competitors.
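On synthetic data with the same group means (2.7 vs. 0.2), a hand-rolled Welch's t statistic shows how a small mean difference can still be statistically significant when the samples are large. The group sizes and standard deviation below are made-up assumptions, not our actual data:

```python
import numpy as np

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

rng = np.random.default_rng(42)
# Hypothetical "Sales Delta" (prediction error) samples per group.
high_branders = rng.normal(2.7, 10.0, 2000)
low_branders = rng.normal(0.2, 10.0, 2000)

t_stat = welch_t(high_branders, low_branders)
# With samples this large, |t| > 1.96 is significant at the 5% level.
```

In practice `scipy.stats.ttest_ind` (with `equal_var=False`) computes the same statistic along with a p-value; the sketch above just makes the arithmetic visible.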
In this blog post, we demonstrated how the AWS Data Lab provided us with the exposure and guidance to work with pre-trained AWS models to generate data that answered a challenging business question. We combined that data with one of our in-house models, and performed highly interpretable statistical tests to approach the question from different angles. Ultimately, both approaches pointed to the same takeaway: dealer branding in the photos shoppers see has, at most, a small effect on shopper behavior.