Leveraging ML techniques for Used Car Pricing @ CARS24

Naresh Mehta
CARS24 Data Science Blog
5 min read · Feb 13, 2019

Pricing Data Scientists at CARS24 work on huge inspection & auction data points to iteratively enhance used car price predictability

https://www.cars24.com/

Until the late 90s, very few Indians could afford a car, and those who did treated it like an extended member of the family. Selling the family car was not really an option unless it was ready to be scrapped. Likewise, buying a used car was rare and rather frowned upon.

Cut to today: cars are as ubiquitous in India as the potholes on the roads they are driven on. Buying or selling a used car is no longer taboo. People are upgrading much faster, selling off existing cars (~4 years being the most likely age at which a car is resold) and buying either premium used cars or brand new ones. This change can be attributed to a burgeoning middle class with ever-increasing disposable income and higher mobility (relocating across cities and countries).

As per industry estimates for India, 1.25 used cars were transacted for every new car transacted in 2018. In other words, used cars accounted for ~55% of the total 7.5 million car transactions in India. As per forecasts, used cars will account for ~75% of total transactions in India by 2025 (as is the case in the US & other mature markets), more than doubling from the current ~4.2 million to ~9 million.
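The arithmetic behind these figures can be checked in a few lines. This is only a sanity-check sketch on the numbers quoted above; the implied 2025 total is derived from the forecast figures and is not an estimate stated in the source.

```python
# Figures quoted in the text (2018, India).
used_per_new = 1.25      # used cars transacted per new car
total_millions = 7.5     # total car transactions, in millions

# Share of used cars in all transactions: 1.25 out of every 2.25 cars.
used_share = used_per_new / (1 + used_per_new)
used_millions = total_millions * used_share

# 2025 forecast quoted in the text: ~9 million used cars at ~75% share.
forecast_used_millions = 9.0
forecast_used_share = 0.75
# Derived here, not stated in the source: the total market this implies.
implied_total_2025 = forecast_used_millions / forecast_used_share
growth_multiple = forecast_used_millions / used_millions

print(f"Used share 2018: {used_share:.1%}")
print(f"Used cars 2018: {used_millions:.2f} million")
print(f"Implied total 2025: {implied_total_2025:.1f} million")
print(f"Used car growth multiple: {growth_multiple:.2f}x")
```

The ratio works out to roughly 55.6% and about 4.17 million used cars, consistent with the rounded ~55% and ~4.2 million above.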

However, despite the rapidly increasing demand and supply in the used car ecosystem, there are still massive information asymmetries in this largely unorganized vertical in India, making buying and selling a fairly painful and inefficient process for buyers and sellers alike.

Inefficiencies of the Used Car Ecosystem

Unlike new cars, where price and supply are fairly deterministic and managed by OEMs (except for dealership-level discounts, which come into play only in the last stage of the customer journey), used cars are very different beasts, with huge uncertainty in both pricing and supply.

Mike Baldwin perfectly capturing the uncertainties around used cars!

The value at which a used car gets transacted in the unorganized ecosystem boils down entirely to the awareness, price sensitivity, patience and negotiation skills of the different stakeholders in the transaction chain, and often has very little to do with the true price a given vehicle deserves.

The same is true of supply: a prospective seller today may not be as keen on selling his/her car a few days later, and vice versa. Not to mention the huge geo-arbitrage, e.g. demand for a used Honda City could significantly over-index in Outer Delhi while there is a supply surplus in South Delhi.

We at CARS24 are trying to solve for these inefficiencies in the used car ecosystem by leveraging a real-time digital auctioning & transaction platform; customer-focused retail fronts; a state-of-the-art inspection approach; and advanced ML techniques in pricing & buyer-seller matching algorithms.

Pricing Model — Steady improvement with advanced algorithms and increasing data points

Having transacted ~135K cars and inspected & auctioned ~550K cars since our inception about 3.5 years ago, we have access to a huge training data set for relating the transacted price (the point where both buyer & seller agreed on a price) to a large set of predictors. These range from obvious ones like ex-showroom price of the variant, age, odometer reading, internal & external condition, and engine, transmission & suspension quality; to more nuanced ones like expected maintenance cost over time, degree of documentation (insurance, loan NOC, road tax), and duplicate key availability; to completely external factors like the localized demand-supply gap, regulatory factors (e.g. the Delhi-NCR region has banned diesel cars above 10 years of age), and the introduction of new variants or a facelift for a given model. You get the drift: in a nutshell, every used car is an individual SKU with a unique price at a given point in time.

To make this clearer, refer to the image below, which shows the huge percentage deviation of recently transacted prices for different make-model-variant-year cars from the historical mean value of the same combination. Our pricing models are essentially trying to explain this deviation through all the available inspection and auction parameters.

Huge deviation of transacted price for any given make-model-variant-year from historical mean value of same combination
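As a rough illustration of the deviation described above, the following minimal sketch groups past transactions by make-model-variant-year, takes the historical mean of each group, and expresses recent transacted prices as a percentage deviation from that mean. All rows, prices and function names here are hypothetical and invented for illustration; this is not the actual CARS24 data or pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (make, model, variant, year, transacted price in INR).
historical = [
    ("Honda", "City", "VX", 2014, 500_000),
    ("Honda", "City", "VX", 2014, 540_000),
    ("Honda", "City", "VX", 2014, 460_000),
    ("Maruti", "Swift", "VDI", 2015, 400_000),
    ("Maruti", "Swift", "VDI", 2015, 420_000),
]
recent = [
    ("Honda", "City", "VX", 2014, 575_000),
    ("Maruti", "Swift", "VDI", 2015, 369_000),
]

def historical_means(rows):
    """Mean transacted price per (make, model, variant, year) combination."""
    prices = defaultdict(list)
    for make, model, variant, year, price in rows:
        prices[(make, model, variant, year)].append(price)
    return {key: mean(values) for key, values in prices.items()}

def pct_deviation(rows, means):
    """Percentage deviation of each price from its combination's mean."""
    result = []
    for make, model, variant, year, price in rows:
        key = (make, model, variant, year)
        base = means[key]
        result.append((key, 100.0 * (price - base) / base))
    return result

means = historical_means(historical)
deviations = pct_deviation(recent, means)
for key, dev in deviations:
    print(key, f"{dev:+.1f}%")
```

In this toy data the Honda City sells 15% above its historical mean and the Swift 10% below. A pricing model's job, in these terms, is to explain such gaps from the inspection and auction parameters instead of treating them as noise.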

Every car we inspect is rated on ~150 attributes using our state-of-the-art inspection app, and we also collect ~40 images per inspection (images with labelled ratings; one can imagine the spark in the eyes of computer vision enthusiasts reading this). All this information is logged in our database and also made available as a concise inspection report to potential buyers during auctioning.

We started our price modelling journey with R-based lasso regression models and eventually moved to Python-based boosted trees (XGBoost), achieving high accuracy in our predictions: for ~50% of transactions the predicted price ends up within +/-5% of the actual price, and for ~95% within +/-10%.

Significant improvement in pricing accuracy as we moved to feature-rich XGBoost in Jan’19
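The accuracy bands quoted above can be computed with a simple helper like the one below. This is a minimal sketch under one stated assumption: the error is measured relative to the actual transacted price (the post does not say which price is the denominator). The function name and the sample prices are hypothetical.

```python
def share_within(actual, predicted, tolerance):
    """Fraction of cars whose absolute percentage error is <= tolerance.

    Assumption: error is taken relative to the actual transacted price.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal length")
    hits = sum(abs(p - a) / a <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)

# Hypothetical transacted vs predicted prices (INR).
actual = [400_000, 500_000, 250_000, 800_000]
predicted = [410_000, 540_000, 300_000, 790_000]

within_5 = share_within(actual, predicted, 0.05)
within_10 = share_within(actual, predicted, 0.10)
print(f"within +/-5%: {within_5:.0%}, within +/-10%: {within_10:.0%}")
```

Tracking both bands side by side is useful because the tighter band reflects typical precision while the wider band shows how often the model misses badly.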

Later this month, we will share a technical blog capturing details around our pricing models — stay tuned.

What Next?

Pricing is a critical component of our business, one that enables us to be more aggressive and experimental with our business models while improving the experience for buyers and sellers alike. We at CARS24 are on a mission to reshape the industry, address information asymmetries and make it a level playing field for everyone.

These are still very early stages for us, and there is huge headroom left to improve our models in solving this remarkably complex problem. We will continue to enhance our pricing models with more features and newer algorithms, e.g. deep learning models on images to augment the parameters coming from the inspection app.

Acknowledgement

I want to thank pricing data scientists Atish Jain and Mridul Arora for their tireless efforts in bringing us to the current level. I also want to acknowledge the contributions of Jeetesh Agrawal, Shashank Kumar and Prakhar Jain and their valuable inputs over the last 6 months. Special thanks to our Co-founder & COO Mehul and CTO Marut for their unwavering support and feedback throughout this project.

We are always seeking outstanding people to join our data science team working across complex problems like Pricing, Recommendation Engines, Auction Scheduling/Strategy, Marketing Optimization, Retail/Sales efficiency projects and Risk models for our lending vertical.

Please reach out to me directly at naresh.mehta@cars24.com or drop a mail to datascience@cars24.com or hiring@cars24.com for more details.

Naresh Mehta
VP, Data & Strategy @ Cars24 | Ex Zomato, ZS Associates, dunnhumby | IIT Madras