VroomVroom: Sell Your Car With Data

Sushant Gadgil
May 8, 2022
VroomVroom’s Hypothetical Logo

Using data collected from over 3 million car sales, VroomVroom allows car owners to get the best price for their vehicle in a time period that works for them!

A screenshot of VroomVroom’s UI

Introduction

Selling your car can be a daunting task. A maze of dealers, brokers, and websites purport to give you the best price on your vehicle, only to turn around and resell it for 30, 40, or even 50% more. Many websites offer a value for your car and claim to be the single source of truth, but their valuations are based on data from decades ago, originally printed in a little blue book and sold to dealers to evaluate cars. While buyers usually have a trove of information at hand, sellers rarely get the ability to fight for the value of their vehicles. As a final project in data analysis for decision making, part of the Systems Engineering graduate course SYSEN 5160 at Cornell University, VroomVroom decided to use data to change that.

5 Second Car Description

Using data from over 3 million car sales published as an open source dataset on Kaggle [1], we have developed an algorithm that tailors historical data to a user with minimal inputs. The tool will prompt the user for the Make, Model, Year, and Mileage of their vehicle and will ask them to confirm their zip code.

With this 5 second input, the algorithm will pull an example image of the vehicle and prompt the user to confirm that the image matches their type of car. Behind the scenes we will also confirm the geographic location of the input zip code and throw a warning to the user if they’ve input an invalid zip code.
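As a rough illustration of the zip code check, here is a minimal Python sketch. The lookup table `ZIP_CENTROIDS`, its approximate coordinates, and the function names `locate_zip` and `confirm_zip` are all hypothetical and are not taken from the VroomVroom code. A real tool would load a complete zip-to-location file instead of two hard-coded entries.

```python
import re

# Hypothetical lookup table with approximate coordinates (assumption for
# illustration only; a real tool would load a full zip code dataset).
ZIP_CENTROIDS = {
    "14850": (42.44, -76.50),  # Ithaca, NY (approximate)
    "10001": (40.75, -73.99),  # New York, NY (approximate)
}

FIVE_DIGITS = re.compile(r"\d{5}")


def locate_zip(zip_code, table=ZIP_CENTROIDS):
    """Return (lat, lon) for a known 5-digit zip code, or None otherwise."""
    cleaned = zip_code.strip()
    if not FIVE_DIGITS.fullmatch(cleaned):
        return None  # wrong shape: not exactly five digits
    return table.get(cleaned)  # None when the zip code is not in the table


def confirm_zip(zip_code):
    """Return a confirmation message, or a warning for an invalid zip code."""
    if locate_zip(zip_code) is None:
        return f"Warning: '{zip_code}' is not a valid zip code."
    return f"Zip code {zip_code.strip()} confirmed."
```

Checking the format first and the lookup second means a mistyped entry and a well-formed but unknown zip code both produce the same warning for the user.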

Once all information is confirmed to be correct, the tool will be ready to calculate the best sale posting to meet the user’s preferences.

Defining a Successful Sale

Not everyone has the same priorities when it comes to how they want to sell their car. You may want to sell your old Toyota Camry as quickly as possible and care most about getting a fast sale. Your neighbor, on the other hand, may be happy leaving their car on the street a few extra months if it means being able to sell their car for a higher price. To accommodate the range of user preferences, we’ve built in buttons to allow selection of the user’s preference for the sale.

Let’s take the above example of a 2019 Subaru Forester. To get the best price, we would select ‘Get Best Price’ as prompted.

Our algorithm would then expand the search radius to 50 miles around your input zip code in order to model the greatest chance of selling your car at a high price. In this case, the code suggests we list the car for $37,491 and post the listing in July.
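One way to picture the search radius is a great-circle distance filter over past sales. The sketch below uses the haversine formula. The record fields `lat` and `lon` and the function names are assumptions made for illustration, not the project's actual schema or implementation.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(sales, origin, radius_miles):
    """Keep only the sales located within radius_miles of origin (lat, lon)."""
    lat0, lon0 = origin
    return [
        sale for sale in sales
        if haversine_miles(lat0, lon0, sale["lat"], sale["lon"]) <= radius_miles
    ]
```

With a helper like this, switching between sale preferences comes down to passing a different `radius_miles` value.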

Now, let’s say it’s May and we want to sell the car as quickly as possible. For this we select ‘Fast Sale’.

To model a realistic price for a fast sale, our algorithm will then reduce the search radius to 30 miles around your input zip code and will filter the training data to look only at sales from the current month or the following 2 months. This helps find the most realistic price estimate for a vehicle that needs to sell within a short timeframe.
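The month filter can be sketched as a small helper, assuming the window means the current calendar month plus the two months after it. The `month` field (1 to 12) and the function names are hypothetical and chosen only for this example.

```python
def month_window(current_month, months_ahead=2):
    """Months (1-12) from current_month through months_ahead later.

    The window wraps past December, so November yields 11, 12, 1.
    """
    return [((current_month - 1 + k) % 12) + 1 for k in range(months_ahead + 1)]


def filter_by_month(sales, current_month, months_ahead=2):
    """Keep only the sales whose month falls inside the window."""
    allowed = set(month_window(current_month, months_ahead))
    return [sale for sale in sales if sale["month"] in allowed]
```

Wrapping with modular arithmetic keeps the window correct at the end of the year, where a naive `current_month + 2` would produce month 13 or 14.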

In our example, the algorithm will reduce the suggested listing price to $35,908 and move up the listing date to June.

Results

The resulting project successfully confirms the geographic location the user has input and returns accurate images of their vehicle by Make, Model, and Year. The algorithm then quickly runs a linear regression on the sale data matching their inputs and returns a suggested listing price and listing month every time.
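To make the regression step concrete, here is a minimal sketch of a one-variable least-squares fit of price against mileage, written in plain Python. Which features the actual model uses is not stated here, so the choice of mileage as the only predictor is an assumption, as are the field names `mileage`, `price`, and `month`. Picking the listing month by highest average sale price is likewise a stand-in heuristic, not the project's confirmed method.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def suggest_listing(sales, mileage):
    """Return (suggested price, suggested month) for a car with this mileage."""
    slope, intercept = fit_line(
        [sale["mileage"] for sale in sales],
        [sale["price"] for sale in sales],
    )
    price = slope * mileage + intercept

    # Assumed heuristic: choose the month with the highest average sale price.
    prices_by_month = defaultdict(list)
    for sale in sales:
        prices_by_month[sale["month"]].append(sale["price"])
    best_month = max(
        prices_by_month,
        key=lambda m: sum(prices_by_month[m]) / len(prices_by_month[m]),
    )
    return round(price), best_month
```

In practice the `sales` list passed in would already be narrowed to the matching Make, Model, and Year, and to the radius and months implied by the user's sale preference.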

The next steps will be to test the model and improve its accuracy. One feature that could be added is the option for users to input more details about their vehicle, like paint color or vehicle condition.

On reviewing additional used car data from a dataset scraped from Craigslist [2], paint color appears to be a strong indicator of the expected listing price for colors like gray or green.

The algorithm could even be developed to suggest what language a person should use to describe the condition of their vehicle in their posting. In the same Craigslist dataset [2], small variations in how the condition of a vehicle is described, such as ‘excellent’ versus ‘like new’, may change the price the user could expect for the sale.

Try out the model next time you’re selling your used car and let us know what you think!

References

[1] Kaggle US Used Car Dataset: https://www.kaggle.com/datasets/ananaymital/us-used-cars-dataset

[2] Craigslist Dataset: https://www.kaggle.com/austinreese/craigslist-carstrucks-data
