Using Python to Buy a Used Car

How I used Python to *almost* buy a used car

Nadya DeBeers
7 min read · Jan 28, 2023

· The inspiration
· The data
· Analysis with Python
· Visualisations
Car make vs. price
Car model vs. price
Car year vs. price
Car mileage vs. price
Car year vs. mileage
· The result
· Conclusion

The inspiration

A few months back my car's transmission died and I suddenly found myself in the market for a used car. To make matters a bit more dramatic, my grandparents were coming to visit from overseas in four weeks' time and I needed to quickly secure a way to drive them around for sightseeing. I promptly hopped onto Facebook Marketplace, Gumtree, and the like, and started bookmarking hundreds of used cars.

After what felt like days of staring at the computer screen during every spare moment, I found myself completely overwhelmed by the number of options. I didn’t know how to approach the search efficiently and I felt like I was scrolling endlessly. That’s when I decided that something needed to change.

I thought to myself, if I could just compare the cars that I’d been bookmarking using some kind of visualisation, I would have a much better understanding of what a “good deal” or “good value” used car looked like. That’s when I realised I could apply some very basic Excel and Python to make my search easier.

The data

Ideally, I would have used web scraping to collect the details of my bookmarked cars quickly. When I tried this, I could not seem to scrape my Facebook Marketplace saved items URL. For the sake of time, I decided to enter the details manually into an Excel spreadsheet. This process took about 15 minutes.

When bookmarking cars, I generally only saved cars that were:

  1. Newer than 2004
  2. Either Toyota or Hyundai make
  3. Less than $13,000
  4. Clean interior and exterior

The bookmarked cars had already passed my own needs and quality checks, so I only needed to record the following: price (AUD), year, mileage (km), make, and model.

Here is a snapshot of the data:

Analysis with Python

In order to analyse and visualise the data, I used Pandas, Matplotlib, and Seaborn.

I started by creating a data frame of the data and doing a simple describe to understand the data holistically.
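As a rough sketch of that first step, the data frame and summary could be produced as below. The column names (`price`, `year`, `mileage`, `make`, `model`) are my assumption, and the rows are placeholders for illustration only (two of them are loosely based on cars discussed later in this article), not the real spreadsheet. With the real data, the frame would more likely come from `pd.read_excel` pointed at the spreadsheet file.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
# Column names are assumed, not taken from the original spreadsheet.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],          # AUD
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],  # km
        "make": ["Toyota", "Hyundai", "Hyundai", "Hyundai", "Toyota"],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)

# describe() summarises the numeric columns (count, mean, std, min, quartiles, max).
summary = cars.describe()
print(summary.loc[["mean", "min", "max"]])
```

The `mean`, `min`, and `max` rows are the ones the rest of this section leans on.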

It appears that the average car in our dataset is a 2008 model with about 144,000 km on it, priced at about $8,700. We can keep these averages in mind when looking through our data to see whether a car sits above or below any of them.

We can also combine minimum and maximum values to understand what the “ideal” car would look like from our set. The ideal car would be a 2014 with 35,000kms for $2000. This is obviously unrealistic, but we can keep these ideal values in the back of our mind when looking at our visualisations.
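One way to assemble that "ideal" car is to take a different aggregate per column: the newest year, the lowest mileage, and the lowest price. The sketch below assumes the same hypothetical column names and uses placeholder rows rather than the real data, so the numbers it prints will not match the ones quoted above.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],
    }
)

# Best value of each column taken independently; no single car has to match all three.
ideal = cars.agg({"year": "max", "mileage": "min", "price": "min"})
print(ideal)
```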

Visualisations

After looking at the data, I began creating visualisations on multiple variables in order to understand what a “good value” car looked like. Here are some of the visualisations that I found useful:

Car make vs. price

In this visual we are comparing the average price of cars per make, namely Toyota and Hyundai.

We have to keep in mind that this view ignores the variables that are not shown, like year and mileage. However, it gives us a quick understanding of the prices that we can expect for the two makes in our dataset: on average, the Hyundais are more expensive than the Toyotas.
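A chart like this can be sketched by averaging the price per group and handing the result to Seaborn. The helper name `mean_price_by`, the column names, and the rows are all hypothetical; this is one plausible way to build the plot, not necessarily how the original figure was made.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "make": ["Toyota", "Hyundai", "Hyundai", "Hyundai", "Toyota"],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)


def mean_price_by(df, column):
    """Return the mean price for each value of `column`, cheapest first."""
    return df.groupby(column, as_index=False)["price"].mean().sort_values("price")


avg_by_make = mean_price_by(cars, "make")

fig, ax = plt.subplots()
sns.barplot(data=avg_by_make, x="make", y="price", ax=ax)
ax.set_ylabel("Mean price (AUD)")
ax.set_title("Average price per make")
# Use plt.show() in an interactive session, or fig.savefig(...) to write the chart to disk.
```

Calling `mean_price_by(cars, "model")` gives the per-model averages in the same way.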

Car model vs. price

In this visual we are comparing the average price of cars per model.

Again, we have to keep in mind that this view ignores the variables that are not shown, like year and mileage. However, it gives us a quick understanding of the price that we can expect for each car model in our dataset: Priuses are the cheapest models while i30s are the most expensive.

Car year vs. price

In this visual we can see the year of the cars versus the price, with the hue set to the car model.

As you can see, there is a bit of an upward trend, which is to be expected: the newer the car, the higher the price. In this visual a good value car would be a newer car at a lower price. For example, the 2011 i30 priced between $6,000 and $8,000 looks like it could be a good deal.

We can find this car in our dataset to see the missing details, like mileage, and judge whether it is in fact good value. Here are the missing details about this car:
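A scatterplot of this kind takes a single Seaborn call, with `hue` mapping each model to its own colour. Again this is a sketch on placeholder rows with assumed column names, not the original figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)

fig, ax = plt.subplots()
# One point per car; colour encodes the model.
sns.scatterplot(data=cars, x="year", y="price", hue="model", ax=ax)
ax.set_title("Car year vs. price")
# Use plt.show() in an interactive session, or fig.savefig(...) to write the chart to disk.
```

Swapping the `x`, `y`, and `hue` columns gives the other scatterplots in this article.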
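Looking a car up from a point on the plot comes down to a boolean filter on the data frame. The sketch below uses placeholder rows and assumed column names; only the 2011 i30 row is based on the car described here.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],
        "make": ["Toyota", "Hyundai", "Hyundai", "Hyundai", "Toyota"],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)

# Combine the conditions read off the plot: a 2011 i30 priced at $6,000 to $8,000.
mask = (
    (cars["year"] == 2011)
    & (cars["model"] == "i30")
    & cars["price"].between(6000, 8000)  # inclusive on both ends
)
match = cars[mask]
print(match)
```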

This car has 188,000kms on it which is higher than I would like, but given the price point and the year, it could be a good option.

Here is another version of the same information as a bar plot as opposed to a scatterplot:

In this visual we can see that the i30 and Yaris models make up the majority of the newer cars in the dataset. This could imply that newer cars of the other models were priced too high and therefore didn’t make the cut when I was bookmarking.

The top amount that I wanted to spend on a used car was $10,000, so I made the top amount in my search $13,000 so that I could try to negotiate if I found a car that I liked at this price point.

Car mileage vs. price

In this visual we can see the mileage of the cars versus the price, with the hue set to the car year.

As you can see, there is a bit of a downward trend, which is to be expected: the more mileage a car has, the lower the price. In this visual a good value car would be a relatively new car with low mileage at a lower price. For example, the 2010 car at around $8,500 with about 70,000 km looks like it could be good value.

We can again find this car in our dataset to see the missing details and judge whether it is in fact good value. Here are the missing details about this car:

We can see that this is a Toyota Yaris. All the details of this car are favourable and lead me to believe that this is in fact good value and could be the right car for me.
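The trends can also be given a number, which the original analysis did not do: the Pearson correlation of each variable with price should come out negative for mileage and positive for year. The sketch below uses placeholder rows and assumed column names, so the exact values mean nothing; only the signs are of interest.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],
    }
)

# Pearson correlation of every numeric column with price.
trend = cars[["year", "mileage", "price"]].corr()["price"]
print(trend)
```

With only a handful of hand-picked cars, a correlation like this is a sanity check on the plot rather than evidence of anything.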

Car year vs. mileage

In this visual we can see the year of the cars versus the mileage, with the hue set to price.

We would expect cars with low mileage that are newer to be the most expensive, so any cars of a lighter hue in the bottom right of the plot would be the best value. The 2008–2010 cars with less than 110,000kms for a price between $8000–10000 look to be good value.

Here are the car options that exist within these parameters:
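The ranges read off the plot translate directly into a filter. This is a sketch on placeholder rows with assumed column names; the thresholds are the ones quoted above.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)

# 2008 to 2010, under 110,000 km, priced at $8,000 to $10,000.
mask = (
    cars["year"].between(2008, 2010)
    & (cars["mileage"] < 110_000)
    & cars["price"].between(8_000, 10_000)
)
shortlist = cars[mask]
print(shortlist)
```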

The 2010 Yaris from above is appearing again. Since the same car has been seen twice in our good-value-visuals, I could see this being a good option.

The result

After cross-referencing all the best deals in a Venn diagram of sorts, I came to the conclusion that the following parameters make for the best value used car:

Price: $8000–10,000

Year: 2008 and above

Mileage: 120,000kms and below

Make: Toyota or Hyundai

Model: Yaris or i30

We can have a look at the cars from our dataset that meet these criteria:
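The full set of criteria can be written as a single `DataFrame.query` expression, treating Toyota and Hyundai as the makes and Yaris and i30 as the models. As before, the rows are placeholders and the column names are assumed.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT the original dataset.
cars = pd.DataFrame(
    {
        "price": [8500, 7000, 9900, 12000, 5500],
        "year": [2010, 2011, 2012, 2014, 2006],
        "mileage": [70000, 188000, 95000, 60000, 210000],
        "make": ["Toyota", "Hyundai", "Hyundai", "Hyundai", "Toyota"],
        "model": ["Yaris", "i30", "i30", "i30", "Prius"],
    }
)

# All five "best value" criteria in one readable expression.
best_value = cars.query(
    "8000 <= price <= 10000 and year >= 2008 and mileage <= 120000 "
    'and make in ["Toyota", "Hyundai"] and model in ["Yaris", "i30"]'
)
print(best_value)
```

Keeping the criteria in one string makes them easy to adjust and to copy across into the search filters on a listing site.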

At this point in my search, most of the above cars were off the market. However, rather than trying to purchase these exact cars, I was able to use the criteria when creating searches in Facebook Marketplace and Gumtree and feel confident in the fact that the cars would be good value.

Conclusion

After all this work, I felt secure in my analysis and ready to go out and purchase a car! Once I found a few cars in my updated search using the above criteria, I was met with a problem that one could only face out in the real world… my mechanic, who joined me in looking at these cars, was able to identify big problems with each and every one of them! This proved to me that no matter the quality of the analysis and the effort you put into your models, the real world can always serve up something different and unexpected.

Suddenly struck by a new problem and still in a time crunch, I went rogue. I found a car at a dealership that didn’t meet any of my criteria, but provided me the assurance that the car was faultless. In the end I decided that the security of warranty and buying from a dealership was worth a higher price tag.

Although the results of my analysis proved useless in my experience, I found the application of simple Excel and Python in this used car search to be such a good use case for using something that we learn in a classroom out in the real world. I hope that this article will encourage someone else to apply some simple Excel and Python to another real world situation.
