Boston Airbnb Data Analysis

Udacity Data Science Nanodegree Program Project

Francisca Alliende
Jun 17 · 12 min read


Data Understanding

The variable of interest: price per accommodate

Fig. 1. Price and price per accommodate histograms
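The target variable can be illustrated with a short sketch. It assumes that "price per accommodate" means the listed nightly price divided by the number of guests a listing accommodates, and that the raw price arrives as a currency string such as "$1,200.00". The field names `price` and `accommodates` and the sample rows are illustrative assumptions, not values taken from the article.

```python
def parse_price(raw):
    """Convert a currency string such as '$1,200.00' to a float."""
    return float(raw.replace("$", "").replace(",", ""))


def price_per_accommodate(raw_price, accommodates):
    """Assumed definition: nightly price divided by guest capacity."""
    return parse_price(raw_price) / accommodates


# Hypothetical listings, used only to show the calculation.
listings = [
    {"price": "$250.00", "accommodates": 4},
    {"price": "$1,200.00", "accommodates": 6},
]

for row in listings:
    row["price_per_accommodate"] = price_per_accommodate(
        row["price"], row["accommodates"]
    )
    print(row)
```

Normalizing by capacity makes a studio and a six-person house comparable on one scale, which is presumably why the two histograms in Fig. 1 are shown side by side.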

Numerical Features

Fig. 2. Histograms of numerical features
Fig. 3. Null value distribution in numerical variables

Categorical Features

Fig. 4. Null value distribution in categorical features
Fig. 5. Number of observations per neighbourhood
Fig. 6. Percentage of properties that offer each amenity

Data Preparation

Which are the most expensive neighbourhoods in Boston?

Fig. 7. Price per accommodate by neighbourhood

Is it possible to cluster Boston Airbnb listings?

Fig. 8. Elbow method
Fig. 9. Price per accommodate per cluster
Fig. 10. Score per cluster in categorical variables
Fig. 11. Score per cluster of offered amenities
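The elbow method named in Fig. 8 can be sketched as follows. This is a minimal illustration that assumes k-means is the clustering algorithm and that the curve plots the within-cluster sum of squares (inertia) against the number of clusters. The helper `kmeans_inertia` and the synthetic two-dimensional points are hypothetical and stand in for the article's actual features.

```python
import random


def kmeans_inertia(points, k, n_iter=50, seed=0):
    """Run plain Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k observations

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(n_iter):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Synthetic data with three well-separated groups (illustrative only).
data_rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in centers
    for _ in range(30)
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

The "elbow" is the value of k after which the inertia stops dropping sharply. On real listing data the bend is usually less clear than on synthetic groups like these, so the choice of k remains a judgment call.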

What factors influence the price of Boston Airbnb listings?

Fig. 12. Random Forest results for price per accommodate
Fig. 13. Random Forest results for price per accommodate < 200
Fig. 14. Top 20 features by importance


