Leveraging Dwell Time Models to Create Dynamic Vehicle Services (Overview + Car Maintenance Service)

Jocelyn Wang
Published in 99P Labs
Mar 23, 2022 · 19 min read

Written by the MSBA Capstone Team, in partnership with 99P Labs: Tee Lakkhananukun, Prak Pola, Rebecca Stevens, and Jocelyn Wang

The MSBA Capstone Team is a part of the Masters of Science in Business Analytics program at Carnegie Mellon University.

Project Overview

Our team had the opportunity to partner with 99P Labs, a research group focused on developing innovative features, concepts, services, and designs to stay on the cutting edge of the mobility industry and potentially transform the landscape of transportation. In a recent related project, a team of Berkeley students built a model to predict the dwell time and location of vehicles based on car sensor and infrastructure data from a leading automobile manufacturer. The main question we set out to address is: if we can predict the dwell time and location of our vehicles, what business opportunities can we leverage with this information?

Project Framework

We used the CRISP-DM (Cross Industry Standard Process for Data Mining) framework as a guide as we worked through the project. The framework consists of six phases that we will discuss in the following sections (see figure below): Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.

CRISP-DM Framework

Business Understanding

With the Telematics dataset, we want to explore the use of predictive models to add business value. As a starting point, we brainstormed business ideas that could leverage location and dwell time predictive models, which will be covered in more detail in the next section. One idea in particular is to provide vehicle maintenance services, which we believe aligns well with 99P Labs' existing capability to offer automotive services through car dealerships. This would offer a convenient way for customers to receive routine vehicle maintenance both at home and in densely populated areas (e.g. work, grocery stores, shopping centers). Not only can car sensor data provide real-time feedback to the company, but dwell time and location derived from features in the dataset can also give the service provider an estimated timeframe and an optimal location.

While we had the previously-built predictive models at our disposal, these dwell time and location models were geofenced to the Ohio region and were built without a focus on deriving feature importance and model interpretability. Without these kinds of model insights, it is difficult to provide clear and actionable recommendations that are valuable to the business. Therefore, another objective of ours is to identify important features that contribute to location and dwell time results. Lastly, we want to develop a framework that 99P Labs can use to validate models and determine production readiness. We hope to build out this framework more fully in the second half of our project, which will be covered in a subsequent blog post.

Background Research and Understanding the Competition
Before we develop a strategy, we need to first understand the services already available on the market by direct competitors. An OEM currently offers a wide array of at-home maintenance services and uses remote diagnostics to pre-diagnose repairs. The services and vehicle overview are all available to their customers through an integrated app, which allows the OEM to send service reminders, product recommendations, and other useful information. While the car is still under warranty, there is no charge for routine repairs. Outside the warranty, the OEM charges a labor rate of $150/hr on average.

There are other convenient services on the market. Through a website, customers can schedule general maintenance for nearly every part of their car, across a wide range of brands and models. When a customer selects a service, the site suggests additional services that might also be required, along with recommended service intervals. It then generates an estimated service time and lets customers schedule the repair, at which point it pairs them with mechanics based on proximity. This service is very versatile but requires a lot of user input. We can improve on this model by automating services, offering subscriptions, and fine-tuning product recommendations based on consumer and car history. Prediction models can reduce the number of required interactions from customers, support push strategies, and help service providers become more proactive in reaching out to customers.

Top ten average auto mechanic salaries, by state.

Driven away from dealer-owned repair shops due to high prices, consumers often turn to local repair shops for routine repairs. Most independent auto repair shops charge a labor rate of $80–90 per hour as compared to the dealership average of $85–125 per hour.

The figure above shows the top ten average auto mechanic salaries by state. We will need to make up for this price difference by refining our product recommendations, add-ons, and capitalizing on the convenience of the services we offer.

Proposed Services and Benefits to the Manufacturer and its Customers
The package we envision as a solution will involve a slew of vehicle maintenance services. The idea is to add on to the existing suite of services offered through the Car Connectivity App, with features such as vehicle notifications and dashboards. We can leverage the predicted dwell times generated by our models to offer personalized services when our customers are home. We plan on basing the logistics of the service for vans and auto mechanics on an OEM’s at-home maintenance model. There are several reasons why this offer will appeal to our customers, as well as why it will benefit 99P Labs itself:

  • Consumers need not worry about waiting at repair shops or arranging a ride home when repairs are extensive. Using the predicted dwell time and location, the manufacturer can offer services that fit within the estimated timeframe at a location convenient for maintenance. We hypothesize that customers will appreciate zero wait times for car services.
  • 99P Labs can leverage models for a push marketing strategy to attract customers and reach a larger customer base.
  • Using car sensor data and service history, the car manufacturer can plan pre-diagnosed repairs and order parts in advance; this will allow consumers to avoid lengthy lead times on required parts.
  • This service is great for increasing brand-loyalty and discourages customers from having repairs done at third-party garages. This in turn will allow the manufacturer to promote additional in-store products for purchase when the customer schedules a servicing appointment.
  • By maintaining a more detailed service history on the vehicle, 99P Labs can offer more advanced services that would require customers to bring their car to a dealership — this can include customer drop-offs and scheduling reminders.
  • The team recommends that 99P Labs invest in developing an app that integrates available services and general information for consumer reference, or add onto the existing Car Connectivity App. This platform can also be used for seasonal promotions on tires and add-on equipment.

By offering a customer-focused service experience, 99P Labs can open up numerous opportunities for additional product placement and advertising. This leads into the pricing aspect of the business plan.

Pricing Plan and Business Opportunities
There are several hypotheses when it comes to charging for these maintenance services. The most basic plan involves charging customers on a case-by-case basis. We can recommend a service through the Car Connectivity App, offer price overviews, and allow customers to schedule appointments. We can then offer discounts based on customer driving history and a simple points and rewards system based on previous purchases. It's possible to partner with insurance firms as well, exchanging data on driving patterns observed by the car sensors so they can adjust rates accordingly. Safe driving can be rewarded with discounts on routine servicing.

A second pricing hypothesis involves subscription services. These could cover seasonal maintenance, routine checkups, and oil and fuel replacement. Customers can receive discounted rates by subscribing to yearly packages with different maintenance scopes as opposed to scheduling one-time visits. We can use the app to send reminders for upcoming appointments and status updates. By using our dwell time data, we can plan these services well in advance, allowing 99P Labs to optimize the use of their manpower and technician availability. It will also allow service centers to maintain more accurate margins on their part orders, since they will have a detailed record of previous and upcoming maintenance activities.

Finally, we can promote additional in-house products through the app, around the time of the scheduled visits as add-ons. This discourages customers from finding cheaper alternatives at third parties and gives the customer a more user-friendly experience by providing a one-stop shop for everything they need.

Data Understanding

Telematics Dataset via the 99P Labs Portal
Several datasets are available via the 99P Labs data portal. Our focus in this project is the Telematics dataset, which consists of over 25 million rows and 287 features. Car sensor data is collected in real time: when an event happens, the corresponding sensors output data and a row is inserted into the dataset. Not every sensor generates a signal every time, so we are left with a fairly sparse dataset. The challenge for us was to find clever ways to aggregate each feature variable to create insightful predictive models. Due to the long download time, we retrieved a subset of 25 million rows from the entire dataset. Within this dataset, we found that:

  • There are 5129 unique vehicles.
  • Many columns consist almost entirely of null values, or contain highly imbalanced classes for categorical data.
  • Consecutive sequences are missing for each car, which may affect location prediction accuracy.
  • Many data tables (groups of sensor data) will not have a high impact on the predictive models (e.g. Media, Satellite). We decided to exclude these tables.
  • Some features, such as average temperature, may be powerful in determining dwell time but have too many missing values. An alternative approach is to bring in historical temperature data.
  • There are other interesting features that we can explore as a target variable, depending on the business use case. For example, an aggregated feature from the Diagnostic table can be used for car maintenance services.

The data represents 2 weeks of car sensor events. We know that any models built on this limited time frame will not generalize well to new data: driving behavior may change, points of interest may shift, and seasonality is not captured. The sheer size of the data also proved to be a challenge for our team, as we had limited experience dealing with big data.

Survey Data
We lacked some confidence in the available datasets to fully support our business use case (car maintenance service), so we conducted a survey to validate our ideas and understand customers’ patterns. The collected survey results can also be used for marketing purposes. The details of this survey are covered in a later section.

Data Preparation

Telematics Dataset
Due to the large size of the data and our limited memory (RAM), it is almost impossible to load the Telematics dataset using the widely used Python library Pandas. As an alternative, we explored Dask, a Pandas-based library designed for larger-than-memory data. In the end, we leveraged PySpark, a Python interface to the well-known Apache Spark, made convenient by the computing resources provided by 99P Labs. PySpark is suitable for big datasets due to its ability to distribute data processing across a cluster of machines. Furthermore, its lazy evaluation allows us to process the data in partitions without loading the entire dataset into memory.

In the first half of the project, we focused on building a model for dwell time prediction and on identifying important features. We grouped the data by vehicle id and Sequence to represent individual trips, which cuts down the number of rows in the dataset considerably. On average there are 43 trips per car; with 5129 cars, we can expect around 220K remaining rows. We extracted time-based features, location features, and other aggregated features that we assumed would have high predictive power. A table of key extracted features is provided below for reference.
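
To make the trip-level aggregation concrete, here is a minimal sketch on a toy frame. We ran this step in PySpark on the full dataset; pandas is used here only because the toy example fits in memory, and the same groupby/agg logic carries over to PySpark's DataFrame API. All column names are illustrative stand-ins, not the real Telematics schema.

```python
import pandas as pd

# Toy telematics events; column names (vehicle_id, sequence, ts, speed)
# are illustrative -- the real Telematics schema differs.
events = pd.DataFrame({
    "vehicle_id": [1, 1, 1, 2, 2],
    "sequence":   [10, 10, 11, 20, 20],
    "ts": pd.to_datetime([
        "2022-03-01 08:00", "2022-03-01 08:10", "2022-03-01 09:00",
        "2022-03-01 07:00", "2022-03-01 07:30",
    ]),
    "speed": [30, 45, 50, 20, 25],
})

# Collapse sensor events into one row per trip (vehicle_id, sequence),
# deriving a drive duration and an aggregate speed feature per trip.
trips = (
    events.groupby(["vehicle_id", "sequence"])
          .agg(start=("ts", "min"), end=("ts", "max"),
               avg_speed=("speed", "mean"))
          .reset_index()
)
trips["drive_minutes"] = (trips["end"] - trips["start"]).dt.total_seconds() / 60
```

Each (vehicle_id, sequence) pair collapses to a single trip row, which is what shrinks 25M sensor events toward the ~220K trip rows described above.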

Cleaned Model Features

Location Clustering
Directly feeding raw latitude and longitude into predictive models may not provide meaningful input. We followed the Berkeley team's approach of clustering locations using unsupervised learning, but with a significant number of additional features.

The most common algorithm to start with is K-means, but selecting the number of clusters (K) is difficult without domain knowledge to guide an educated guess. K-means is also a distance-based clustering method, and as the number of features increases, distance measures converge toward a constant value between any pair of examples, making the resulting clusters less meaningful.

We determined that a better approach would be DBSCAN, which clusters data points based on density and is robust to outliers, labeling them as noise. Another advantage of this method is that we don't have to choose the number of clusters up front. However, two hyperparameters must be tuned:

  • epsilon: the maximum distance between two samples for one to be considered in the neighborhood of the other
  • number of minimum samples: the number of samples (or total weight) in a neighborhood for a point to be considered as a core point

Careful selection of these hyperparameters is therefore important for the resulting clusters. We included dwell time, drive duration, weekend, description of time, longitude, and latitude as inputs to the clustering algorithm, and DBSCAN produced 612 clusters. One downside to this approach is that the clusters are difficult to interpret. We ultimately appended the cluster labels learned in this unsupervised exercise to the main dataset as an additional feature.
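
The clustering step can be sketched with scikit-learn's DBSCAN implementation. The data below is synthetic (two tight "dwell locations" plus one outlier), and the eps/min_samples values are illustrative, not the ones we tuned:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight synthetic "dwell locations" plus one far-away outlier. In our
# pipeline the inputs were dwell time, drive duration, weekend, time-of-day,
# longitude, and latitude; two coordinates suffice to illustrate the idea.
cluster_a = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                      [0.1, 0.1], [0.05, 0.05]])
cluster_b = cluster_a + 5.0
outlier = np.array([[10.0, 10.0]])
X = np.vstack([cluster_a, cluster_b, outlier])

# eps (neighborhood radius) and min_samples (core-point threshold) are the
# two hyperparameters discussed above. On real data, features should be
# scaled to comparable ranges before clustering.
labels = DBSCAN(eps=0.2, min_samples=4).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 marks noise
```

Here the two dense groups become clusters and the isolated point is labeled -1 (noise), which is how DBSCAN sidesteps both the choice of K and sensitivity to outliers.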

Survey Validation

We decided to conduct consumer research through a survey, asking basic questions about vehicles and service/repair patterns. We wanted to use this survey to validate some of the trends we were seeing in the Telematics dataset, as well as some of the assumptions we were making for the business plan. We distributed the survey among our own networks, both professional and personal. A total of 190 people took the survey, though only 162 completed it in its entirety. Of the 162 respondents, 154 said they had at least one vehicle in their household; this pool of 154 respondents became our sample for analysis. We discuss some notable Tableau analysis results below.

Respondent Demographics
This sample pool was fairly evenly distributed throughout the different demographics, though it was more heavily weighted in the age category of 25–34 years old (58% of respondents). This is likely due to the fact that our team members are part of this age bracket and, therefore, the majority of our networks are as well. Though we would have liked to see a smoother distribution between all ages, we ultimately were satisfied with the results as we had such a successful response rate overall, and this age bracket is also our biggest focus for our target market. In addition, our responses were mostly distributed between those who reside in suburban or urban areas (94%); very few respondents said that they reside in rural areas. This was favorable for us, as we want to target those who reside in more heavily populated areas.

Vehicle and Maintenance Behavior
As we assumed, the majority of repairs people have done on their vehicles are considered routine maintenance (i.e. oil change, tire rotations, vehicle inspections, etc.). For this routine maintenance, respondents are fairly split between whether they bring their vehicle to an independent garage or their franchise dealership. This is definitely something that we would need to consider in our business plan development and could be considered an obstacle, as we would need to consider ways to entice people to choose their franchise dealership for maintenance instead — in this case, our partnering car manufacturer.

Driving Patterns
We asked respondents about their driving patterns: the most common places they travel to, how far they travel to get there, and how long they dwell at these locations. These questions served as a great way to validate some of the trends we saw in the Telematics dataset. For example, nearly 70% of people primarily make short-distance trips (a 0–19 mile radius). Some of the most commonly visited locations include the grocery store, work, restaurants, and other stores/shopping centers, each chosen by about 50% or more of respondents. In addition, we typically saw that when people travel longer distances, they are much more likely to dwell for a longer period of time. This suggests that distance traveled and dwell time are positively correlated.

Pricing Scenarios
The last important topic of questions we asked in the survey was centered around a couple of different pricing scenarios. These questions were posed in order to get a feel for what kinds of services respondents were interested in, and what type of pay schedules they were willing to subscribe to.

The first question posed the hypothetical scenario of getting maintenance done at a vehicle owner’s local garage, consisting of 3 hours of labor and costing $200. We wanted to know if they would be willing to pay a surcharge in order to have the technician come to their home or other convenient location such as work, to complete the maintenance there. If they were interested in this service, we asked further how much of a surcharge would they be willing to pay. Out of the 154 respondents, a total of 59% said they would use this service and 44% said they would pay a $50 surcharge.

The second scenario asked if they would be willing to pay a yearly membership fee of $200 covering regular routine maintenance such as oil changes, tire rotations, and vehicle inspections. Out of the 154 respondents, 76% said they would use this service and 50% said they would pay $200. Interestingly, when we filtered the results to those aged 18–44, excluding those outside our target market, the numbers improved (pictured in the image below). In the first scenario (left), the pay-per-service surcharge, interest in the service increased to 66%. In the second scenario (right), the once-a-year membership fee for routine maintenance, interest increased to 81%. At the end of the day, we learned that people are much more likely to subscribe to a once-a-year fee than to a pay-per-service surcharge.

Scenario #1: pay-per-service surcharge (left); Scenario #2: once a year membership fee for routine maintenance
Survey results for pricing scenario questions

Modeling Methodology

One of our main tasks was to identify which features in the Telematics dataset are most significant in determining a vehicle's dwell time. As a starting point, we followed the dwell time model developed by the Berkeley team; however, rather than using only location data and a daylight variable, we included as many additional features from the dataset as remained after data processing. Among these 45 selected features were the location clusters calculated above, time and duration variables, diagnostic warning variables, and several others. We performed multi-class classification, dividing the target variable (dwell time) into the three buckets defined in the existing dwell time model: 0–3 hours, 3–6 hours, and 6+ hours. Although the intuition behind defining the target classes this way (by the previous team) was unclear to us, we moved forward with it to serve as an initial benchmark against which to compare our own models. We evaluated both linear classification and tree-based methods, in particular Support Vector Machines, K-Nearest Neighbors, Random Forest, and eXtreme Gradient Boosting.
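
The three-bucket target can be constructed with pandas.cut. The source doesn't show how the buckets were encoded, so this is one straightforward way, assuming a dwell-time column in hours; the values below are illustrative:

```python
import pandas as pd

# Illustrative dwell times (hours) for a handful of trips.
dwell_hours = pd.Series([0.5, 2.0, 4.5, 7.0, 12.0, 1.2])

# The three classes used by the previous team's model: 0-3, 3-6, and 6+ hours.
dwell_class = pd.cut(
    dwell_hours,
    bins=[0, 3, 6, float("inf")],
    labels=["0-3h", "3-6h", "6+h"],
    right=False,  # left-inclusive bins: [0, 3), [3, 6), [6, inf)
)
```

The resulting categorical column becomes the multi-class target for the classifiers evaluated below.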

Linear Models vs. Tree-Based Classifiers
In reference to the infographic below, we focused our attention in the classification space and tested a variety of classifiers on the dataset. Linear models tend to be preferable when working with datasets containing fewer observations, whereas tree-based methods are highly scalable and work well with extremely large datasets. While the original dataset consisted of 25M rows, our resulting dataset after cleaning and processing included only 85K rows. As a result, we decided to evaluate both linear and tree-based methods in our analysis.

Scikit-learn Machine Learning Algorithm Flowchart (Source: https://scikit-learn.org/stable/tutorial/machine_learning_map/)

Support Vector Classifiers (SVC) and K-Nearest Neighbors (KNN) are often grouped with the linear methods: a basic SVC learns a boundary line or hyperplane to separate classes. Our problem involves three distinct classes whose relationships are unlikely to be linearly separable, so our methods need to capture non-linear structure. KNN does this natively, while SVC achieves it through a kernel function that implicitly maps the data into a higher dimension, making both suitable for our purposes. Linear classifiers are also often selected for their interpretability, thanks to the availability of model coefficients.

Random Forests (RF) and eXtreme Gradient Boosters (XGB) are examples of tree-based models. These models are highly robust and can capture nonlinear relationships well. In addition to this, they are extremely effective in learning complex relationships that exist in high-dimensional data. The cost of this is that they are harder to interpret and can easily overfit to the data.

Model Evaluation

K-Nearest Neighbor Classification
We trained a K-Nearest Neighbors classifier, which classifies each test point based on a distance metric to its k nearest training points. To determine the k-value with maximal accuracy, we first evaluated a wide range of values in intervals of 5 to narrow the search, then evaluated the model at a more granular level to find the best k-value. The best model used k=47, with an accuracy of 64.0%.
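
The coarse-then-fine search can be sketched as follows. The data here is synthetic (a stand-in for our processed trip features), so it will not reproduce the k=47 result from the real dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the processed trip features and 3-class dwell target.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

def mean_cv_acc(k):
    """Mean 5-fold cross-validated accuracy for a given k."""
    return cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()

# Coarse pass: step k in intervals of 5 to narrow the search region.
coarse = {k: mean_cv_acc(k) for k in range(5, 61, 5)}
k_coarse = max(coarse, key=coarse.get)

# Fine pass: evaluate every k in the neighborhood of the coarse winner.
fine = {k: mean_cv_acc(k) for k in range(max(1, k_coarse - 4), k_coarse + 5)}
best_k = max(fine, key=fine.get)
```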

Support Vector Classification
While SVCs are often effective in high-dimensional spaces, their hyperparameters are difficult to tune. To address this, we performed a grid search over the hyperparameter space of the Support Vector Classification model. The optimal model used a Radial Basis Function (RBF) kernel, giving a final accuracy score of 62.4%.
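
A minimal grid-search sketch with scikit-learn's GridSearchCV, again on synthetic stand-in data; the grid values here are illustrative, not the ones we searched:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the processed features and 3-class dwell target.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

# Grid over kernel, regularization strength C, and RBF width gamma
# (gamma is ignored for the linear kernel).
param_grid = {
    "kernel": ["linear", "rbf"],
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.1],
}
search = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)
best_kernel = search.best_params_["kernel"]
```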

Random Forest Classification
We first ran a naive random forest model using the default hyperparameters provided by the Scikit-Learn package. This achieved an initial accuracy of 64.1%. After tuning the model for optimal hyperparameters using RandomizedSearchCV, the model accuracy increased to 77%.
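
The baseline-then-tune workflow can be sketched with RandomizedSearchCV; the hyperparameter ranges below are illustrative, not the exact ones we searched, and the data is a synthetic stand-in:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the processed features and 3-class dwell target.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

# Naive baseline with scikit-learn's default hyperparameters.
baseline = RandomForestClassifier(random_state=0).fit(X, y)

# Randomized search samples n_iter combinations from the space instead of
# exhaustively trying every one, which keeps tuning cheap.
param_dist = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5],
    "max_features": ["sqrt", "log2"],
}
search = RandomizedSearchCV(RandomForestClassifier(random_state=0), param_dist,
                            n_iter=8, cv=3, random_state=0).fit(X, y)
best_rf = search.best_estimator_
```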

XGBoost Classification (Best Model)
Our best results were achieved using eXtreme Gradient Boosting. XGBoost is an ensemble of decision trees built iteratively: each new tree is fit to correct the errors of the ensemble so far, using gradient descent to minimize the loss. For classification tasks, XGBoost often achieves high accuracy and low bias with low computation time. We were able to validate this with a test accuracy of 79.8%, a 2.2% improvement over the previous 3-class dwell time classification model.
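
XGBoost's XGBClassifier exposes a scikit-learn-compatible fit/predict API. As a dependency-light sketch of the same boosting idea, here is scikit-learn's GradientBoostingClassifier (a stand-in, not our actual model) on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the processed features and 3-class dwell target.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Each new tree is fit to the gradient of the loss, correcting the errors of
# the ensemble built so far; hyperparameter values here are illustrative.
clf = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0).fit(X_tr, y_tr)
test_acc = accuracy_score(y_te, clf.predict(X_te))
```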

Comparison of Model Results

Feature Importance via SHAP
While our tree-based models outperformed the linear models significantly, these methods suffer from being black boxes, making them difficult to explain and interpret. In order to circumvent this drawback, we leveraged SHapley Additive exPlanations, otherwise known as SHAP, to determine feature importance. The concept of SHAP is based upon game theory and machine learning, using the outcome and features of the model as its ‘game’ and ‘players’ to “quantify the contribution that each feature brings to the prediction made by the model” (Source). This methodology focuses on local interpretability, meaning that it observes the model outcomes at a per-observation level. Aggregating these values allows us to evaluate feature significance at various levels of granularity.
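
Conceptually, a feature's Shapley value is its average marginal contribution to the prediction across all subsets of the other features. The shap library computes this efficiently for tree ensembles; as a concept sketch, here is a brute-force exact computation using a mean-baseline value function (one common choice), verified on a small linear model where the answer is known:

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley values for the prediction f(x), with 'absent' features
    replaced by the background mean (an interventional value function).
    Brute force over all feature subsets -- fine for a few features."""
    base = background.mean(axis=0)
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in combinations(others, r):
                # Shapley weight for a coalition of size |S|.
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                z = base.copy()
                z[list(S)] = x[list(S)]
                without_i = f(z)      # coalition S only
                z[i] = x[i]
                phi[i] += w * (f(z) - without_i)  # marginal contribution of i
    return phi

# Sanity check on a known linear model: each feature's attribution should be
# coefficient * (x_i - background mean_i).
f = lambda v: 2 * v[0] + 3 * v[1] - v[2]
background = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])  # mean = [1, 1, 1]
phi = shapley_values(f, np.array([3.0, 1.0, 2.0]), background)
```

The attributions also satisfy the efficiency property: they sum to the difference between the model's prediction for x and its prediction at the baseline.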

At the full-dataset level, the Shapley values suggest that there are a number of mutually significant features between the XGBoost and Random Forest models. In particular, location variables such as the DBScan-predicted clusters and latitude/longitude have the greatest impact on the dwell time.

Feature Importance Rankings (variables that are mutually important are highlighted in blue)

Drilling down into one target class allows us to view the feature importance in more detail. The visualization below can be read as follows:

  • Order: The variables are listed in order of significance.
  • Spread: The points on the right side of the y-axis are the observations that represent a higher prediction. In this case, higher SHAP values represent classification within the 0–3 hours bucket.
  • Color: Red dots indicate a higher original value for the variable, whereas blue dots indicate a lower original value.
  • Correlation: When the spread for a variable is red on the right side of the y-axis, the variable is positively correlated with the target variable; similarly, if a variable is blue on the right side of the y-axis and red on the left, the variable is negatively correlated with the target variable.
SHAP Variable Importance Plot (Dwell Time: 0–3 hours)

Initial takeaways when observing feature importance for the target class of 0–3 hours of dwell time:

  • The location cluster is the most significant variable.
  • Drive duration is another important variable, validating our proposed hypothesis that the longer the drive duration of a trip, the longer the dwell time at the final destination will be.
  • The higher the tire pressure of the vehicle, the more likely dwell time will be between 0–3 hours.
  • If the time of day is early (5–8AM), the dwell time is less likely to be between 0–3 hours.

Drawbacks of Chosen Dwell Time Buckets
Our best-performing XGB model provided great results in terms of overall test accuracy. However, digging deeper quickly revealed that the model is far from ideal. The distribution of our target variable shows that a large majority of observations had short dwell times (0–3 hours), whereas vehicles that dwelled for 3–6 hours comprised a meager fraction of the dataset. As expected with class imbalance, the confusion matrix for this XGB model shows that the high accuracy is mainly attributable to the model correctly predicting the dominant 0–3 hour class.
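
The imbalance issue can be made concrete with a small confusion-matrix sketch. The labels below are invented for illustration (not our model's output): overall accuracy looks strong, while per-class recall exposes the failure on the minority classes.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented predictions on an imbalanced 3-class dwell target, where class 0
# (0-3 hours) dominates -- purely illustrative.
y_true = np.array([0] * 80 + [1] * 8 + [2] * 12)
y_pred = np.array([0] * 80 + [0] * 6 + [1] * 2 + [0] * 4 + [2] * 8)

cm = confusion_matrix(y_true, y_pred)
overall_acc = cm.diagonal().sum() / cm.sum()       # looks strong
per_class_recall = cm.diagonal() / cm.sum(axis=1)  # exposes the weak classes
```

Here overall accuracy is 90%, yet only 2 of 8 trips in the 3–6 hour class are caught, which mirrors the behavior we observed in our own confusion matrix.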

Conclusion & Next Steps

In the second half of our project, we intend to expand upon the work we’ve done so far, as well as explore an additional use case for the Telematics dataset. One of our main goals will be to conduct an extensive failure mode analysis on our business models to understand the impact of incorrectly predicting dwell times and locations. This will allow us to implement a robust mitigation plan to minimize the risk to both our customers and the consumers using these services. We also aim to expand our feature engineering to mimic customers’ inputs on dwell times and explore different time buckets for dwell time prediction. Finally, we plan on developing a validation framework in order to successfully implement both our models and our business plans in the real world, and explore the challenges of doing so.

Acknowledgements

We want to thank our advisor, Neda Mirzaeian, for facilitating a smooth and enjoyable project experience, as well as our project sponsors, Rajeev Chhajer and Tony Fontana of 99P Labs, for providing support and direction along the way.
