Sidewalk Riding Detection at Lime

Yi Su
Published in Lime Engineering
Apr 17, 2020

People around the world have turned to scooter companies like Lime for their daily transportation needs. As a result, scooter riders and pedestrians increasingly share sidewalks, and Lime has been proactive in working with consumers and community partners to improve micromobility infrastructure for safer streets and for the safety of riders and residents in the cities where we operate. One of the ways we’re working on this is by developing technology for detecting sidewalk riding behavior. In this blog post, we will provide a glimpse into how Lime is leveraging machine learning to address this problem.

Data Source Selection

Our first challenge: figure out the correct data source for our detection model. We evaluated the three most practical options: GPS modules, cameras, and vehicle sensors. Each had pros and cons, based on factors like feasibility of installation, cost, and scalability of implementation.

1. GPS modules

Lime vehicles are equipped with integrated GPS modules that continually report real-time geolocation information to our servers. Theoretically, we could check our vehicles’ real-time locations against a preprogrammed list of sidewalk locations.

  • Pro — Readily available. GPS is preinstalled on both the scooter and the rider’s smartphone and comes with no additional hardware cost.
  • Con — Inaccuracy. GPS signals from satellites are easily disrupted by large structures such as tall buildings, which are unfortunately quite widespread in the dense urban environments where Lime operates. This can increase the margin of error by a few meters, which is unacceptable for us since sidewalks are sometimes only a few meters wide.

2. Cameras

The second option we considered was cameras. We explored the possibility of installing cameras on the front of the scooter to detect the state of the vehicle using images of the environment, similar to how cameras are used for self-driving navigation.

  • Pro — Leveraging computer vision models. We could leverage advances in the field of computer vision, e.g., object localization and semantic segmentation.
  • Con — High cost. While this approach might achieve excellent results in a research lab, storing, transferring, and processing image data is too expensive to scale efficiently. Furthermore, a camera adds significant cost to the hardware and is susceptible to damage and vandalism.

3. Vehicle Sensors

Lime has sensor hardware preinstalled in the Lime scooter, which makes this option readily available.

  • Pro — Readily available. This comes with no additional hardware cost, and the sensor data is far smaller than image data, making it less expensive to store, transfer, and process.
  • Con — Lack of rich features. The sensor data does not provide as much context as an image would, but we found it was still enough to distinguish a sidewalk from a road.

After this assessment, we decided to go with vehicle sensors due to their feasibility and scalability. The GPS data was too inaccurate, and cameras with computer vision are not scalable to the millions of trips taken on the Lime platform every week.

Data Exploration and Model Intuition

Sensor Data Elements

There are two types of sensor data we used to build our models: speed and acceleration. Below are plots of the acceleration time series in all three directions as well as the scooter speed. We searched for signals in the amplitude and frequency of these time series, hoping they could serve as indicators of sidewalk riding. Speed is in miles per hour, and accelerations are in units of g, the standard gravitational acceleration.

Sensor Data

Specifically, the coordinate system of the sensor data is:

  • g-x: horizontal right
  • g-y: vertical up
  • g-z: horizontal forward
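To make this coordinate system concrete, here is a minimal sketch of how a single sensor sample could be represented. The field names, units, and sampling rate are illustrative assumptions, not Lime’s actual telemetry schema.

```python
from dataclasses import dataclass

@dataclass
class SensorSample:
    """One telemetry sample from the scooter (illustrative schema, not Lime's actual format)."""
    timestamp_ms: int  # time of the reading, in milliseconds
    speed_mph: float   # scooter speed, in miles per hour
    g_x: float         # horizontal right acceleration, in units of g
    g_y: float         # vertical up acceleration, in units of g
    g_z: float         # horizontal forward acceleration, in units of g

# A trip is simply an ordered sequence of such samples.
trip = [
    SensorSample(timestamp_ms=0,   speed_mph=8.2, g_x=0.02, g_y=1.05, g_z=-0.01),
    SensorSample(timestamp_ms=100, speed_mph=8.1, g_x=0.05, g_y=0.93, g_z=-0.03),
]
```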

Predicting Sidewalk Riding

We’ve found that the riding experience differs between a sidewalk and a road or bike lane. On a sidewalk, a rider may turn the handlebars (affecting the horizontal right/left acceleration) or brake (affecting the horizontal forward/back acceleration) to dodge pedestrians more often than when riding on the road.

Additionally, many sidewalks are constructed using small bricks or concrete slabs of relatively equal size, whereas roads are typically made of construction aggregate mixed with bitumen. Therefore, the vibration patterns (affecting the vertical acceleration) are often different when riding on sidewalks vs roads.

The riding behavior is revealed by the acceleration in the x and z directions, and the texture of the road or sidewalk is revealed by the acceleration in the y direction.

Feature Extraction and Engineering

The data is naturally a time series. However, as the speed decreases, the information about the rider’s behavior and the texture of the surface also decreases. In an extreme case, when the rider completely stops, the data cannot tell us whether the rider is on the sidewalk or not. Because of this, we resegment the data into intervals of equal distance rather than equal time.
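A rough sketch of this resegmentation step is below, assuming samples arrive at a fixed rate and each carries a speed reading. The function name, the fixed sample period, and the 25 m segment length are illustrative assumptions.

```python
import numpy as np

def segment_by_distance(speed_mph, sample_period_s, segment_len_m=25.0):
    """Split a trip's samples into consecutive segments of roughly equal distance.

    speed_mph: 1-D array of per-sample speeds, in miles per hour.
    sample_period_s: time between samples, in seconds.
    segment_len_m: target segment length, in meters (illustrative value).
    Returns a list of index arrays, one per segment.
    """
    speed_mps = np.asarray(speed_mph) * 0.44704              # mph -> m/s
    distance_m = np.cumsum(speed_mps * sample_period_s)      # cumulative distance traveled
    segment_id = (distance_m // segment_len_m).astype(int)   # which segment each sample falls in
    return [np.where(segment_id == k)[0] for k in np.unique(segment_id)]
```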

The next step in the process is extracting features from the dataset. For example, intuitively the vibration patterns should capture the difference in texture between sidewalk and road. More specifically, because the bricks or tiles used to construct a sidewalk are roughly the same size, the distance between adjacent peaks/troughs of the vibration data is more consistent on the sidewalk than on the road. The feature importance analysis confirms this intuition as well.

Peaks/Troughs of Vibration Data of a Sidewalk Segment
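As a rough illustration of the peak-spacing idea, the sketch below uses SciPy’s find_peaks to locate peaks in a segment’s vertical acceleration and measures how regular the spacing between them is. The feature names mirror those discussed in the next section, but the exact implementation is an assumption, not Lime’s production code.

```python
import numpy as np
from scipy.signal import find_peaks

def peak_spacing_features(g_y, distance_m):
    """Spacing statistics of vertical-vibration peaks within one equal-distance segment.

    g_y: vertical acceleration samples for the segment, in units of g.
    distance_m: cumulative distance of each sample within the segment, in meters.
    """
    peaks, _ = find_peaks(np.asarray(g_y))
    if len(peaks) < 2:
        return {"g-y_peak_dist_avg": np.nan, "g-y_peak_dist_std": np.nan}
    spacing = np.diff(np.asarray(distance_m)[peaks])  # spatial distance between adjacent peaks
    return {
        "g-y_peak_dist_avg": float(np.mean(spacing)),
        "g-y_peak_dist_std": float(np.std(spacing)),  # lower spread = more regular bumps, typical of sidewalks
    }
```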

Model Construction

We use a logistic regression to create a baseline, and then see whether more sophisticated models give a performance boost.

The following table shows the feature importance of the baseline logistic regression model, where g-y_peak_dist_avg is the average spatial distance between peaks of the vertical vibration, as described in the last section. Additionally, z_diff is the derivative of the horizontal forward acceleration, which captures the rider’s instant actions of pressing the brake or throttle. This analysis confirms our intuition.

Feature Importance
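A minimal sketch of the baseline, and of reading feature importance from it, is below. It assumes one engineered feature vector per equal-distance segment and a binary sidewalk/road label; the feature list, the placeholder training data, and the standardization step (which makes coefficient magnitudes comparable) are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One row per equal-distance segment; the feature list is illustrative.
feature_names = ["g-y_peak_dist_avg", "g-y_peak_dist_std", "z_diff_std", "g-x_std", "speed_avg"]

# Placeholder data standing in for real engineered features and labels (1 = sidewalk, 0 = road).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, len(feature_names)))
y_train = (X_train[:, 0] < 0).astype(int)

# Standardize so coefficient magnitudes are comparable, then fit the baseline classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Rank features by absolute coefficient, a simple stand-in for a feature importance table.
coefs = model.named_steps["logisticregression"].coef_[0]
for name, coef in sorted(zip(feature_names, coefs), key=lambda pair: -abs(pair[1])):
    print(f"{name:>20s}  {coef:+.3f}")
```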

Under the impression that a more complex algorithm would provide a performance boost, we tried convolutional neural network (CNN) and recurrent neural network (RNN) models on these features. Contrary to our expectation, these more complex models did not produce a significant improvement over the logistic regression. One hypothesis is that deep learning algorithms perform best with large amounts of data, and our current dataset is still small. Furthermore, one drawback of these models is their lack of interpretability compared with the simpler logistic regression.

From a practical point of view, this model will ideally be deployed on the scooter CCU eventually, and a more complex model requires more computational power from an already resource-constrained CCU. With all of this in mind, we chose the logistic regression model for production.

System Architecture

So, how does this work? A rider starts a trip using their device, and the scooter sensor starts recording trip data. At the end of the trip, the sensor data is sent back to the backend database. Offline, we use this raw data to extract the features described in the last section and train a sidewalk detection model; once ready, the model is deployed to the ML server. Online, the post-trip sensor data is queried from the database and sent to the ML server for prediction. The result is then sent to the user’s device to power in-app features, and it is also recorded in the database for further analysis to improve the features and models.

Note that the model is deployed on a separate backend server. In a future version, the model may be deployed to the scooter CCU to reduce latency.

Sidewalk Detection ML System
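To make the online path concrete, here is a hypothetical sketch of the ML server’s post-trip prediction endpoint using Flask. The route, payload fields, and model file are assumptions for illustration, not Lime’s actual service.

```python
# Hypothetical sketch of the ML server's post-trip prediction endpoint (not Lime's actual service).
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# The model is trained offline, as described above, and loaded when the server starts.
with open("sidewalk_model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/v1/sidewalk_prediction", methods=["POST"])
def predict_trip():
    payload = request.get_json()
    # Assumes the caller has already resegmented the trip and computed one feature vector per segment.
    segment_features = payload["segment_features"]
    segment_preds = model.predict(segment_features).tolist()  # 1 = sidewalk, 0 = road
    sidewalk_fraction = sum(segment_preds) / max(len(segment_preds), 1)
    # The result goes back to the caller and is also persisted for later analysis (persistence omitted here).
    return jsonify({
        "trip_id": payload["trip_id"],
        "segment_predictions": segment_preds,
        "sidewalk_fraction": sidewalk_fraction,
    })
```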

Pilot Launch

Lime deployed the sidewalk detection technology to a few hundred scooters in San Jose in late January 2020. Right after a rider ends a trip, the trip’s sensor data is analyzed. If the ML model finds that the rider stayed on the sidewalk for more than 50% of the trip, they receive a push notification that kindly reminds them not to ride on the sidewalk. Through A/B testing, we showed that this was able to deter sidewalk riding. The insights this technology generates can also help city governments better understand where and how often sidewalk riding happens, informing decisions about local infrastructure needs.

*Note: The ML model prediction on the upper left corner is joined with the video data for illustrative purposes. The video data is not used in the actual prediction.
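The post-trip decision itself reduces to a simple threshold on the predicted sidewalk fraction; this small sketch reflects the 50% rule described above, with the function and constant names being illustrative.

```python
SIDEWALK_TRIP_THRESHOLD = 0.5  # fraction of the trip on the sidewalk that triggers the reminder

def should_send_reminder(segment_predictions):
    """Decide whether to send the post-trip sidewalk-riding reminder.

    segment_predictions: per-segment model outputs, 1 = sidewalk, 0 = road.
    """
    if not segment_predictions:
        return False
    sidewalk_fraction = sum(segment_predictions) / len(segment_predictions)
    return sidewalk_fraction > SIDEWALK_TRIP_THRESHOLD
```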

Moving Forward

There are three directions for the future of the project:

  1. Model Advancement: Currently, we only have a few hours of sensor data. More data can be collected and used to unlock the full potential of the deep learning models.
  2. Lower Latency Prediction: The model can be deployed to the scooter CCU to achieve closer to real time prediction. This will enhance the user experience, but also requires higher model performance.
  3. Scale Across More Scooter Models: Currently the ML model is only deployed on certain scooter models. It would be natural to extend this technology to more scooter models.

Acknowledgements

This project could not have been completed without the dedication and creativity of our fellow Lime engineers. Special thanks to Kunal Bansal, Jianfeng Hu, John Pena, Jinsong Tan, Jack Zhang, Wubai Zhou and Yuxin Zhai for their contributions.
