Photo by Danata on Morguefile

Encouraging the use of public transport with Big Data

Felipe Monroy
Trends in Data Science
May 21, 2020


Cities are becoming the preferred place to live in almost every country. In 2018, 55% of the global population lived in urban areas, and by 2050 at least 68% of people are predicted to live in cities (United Nations, 2019). This growth is driven by the advantages that cities offer their inhabitants, such as better economic development and services than rural areas. According to Cohen (2006), cities are centers of employment opportunity, but also the core of modern life and cultural diversity. Figure 1 shows the increase in population density up to 2014 and the projection to 2030.

Figure 1: Population density calculated using global city population estimates (United Nations, 2014)

While the growth of cities can be seen as an opportunity, it also brings challenges. One of them is the increase in movement by motorized transport, which causes traffic congestion, accidents, noise, land take, energy use, air pollution, and carbon dioxide (CO2) emissions (Poumanyvong, Kaneko, & Dhakal, 2012). These consequences get worse when people favor private transportation, which increases the number of vehicles in cities. Figure 2 compares the USA and Germany: the share of green modes (public transport, bicycle, and foot) in Germany is almost four times that of the USA (Buehler, 2011), which is reflected in lower energy consumption and fewer accidents.

Figure 2: Transportation comparison between the USA and Germany (Buehler, 2011)

Public transport is one of the solutions to the problem of increasing vehicle traffic in cities; however, boosting its use comes with challenges. Nowadays, people tend to depend more on car travel because of its advantages, such as freedom, flexibility, and time efficiency (Hagman, 2003). Therefore, one way to increase public transport usage is to plan and operate a level of service that satisfies potential customers, giving them a service that, on balance, is better than the private car (Beirão & Cabral, 2007).

The next section introduces some public transport barriers, in particular, those related to transport planning and operation. Then, some ideas for improvement using Big Data are presented.

Barriers in public transport

According to Beirão & Cabral (2007), some characteristics of public transport are perceived as barriers to its use. Moreover, these factors depend on the type of journey, which defines the level of service that the customer requires. Table 1 shows some motivations and barriers concerning public transport use. The majority of these barriers belong to two categories: planning and operation of public transport.

Table 1: Motivations and barriers to public transport use (Beirão & Cabral, 2007)

Concerns in transport surveys

Some of the barriers listed by Beirão & Cabral (2007) are strictly related to public transport planning. For example, lack of direct transport, long travel time, and having to use more than one mode of transport are problems that occur because transport planners concluded that the routes and frequencies were adequate to satisfy the majority of people. Planners base routes and frequencies on survey data, such as the household travel survey (HTS) or the individual travel survey (ITS); however, these surveys have problems of their own.

Classic travel surveys have always struggled with three questions (Bonnel & Munizaga, 2018):

  • Increasing survey participation levels without changing travel behavior
  • Representativeness in surveys
  • Immobility and survey non-response

Stopher & Greaves (2007) give more details about these barriers. For instance, they note that the number of people surveyed has decreased over the years, reaching levels below 1% of the total population. Increasing non-response bias is another problem, studied by Bradley, Greene, Spitz, Coogan, & McGuckin (2018), especially concerning young adults' non-response, which appears to be related to the method of data collection. In addition, conventional survey methods (such as telephone surveys) pose a potential threat to representativeness because of how technology is evolving (Stopher, 2009).

Barriers in the operation of public transport

According to the information in Table 1, some people prefer to use private vehicles because of barriers related to the operation of public transportation, such as unreliable buses, not knowing what to expect, a feeling of personal insecurity, and inadequate information. All these barriers arise across the three stages of a journey: pre-trip, wayside, and onboard (Grotenhuis, Wiegmans, & Rietveld, 2007).

Inadequate information is a critical problem in the pre-trip stage when people are planning the trip. Later, unreliability is present in the wayside stage, when people are waiting for the bus or train. Finally, during the trip, while reliability is still a need, personal security becomes one of the priorities. The following analysis is focused on these three barriers:

  • Providing better information to plan the trip
  • Making transportation reliable by reducing delays
  • Improving the feeling of security during the trip

The first barrier arises when people are planning their journeys. According to Beirão & Cabral (2007), potential passengers decide to use their cars because they lack information about the routes and timetables of public transport. In the second stage, reliability is what most users want, because any delay reduces the attractiveness of public transportation (Schachtebeck, 2009). In the last phase, during the trip, insecurity is the reason people may avoid public transport. According to Beecroft (2019), reducing the fear of crime may increase public transport patronage by between 3% and 10%, depending on the demand.

Solutions using Big Data

Nowadays, large volumes of data are available to work with, thanks to the explosion of available information. Yap & Munizaga (2018) point out that Big Data in public transportation has attracted an increasing number of studies, which is an opportunity to overcome public transport challenges and encourage its use.

Improve planning with passive data gathering

To solve the three problems associated with travel surveys and, therefore, obtain more accurate information for planning, the current survey method needs to change. However, while Big Data could yield highly valuable information, the scientific community agrees that traditional methods are still needed and should not be replaced (Bonnel & Munizaga, 2018).

One way to solve the problem of survey participation levels is to use new passive and active data-gathering methods that could increase the number of participants. Of the two, passive data gathering is the one that takes advantage of the potential of Big Data. One such technique uses data from smart cards, which record the trips made by cardholders. Espinoza, Munizaga, Bustos, & Trépanier (2018) used this information to measure how much passengers change their travel behavior over time, something that would not be cost-effective with traditional surveys. Another method uses mobile phone data, which has been used to estimate the Origin-Destination matrix in the Rhône-Alpes region in France (Bonnel et al., 2018) and in Santiago de Chile (Graells-Garrido, Peredo, & García, 2016). It is worth mentioning that these methods also address the representativeness problem and improve data quality (Stopher, 2009).
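To make the idea of an Origin-Destination matrix concrete, the following is a minimal Python sketch that counts trips between zones. The zone names and records are invented for illustration, and the sketch assumes that both the boarding and the alighting zone of each trip are already known; in practice smart-card systems often record only boardings, and the destination has to be inferred. It is not the method used in any of the cited studies.

```python
from collections import Counter

# Hypothetical smart-card trips: (card_id, boarding_zone, alighting_zone).
# Zone names and records are invented for illustration only.
trips = [
    ("c1", "North", "Centre"),
    ("c2", "North", "Centre"),
    ("c3", "Centre", "South"),
    ("c1", "Centre", "North"),
    ("c4", "South", "Centre"),
]


def od_matrix(trips):
    """Count trips for every (origin, destination) pair of zones."""
    counts = Counter((origin, dest) for _, origin, dest in trips)
    zones = sorted({zone for _, o, d in trips for zone in (o, d)})
    # Dense matrix as nested dicts: matrix[origin][destination] -> trip count
    return {o: {d: counts.get((o, d), 0) for d in zones} for o in zones}


matrix = od_matrix(trips)
for origin, row in matrix.items():
    print(origin, row)
```

Each row of the matrix answers the planner's question "where do trips starting in this zone end?", which is the kind of information that routes and frequencies are planned from.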

With regard to non-response and immobility, Lucas & Madre (2018) recommend combining traditional survey methods with the new passive ones, especially for under-represented groups. This recommendation is in line with Bonnel & Munizaga (2018), who argue that Big Data should not replace traditional methods because the latter also collect other information, such as socio-demographic data. Another data source that complements those mentioned above is social media data, which Ampt & Ruiz Sánchez (2018) discuss as a way to improve travel data.

Improving reliability, security, and communications with Big Data

While solving problems related to planning has a significant impact on people's behavior towards public transport, people will still choose private vehicles if the operation is not adequate. This section presents some ways to solve operational problems using Big Data.

Information

When passengers have information about public transport at hand, the travel experience improves, because accurate and instant information leads to better journey planning. This is shown in the work of Stone & Aravopoulou (2018), who analyzed the impact of open data on public transportation in London. The local government body responsible for the city's transport system made its data (such as live arrivals, timetables, and network performance) available through application programming interfaces (APIs) for commercial and non-commercial uses, improving the passenger experience.
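As a rough illustration of how an application could turn such open data into pre-trip information, the sketch below parses a small live-arrivals payload and lists the next departures at a stop. The JSON field names (`line`, `stop`, `seconds_to_arrival`) and the sample values are assumptions made up for this example; they are not the schema of any real transport API.

```python
import json

# Hypothetical payload shaped like a live-arrivals feed. The field names
# are invented for this sketch and are NOT the schema of any real API.
payload = """
[
  {"line": "25", "stop": "Main St", "seconds_to_arrival": 540},
  {"line": "8",  "stop": "Main St", "seconds_to_arrival": 120},
  {"line": "25", "stop": "Main St", "seconds_to_arrival": 95}
]
"""


def next_arrivals(raw_json, stop):
    """Return (line, whole minutes) pairs for a stop, soonest first."""
    arrivals = [a for a in json.loads(raw_json) if a["stop"] == stop]
    arrivals.sort(key=lambda a: a["seconds_to_arrival"])
    return [(a["line"], a["seconds_to_arrival"] // 60) for a in arrivals]


board = next_arrivals(payload, "Main St")
for line, minutes in board:
    print(f"Line {line}: {minutes} min")
```

A real application would fetch the payload over HTTP from the operator's published endpoint and follow its documented schema; the parsing and sorting step would look much the same.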

Reliability

Big Data can be used to tackle delays in the operation by helping to identify the factors that affect reliability. According to Cerreto, Nielsen, Nielsen, & Harrod (2018), two types of methods are used in transport operation analysis: traditional statistical methods and Big Data techniques. The former summarize the information to create a big picture, while the latter can be used to investigate recurring patterns. In their work, they used k-means clustering to identify recurrent delay patterns, which also helps to identify the causes of individual patterns and to design subsequent countermeasures.
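The clustering idea can be sketched in a few lines of plain Python. This is a minimal k-means (Lloyd's algorithm) under stated assumptions, not the implementation used by Cerreto et al. (2018): each train run is represented by a hypothetical delay profile (minutes of delay at four consecutive stations), the data are invented, and the centroids are simply initialized from the first k profiles to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) on equal-length numeric tuples."""
    centroids = list(points[:k])  # simplistic, deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep empty cluster as is
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical delay profiles: minutes of delay at 4 consecutive stations.
profiles = [
    (0, 2, 4, 6), (1, 2, 5, 7), (0, 3, 4, 6),  # delay builds up along the line
    (6, 4, 2, 0), (7, 5, 2, 1), (6, 3, 2, 0),  # initial delay is recovered
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
print(centroids)
```

The centroid of each cluster is an "average" delay profile, so inspecting the centroids shows which recurrent pattern a cluster represents (for example, delay that accumulates along the line versus delay that is recovered), which is the starting point for looking for causes.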

Security

Due to the increasing connectivity in public transport, security is one of people's main concerns, and it is also one of the variables that they consider before using the system. According to Gardner, Cui, & Coiacetto (2017), when faced with harassment, women tend to place more value on campaigns that raise awareness about sexual harassment and encourage reporting than on closed-circuit television (CCTV), which is recognized to be ineffective in reducing fear of crime. Therefore, two things are needed to increase the feeling of security on public transport: improving communication channels and demonstrating the effectiveness of CCTV. Xu, Hu, & Mei (2016) describe a model called Video Structured Description (VSD), which aims at parsing video content into text information, making it easier to find useful information and consequently improving the effectiveness of CCTV.

Implementation Challenges

While so far only the benefits of Big Data in public transport have been presented, its implementation faces two main challenges: privacy and coordination. Problems with privacy appear in the planning phase, when the transport regulator is collecting information. For example, Big Data techniques do not provide socio-demographic details in most cases, either due to privacy concerns or because these details are not available (Bonnel & Munizaga, 2018). Therefore, traditional surveys are still needed to understand mobility behavior better. In addition, there are specific problems with mobile data, such as data accuracy. Graells-Garrido et al. (2016) say that geolocation using AVP (a method to geolocate mobile devices) may suffer from non-negligible error in almost all cases.

The second challenge of Big Data implementation is coordination between transport entities. The previous section presented the case of London, where the same institution operates all the transport; therefore, it was not difficult to agree on objectives and definitions. However, in other countries, there are several transport operators in the same city, and solving institutional challenges tends to be more difficult. According to Yap & Munizaga (2018), this occurs because there is no alignment between private and public entities when coordination and cooperation are needed.

Conclusion

With growing density in cities and the resulting increase in traffic, public transportation appears to be the solution to mobility problems. A functional and efficient public transport system encourages people to switch from private to public transportation. However, improvements in public transport have a limited reach: people who are most attached to their vehicles are not going to switch easily. Therefore, improvements mainly affect people who use cars by necessity and not by choice.

Overcoming people’s barriers to public transport will enable more people to prefer it. Some methods to solve these problems using Big Data were presented. In transport planning, passive data collection methods help to understand the mobility of passengers and, combined with traditional methods, make it possible to gain more insights. For the operation, ways to improve security, reliability, and communications were shown, in particular those that use CCTV image recognition, clustering for delay patterns, and application programming interfaces.

However, despite the number of techniques and methods developed, the applications of Big Data in public transport are still limited (Yap & Munizaga, 2018) due to the difficulties related to data privacy and coordination between transport entities. Therefore, transport operators must abandon the mindset of operating in a competitive market and start working together with public transport agencies to solve data privacy issues according to international standards.

References

Ampt, L., & Ruiz Sánchez, T. (2018). Workshop synthesis: Use of social media, social networks and qualitative approaches as innovative ways to collect and enrich travel data. Transportation Research Procedia, 32. https://doi.org/10.1016/j.trpro.2018.10.016

Beecroft, M. (2019). The future security of travel by public transport: A review of evidence. Research in Transportation Business & Management, 100388. https://doi.org/10.1016/j.rtbm.2019.100388

Beirão, G., & Cabral, J. A. S. (2007). Understanding attitudes towards public transport and private car: A qualitative study. Transport Policy, 14(6), 478–489. https://doi.org/10.1016/j.tranpol.2007.04.009

Bonnel, P., Fekih, M., & Smoreda, Z. (2018). Origin-Destination estimation using mobile network probe data. Transportation Research Procedia, 32, 69–81. https://doi.org/10.1016/j.trpro.2018.10.013

Bonnel, P., & Munizaga, M. A. (2018). Transport survey methods - in the era of big data facing new and old challenges. Transportation Research Procedia, 32, 1–15. https://doi.org/10.1016/j.trpro.2018.10.001

Bradley, M., Greene, E., Spitz, G., Coogan, M., & McGuckin, N. (2018). The millennial question: Changes in travel behaviour or changes in survey behaviour? Transportation Research Procedia, 32, 291–300. https://doi.org/10.1016/j.trpro.2018.10.053

Buehler, R. (2011). Determinants of transport mode choice: A comparison of Germany and the USA. Journal of Transport Geography, 19(4), 644–657. https://doi.org/10.1016/j.jtrangeo.2010.07.005

Cerreto, F., Nielsen, B. F., Nielsen, O. A., & Harrod, S. S. (2018). Application of Data Clustering to Railway Delay Pattern Recognition. Journal of Advanced Transportation, 2018, 6164534. https://doi.org/10.1155/2018/6164534

Cohen, B. (2006). Urbanization in developing countries: Current trends, future projections, and key challenges for sustainability. Technology in Society, 28(1), 63–80. https://doi.org/10.1016/j.techsoc.2005.10.005

Espinoza, C., Munizaga, M., Bustos, B., & Trépanier, M. (2018). Assessing the public transport travel behavior consistency from smart card data. Transportation Research Procedia, 32, 44–53. https://doi.org/10.1016/j.trpro.2018.10.008

Gardner, N., Cui, J., & Coiacetto, E. (2017). Harassment on public transport and its impacts on women’s travel behaviour. Australian Planner, 54(1), 8–15. https://doi.org/10.1080/07293682.2017.1299189

Graells-Garrido, E., Peredo, O., & García, J. (2016). Sensing Urban Patterns with Antenna Mappings: The Case of Santiago, Chile. Sensors (Basel, Switzerland), 16(7), 1098. https://doi.org/10.3390/s16071098

Grotenhuis, J.-W., Wiegmans, B. W., & Rietveld, P. (2007). The desired quality of integrated multimodal travel information in public transport: Customer needs for time and effort savings. Transport Policy, 14(1), 27–38. https://doi.org/10.1016/j.tranpol.2006.07.001

Hagman, O. (2003). Mobilizing meanings of mobility: Car users’ constructions of the goods and bads of car use. Transportation Research Part D: Transport and Environment, 8(1), 1–9. https://doi.org/10.1016/S1361-9209(02)00014-7

Lucas, K., & Madre, J.-L. (2018). Workshop Synthesis: Dealing with immobility and survey non-response. Transportation Research Procedia, 32, 260–267. https://doi.org/10.1016/j.trpro.2018.10.048

Poumanyvong, P., Kaneko, S., & Dhakal, S. (2012). Impacts of urbanization on national transport and road energy use: Evidence from low, middle and high income countries. Energy Policy, 46, 268–277. https://doi.org/10.1016/j.enpol.2012.03.059

Schachtebeck, M. (2009). Delay Management in Public Transportation: Capacities, Robustness, and Integration. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.428.3919&rep=rep1&type=pdf

Stone, M., & Aravopoulou, E. (2018). Improving journeys by opening data: The case of Transport for London (TfL). The Bottom Line, 31, 00–00. https://doi.org/10.1108/BL-12-2017-0035

Stopher, P. R., & Greaves, S. P. (2007). Household travel surveys: Where are we going? Transportation Research Part A: Policy and Practice, 41(5), 367–381. https://doi.org/10.1016/j.tra.2006.09.005

Stopher, P. R. (2009). The Travel Survey Toolkit: Where to From Here? In P. Bonnel, M. Lee-Gosselin, J. Zmud, & J.-L. Madre (Eds.), Transport Survey Methods (pp. 15–46). https://doi.org/10.1108/9781848558458-002

United Nations, Department of Economic and Social Affairs. (2014). Population of Urban agglomerations with 300,000 Inhabitants or More. Retrieved from https://data.london.gov.uk/download/global-city-population-estimates/604a6a6f-2162-4d6b-bcd0-bee051703de1/global-city-population-estimates.xls

United Nations, Department of Economic and Social Affairs. (2019). World urbanization prospects: The 2018 revision. Retrieved from https://population.un.org/wup/Publications/Files/WUP2018-Report.pdf

Xu, Z., Hu, C., & Mei, L. (2016). Video structured description technology based intelligence analysis of surveillance videos for public security applications. Multimedia Tools and Applications, 75(19), 12155–12172. https://doi.org/10.1007/s11042-015-3112-5

Yap, M., & Munizaga, M. (2018). Workshop 8 report: Big data in the digital age and how it can benefit public transport users. Research in Transportation Economics, 69, 615–620. https://doi.org/10.1016/j.retrec.2018.08.008
