Optimized Student Boarding and Routing Areas for School Buses

Published in

AI4SM

24 min readOct 25, 2023

By Yuming Li, Haocheng Yang and Huilin Xu as part of course project of ECE1724H: Bio-inspired Algorithms for Smart Mobility. Dr. Alaa Khamis, University of Toronto, 2023.

The School Bus Routing Problem (SBRP) optimizes student transportation from residence to school. This study breaks down the SBRP into data preparation, bus stop selection and route generation. We employe metaheuristic algorithms like Genetic Algorithm (GA) and Ant Colony Optimization (ACO) in the article to improve transportation efficiency, student safety, and urban planning for SBRP.

Introduction

Scholars have studied the complex districting problem/school bus routing problem (SBRP) for nearly 50 years. It has a tangible impact on numerous families and school administrators globally, rather than being merely theoretical. This article provides a comprehensive analysis of the SBRP and implements our algorithms. The algorithms we choose is a combination of what we learned in the class and the research related work: the genetic algorithm, the ant colony optimization algorithm and simulated annealing. The paper encompasses Problem Characterization, Formulation and Modelling, Related Work, Dataset Preparation, and ESDA.

Problem Characterization

The School Bus Routing Problem (SBRP) is a variant of the vehicle routing problem that involves designing and optimizing school bus routes to safely and efficiently transport students between their residencies and schools. Each bus must meet various restrictions when picking up students from their bus stops and delivering them to the school division.

In the context of this article, our exploration of the SBRP centers on an urban environment, involving multiple schools and buses. The mixed loads are not allowed, which indicates a single load problem where each school serves as the departure point. The fleet consists of homogeneous buses, each with identical capacity. Each student is strictly allocated to one bus route, with buses visiting each stop only once.

Typically, the SBRP is segmented into five distinct sub-problems: Data Preparation, Bus Stop Selection, Bus Route Generation, School Bell Time Adjustment and Route Scheduling [1]. For the model research of this study, we focus on two pivotal sub-problems: Bus Stop Selection and Bus Route Generation. Data Preparation will be included in dataset section.

Bus Stop Selection is a foundational phase to the SBRP setting the stage for the subsequent routing process. In this article, we select the location-allocation-routing (LAR) approach, that is, first determines a set of bus stops for a school and assigns students to these stops. Urban density necessitates using pre-existing bus stops instead of student homes in routing. The primary objective of this phase is ensuring that students’ walking distances (SWD) to these stops are minimized.

Bus Route Generation designing the most efficient routes for buses to follow once stops are set and students assigned. Since the SBRP we discuss in this article involve multiple schools and buses, we adopt “cluster-first, route-second” approach that groups the students into district so that each district can own its route satisfying the constraints that exists. The hard constraints encompass bus capacity limits (C), and maximum allowable ride durations (MRT). The primary object of this phase is the minimization of the total distance (or time) traveled (TBD) by the fleet. The minimization of number of school bus may also be included.

In conclusion, the SBRP is crucial to student safety, resource efficiency, and city travel. It also matters in urban planning. The most challenging part about the SBRP is that it has complex and varied situations and multi-objective, aiming to minimize both the total route distance and the number of buses required while ensuring it satisfies diverse constraints. Moreover, the route generation problem is a NP-hard problem, making the pursuit of optimal solutions a terrifying task.

Problem Formulation and Modeling

Mathematical Formulation

Building on the initial section of our article, we focus on bus stop selection and bus route generation from a specific dataset (the bus stops in a certain area located in the city of Toronto). For bus stop selection, we favor the K-Nearest Neighbors method to optimize SWD. With respect to bus routes generation and the plan of requiring a certain number of buses that satisfy the C constraints or the LB (load balancing) optimization, we plan to use metaheuristics methods, specifically population-based evolutionary approaches like genetic algorithm, ant colony and trajectory-based approaches like SA (simulated annealing). We will later use mathematical formulations to quantify the performance of these different approaches and select the best ones.

Below is a list of mathematical formulations that will be used in the future in our project. These formulations are partially based on these literatures: [2], [3]

Sets:

Parameters:

Variables:

Measures:

Model:

Given the conditions mentioned above, we can formulate the following multi-objective nonlinear integer program describing our design problem:

Constraint (1) means that for each student unit, one and only one bus stop is assigned. Constraint (2) means that the number of student units of a particular bus route cannot exceed the bus capacity, and here we make the assumption that a bus can only serve one school, although our problem scenario is multi-school. We will refine this constraint later when our algorithm is complete. Constraint (3) means that for a certain point on a certain route, there are not going to be more than one path for the school bus to go, there is only one next stop. Constraint (4) means that for one certain bus route, there will not be two bus stops that have the same next stop, in other words, the route will not form a ring. Constraints (5), (6) and (7) prevent the formation of subtours using flow variables. Constraints (8), (9) and (10) are integrality restrictions. Constraint (11) is to quantify the safety of a school bus route, we will calculate of a route by adding the traffic accidents every year for every block that is on the route. Constraint (12) is to satisfy the time window of schools if the school is on certain route of school buses.

To provide a practical solution of the SBRP of our chosen area, we have made the following assumptions: 1. We assume that every bus has the same capacity. 2. We aggregate students in a residential area to be a unit, and we will use unit to measure the loading capacity of a school bus, in other words, instead of 40 students, we may use 4 units of students in the sense that each residential area has 10 students. 3. We assume that all school buses start up at the same time.

Possible Modeling Approach

1. Assignment Phase — Bus Stop Selection

In the assignment phase, the key is to determine the optimal locations for bus stops such that students are able to have the least walking distance. One potential technique for this phase is the clustering algorithms, such as K-Nearest Neighbors (KNN). The KNN algorithm, in particular, can assign a student to the nearest cluster center (bus stop), thereby minimizing the walking distance.

2. Routing Phase — Bus Route Generation

After identifying bus stops, the next step in the School Bus Routing Problem (SBRP) is to determine the most optimal bus routes. Metaheuristic algorithms are effective for this purpose due to their global search strategy and ability to find near-optimal solutions efficiently. Here we select two mostly used metaheuristics algorithms by 2019 for this task [4]: Genetic Algorithms (GA) and Ant Colony Optimization (ACO).

a. Genetic Algorithm (GA)

Genetic Algorithms are adaptive heuristic search algorithms based on the principles of natural selection and genetics. They evolve a population of candidate solutions through five phases: initial population, fitness function, selection, crossover, and mutation.

GA in the context of SBRP:

Chromosomes: Represent possible bus route solutions.
Fitness Function: Evaluates the fitness of each chromosome based on specific objective functions.
Selection: Routes with higher fitness are more likely to be selected for the next generation.
Crossover: Merges features of two parent routes to create a new route.
Mutation: Introduces minor changes to routes to maintain diversity and prevent premature convergence.

b. Ant Colony Optimization (ACO)

Ant Colony Optimization mimics the foraging behavior of ants. The fundamental idea is that ants leave pheromone trails on their paths, influencing the path choices of subsequent ants.

ACO in the context of SBRP:

Pheromone Trail: Ants (solutions) deposit pheromones on routes; better routes receive more pheromones.
Pheromone Update: The concentration of pheromones increases as more ants follow a path, making it more attractive.
Evaporation: Prevents premature convergence to suboptimal solutions by allowing pheromones to dissipate over time.
Solution Construction: Ants build new routes based on pheromone concentration and heuristic information.

Related work

Recent research in bus route generation problem solving has diversified into four main categories: classical heuristics, metaheuristics, exact methods, and uncertainty methods [4].

Initially, SBRP primarily employed exact methods like mixed integer programming (MIP) or nonlinear mixed integer programming (NLMIP). Post-2010, there’s been a shift towards metaheuristics due to their balance between solution quality and computational efficiency, particularly useful for larger datasets. In contrast, exact methods, though optimal, struggle with increased computational demands in complex scenarios.

For the related work, we delve deep into three specific metaheuristics combined with this course: GA, ACO, and SA. Classifying these algorithms based on the lectures from our course, they fall into Evolutionary Computing Algorithms, Swarm Intelligence, and Stochastic Search respectively.

Genetic Algorithm (GA)

As one of the most common-used metaheuristic algorithms for SBRP, GA’s strength lies in handling multiple route solutions simultaneously. Samuel A. et al. used GA to initiate and refine route solutions (chromosomes) based on specific objectives, illustrating natural selection principles [5]. Moohong Kang et al. applied GA for bus allocation to student groups and route ordering for buses [6]. Furthermore, incorporating diverse genetic operators into GA has been proven to improve its effectiveness, as indicated by Sayda Ben Sghaier [7].

Ant Colony Optimization (ACO)

ACO’s ability to replicate the dynamic pheromone trail system making it ideal for SBRP. Juan S. [8] applied ACO to SBRP, analyzing changing pheromone trajectories to determine the optimal student pickup and drop-off sequence. Lingmei Huo et al. [9] demonstrated ACO’s stability in SBRP, noting minimal variance between the best and worst solutions under identical parameters.

Simulated Annealing (SA)

SA is advantageous for SBRP due to its probabilistic approach and its flexibility for adaptation. Denis M. et al. [10] advocate using SA to generate a broader spectrum of optimal solutions, thereby offering decision-makers more options. Similarly, Xiaopan Chen et al. [11] employed SA to identify the shortest routes using a lexicographic function on benchmark datasets.

Exploratory Spatial Data Analysis (ESDA):

Dataset Selection:

For the selection of the dataset, we used overpass to obtain the required three sets of data, residential, school and bus stops. Specific reasons for choosing a smaller range at this time include the fact that we need to take into account the time window in which students go to school, so we have temporarily narrowed down the range to be on the safe side to ensure that the number of students is small. If the algorithm performs well in subsequent results, we will gradually expand the range to be considered.

overpass_url = "https://overpass-api.de/api/interpreter"

api = overpass.API()

# Find the data of residence
residential_query = """
[out:json][timeout:50];
area[place=city]["name:en"="Toronto"]->.searchArea;
(
 node["landuse"="residential"](area.searchArea)(around:2500,43.7695, -79.4061);
 way["landuse"="residential"](area.searchArea)(around:2500,43.7695, -79.4061);
 relation["landuse"="residential"](area.searchArea)(around:2500,43.7695, -79.4061);
);
out center;
"""

residence_response = requests.get(overpass_url, params={"data": residential_query})

if residence_response.status_code == 200:
    residential_data = residence_response.json()
else:
    print("Failed:", residence_response.status_code)

# Find the data fo school
school_query = """
[out:json][timeout:50];
area[place=city]["name:en"="Toronto"]->.searchArea;
map_to_area;
(
  node["amenity"="school"](around:1500,43.7695, -79.4061);
  way["amenity"="school"](around:1500,43.7695, -79.4061);
  rel["amenity"="school"](around:1500,43.7695, -79.4061);
);
out center;

"""

school_response = requests.get(overpass_url, params={"data": school_query})
if school_response.status_code == 200:
    schools_data = school_response.json()

# Find the data of Bus Stop
bus_stop_query = """
[out:json][timeout:50];
area[place=city]["name:en"="Toronto"]->.searchArea;
map_to_area;
(
area[place=city]["name:en"="Toronto"]->.searchArea;
node["highway"="bus_stop"](area.searchArea)(around:2500,43.7695, -79.4061);
way["highway"="bus_stop"](area.searchArea)(around:2500,43.7695, -79.4061);
relation["highway"="bus_stop"](area.searchArea)(around:2500,43.7695, -79.4061);
);
out center;
"""

bus_stop_response = requests.get(overpass_url, params={"data": bus_stop_query})
if bus_stop_response.status_code == 200:
    bus_stop_data = bus_stop_response.json()

To visualize the data better, we plotted the following graph using ploty:

Map of Residences, Schools And Bus Stops

Use KNN to assign bus stops and schools

In order to visualize the relationship between students and bus stops more intuitively, we chose to use KNN to assign students to the bus stops closest to their homes. KNN as supervised learning for classification works perfectly for this purpose. In the following graph, each student and the bus stop to which he or she is assigned is represented by a connecting line. And we can see that most of the bus stops are assigned to more than one student.

from sklearn.neighbors import NearestNeighbors

# We used knn to find the nearest bus stop to each residence
knn = NearestNeighbors(n_neighbors=1)
knn.fit(bus_stop_coords[['Latitude', 'Longitude']])
num, num1 = knn.kneighbors(residential_half[['Latitude', 'Longitude']])

# Complete residential_df dataframe
residential_half['Nearest_Bus_Stop'] = num1.flatten()
residential_half['Nearest_Bus_Stop_Lat'] = residential_half['Nearest_Bus_Stop'].apply(lambda x: bus_stop_coords.loc[x, 'Latitude'])
residential_half['Nearest_Bus_Stop_Lon'] = residential_half['Nearest_Bus_Stop'].apply(lambda x: bus_stop_coords.loc[x, 'Longitude'])

import plotly.graph_objects as obj

# Picture showing the relationship between residential and bus stop allocations
fig = obj.Figure()

marker1 = obj.scattermapbox.Marker(size=10, color='blue')
fig.add_trace(obj.Scattermapbox(
    lat=residential_half['Latitude'],
    lon=residential_half['Longitude'],
    mode='markers',
    marker=marker1,
    name='residence'
))

marker2 = obj.scattermapbox.Marker(size=10, color='red')
fig.add_trace(obj.Scattermapbox(
    lat=residential_half['Nearest_Bus_Stop_Lat'],
    lon=residential_half['Nearest_Bus_Stop_Lon'],
    mode='markers',
    marker=marker2,
    name='Bus Stops'
))

for i, j in residential_half.iterrows():
    fig.add_trace(obj.Scattermapbox(
        lat=[j['Latitude'], j['Nearest_Bus_Stop_Lat']],
        lon=[j['Longitude'], j['Nearest_Bus_Stop_Lon']],
        mode='lines',
        line=dict(width=1, color='gray')
    ))

lat1 = residential_half['Latitude'].mean()
lon1 = residential_half['Longitude'].mean()
fig.update_layout(
    mapbox_style="stamen-terrain",
    mapbox_zoom=10,
    mapbox_center={"lat": lat1, "lon": lon1},
    margin={"r":0,"t":0,"l":0,"b":0}
)

fig.show()

Relationship Between Student And Assigned Bus Stops

Similarly, we used KNN to assign a school to each student, due to the fact that we do not have access to each student’s attendance information. To ensure data integrity and rigor, we chose the school closest to home as the student’s school. In the next algorithm section, we will again consider the case where a student is not enrolled in the nearest school.

Relationship Between Student And Assigned School

Bubble Map: In order to better represent the number of students in a given school, we have created a bubble map, which will be very helpful when it comes to how to assign school buses, and how many buses to assign. As can be seen from the bubble chart of the number of student units assigned to each bus stop, by using the knn algorithm, we have achieved a relatively fair distribution.

Bubble Map For Number of Students In a Particular School

Hexagonal binning map: The hexagonal binning map of schools and chosen bus stops show that in the southwest corner of our interested area more routing points are located. If we want to district these areas, we will need higher granularity than the northeast part of our intetested area.

import plotly.figure_factory as ff
# use create_hexbin_mapbox to create Hexbin Mapbox
fig = ff.create_hexbin_mapbox(
    data_frame=result,
    lat=result['Latitude'],
    lon=result['Longitude'],
    nx_hexagon=10,
    opacity=0.5,
    min_count=1,
    labels={'color': 'interested points'},
    color_continuous_scale='turbo'
)

fig.update_layout(
    mapbox_style="stamen-terrain",
    margin={"r": 0, "t": 0, "l": 0, "b": 0},
    mapbox={'center': {'lat': center[0], 'lon': center[1]}, 'zoom': 10}
)
fig.show()

Heat Map: Finally, as we can see from the residential heat map, there are more homes in the southern part of our area, but schools and bus stops are evenly distributed, so it is better for us to plan the route so that it starts in the south and ends in the north. We can further examine the results generated by the algorithm to check if the routes have the above pattern.

import plotly.express as px
center = (43.7695, -79.4061)

# use density_mapbox to create density_mapbox
fig = px.density_mapbox(
    data_frame=residential_half,
    lat=residential_half['Latitude'],
    lon=residential_half['Longitude'],
    radius=5,
)

fig.update_layout(
    mapbox_style="stamen-terrain",
    margin={"r": 0, "t": 0, "l": 0, "b": 0},
    mapbox={'center': {'lat': center[0], 'lon': center[1]}, 'zoom': 10}
)
fig.show()

After doing all the data processing, we summarized all the information we needed. We no longer needed residential areas and bus stops that did not have students waiting, so we deleted all of that data. Now that all the data has been processed, we can move on to the algorithm part.

Proposed Solution

1. Simulated Annealing

The simulated annealing method uses a chance-based approach to the complex school bus routing problem to find the most efficient overall path. At first, the Simulated Annealing Method conducts an extensive search at high temperatures, exploring a variety of route configurations, even those that initially appear less efficient. As the method cools down, it begins to narrow its search, targeting the most promising solutions. After rounds of adjustments and fine-tuning, the ultimate goal is to develop a bus route that subtly reduces travel time or distance, getting as close as possible to the ideal arrangement.

def run_sa_for_school(stop_points):
    distance_matrix = eval_distances_from_stops(stop_points)
    gta_part_tsp = TSP(dists=distance_matrix, gen_method='random_swap')
    sa = SimulatedAnnealing(max_iter=300, max_iter_per_temp=100, initial_temp=85, final_temp=0.0001, cooling_schedule='linear', cooling_alpha=0.9)
    sa.init_annealing(gta_part_tsp)

    start = time.time()
    sa.run(gta_part_tsp, repetition=10)
    end = time.time()

    best_order = sa.s_best
    return best_order, end - start

SA Bus Routing Result for North York Schools

2. Particle Swarm Optimization

Using PSO to solve a binary optimization problem such as school bus routing requires first calculating the distances between all stations using the haversine formula and generating a distance matrix. In addition to the parameters of the algorithm itself, we set the number of iterations and let the PSO algorithm update the positions of the particles (candidate solutions) over multiple iterations to search for the optimal solution.

distance_matrix = defaultdict(dict)
    for ka, va in stop_point.items():
        for kb, vb in stop_point.items():
            distance_matrix[ka][kb] = 0.0 if kb == ka else haversine(va, vb)

distances = pd.DataFrame(distance_matrix)
    city_names = list(distances.columns)
    distance = distances.values

    options = {'w': 0.79, 'c1': 2.05, 'c2': 1.5, 'k': 10, 'p': 2}
    n_cities = len(city_names)
    optimizer = ps.discrete.BinaryPSO(n_particles=50, dimensions=n_cities, options=options)

    cost, solution = optimizer.optimize(tsp_cost, iters=150, verbose=True, distance=distance)

PSO Bus Routing Result for North York Schools

3. Genetic Algorithm

Genetic Algorithm is a heuristic optimization technique widely employed for solving combinatorial optimization problems, including the Traveling Salesman Problem (TSP). TSP aims to find the shortest path for a salesman to visit all cities exactly once. In the context of genetic algorithms, TSP is modeled as the permutation of a chromosome, where each gene represents the order of city visits. Initially, a set of random solutions forms the population, and through simulated evolution via crossover and mutation operations mimicking biological processes, the algorithm progressively refines the path. The fitness function evaluates the quality of each solution, typically represented by the reciprocal of the path length. The algorithm evolves iteratively through selection, crossover, and mutation, converging towards a solution that approximates the optimal TSP path.

# create the problem class
class TravelingSalesman(ElementwiseProblem):

    def __init__(self, cities, distances, **kwargs):
        self.cities = cities
        n_cities = len(cities)
        self.distances = distances

        super().__init__(
            n_var=n_cities,
            n_obj=1,
            xl=0,
            xu=n_cities,
            vtype=int,
            **kwargs
        )

    def _evaluate(self, x, out, *args, **kwargs):
        f = 0
        for i in range(len(x) - 1):
            f += distances[x[i], x[i + 1]]
        f += distances[x[-1], x[0]]
        out["F"] = f
  distance_matrix = create_distance_matrix(cities, haversine)
  # print(distance_matrix)
  distances = pd.DataFrame(distance_matrix)
  city_names=list(distances.columns)
  distances=distances.values
  problem = TravelingSalesman(cities,distance_matrix)

  algorithm = GA(
      pop_size=20,
      sampling=PermutationRandomSampling(),
      mutation=InversionMutation(),
      crossover=OrderCrossover(),
      repair=StartFromZeroRepair(),
      eliminate_duplicates=True
  )

  # set the termination condition: if the result does not improve in 300 generations, then terminate
  termination = DefaultSingleObjectiveTermination(period=300, n_max_gen=np.inf)

  # use pymoo to solve the problem
  start = time.time()
  res = minimize(
      problem,
      algorithm,
      termination,
      seed=1,
      verbose=False
  )

Genetic Bus Routing Result for North York Schools

4. Ant Colony

Ant Colony Algorithm is a heuristic optimization approach employed for addressing combinatorial optimization problems, such as the Traveling Salesman Problem (TSP). In this algorithm, artificial ants traverse the cities, depositing pheromones on their paths. The probability of choosing a particular path is influenced by the pheromone levels, with shorter paths accumulating more pheromones. Over iterations, the algorithm converges as ants tend to favor shorter routes due to higher pheromone concentrations. This process mimics the foraging behavior of real ants, and eventually, the algorithm identifies a path that approximates the optimal solution for the TSP.

ACO Bus Routing Result for North York Schools

5. SOM

Self-Organizing Maps, a type of unsupervised neural network, are employed in the School Bus Routing Problem to effectively cluster bus stop points. By representing each bus stop points belonging to a specific school as a node in the input space, the SOM algorithm organizes these nodes into a two-dimensional grid that approximates the topological structure of the original data. This process groups nearby pick-up points, facilitating the creation of efficient bus routes. Post-clustering, route optimization algorithms can be applied to each cluster, considering constraints such as bus capacity. This methodology enhances route efficiency by minimizing total travel distance and time, thereby optimizing resource utilization and operational costs in school bus routing.

  N_neurons = stop_count*2
  iteration = 2000

  # Create a self-organizing map with 1xN_neurons grid
  som = MiniSom(1, N_neurons, 2, sigma=10, learning_rate=.5,
                neighborhood_function='gaussian', random_seed=50)

  som.random_weights_init(stop_points)
  som.train(stop_points, iteration, verbose=False, random_order=False)

SOM Bus Routing Result for North York Schools

Experiments and Analysis

Quantitative Analysis

1.1 Comparison of path length

We use the answer of the Concorde TSP solver as a benchmark to compare with the other 5 algorithms.Below is a picture of the path lengths of all the algorithms compared with Concorde TSP.

1.2 Comparison of time spent to generate the path

Discovery: From the graph above, we know that the SA algorithm is the most time consuming, SOM is the fastest algorithm, the length of path of SOM is also the shortest, and the longest path was generated by PSO. Therefore, the path length is negatively related to the time spent to generate this path, no matter what kind of algorithm we are using.

If we assume that there are two algorithms, algorithm 1 generates a path with length of 1, spending 1 second, algorithm 2 generates a path with length of 2, spending 0.5 second, and we want to view these two algorithms as equally efficient, then we can define the efficiency of one algorithm as below (equation 1):

where t is the time spent to generate the path, l is the length of the path.

Based on the definition above, we can draw another curve reflecting the efficiency of each algorithm. SOM is the most efficient using this method to calculate the efficiency, followed by ACO, then PSO, GA, SA. However, we know that the length of the path can never reach zero, so our method above may be over-exaggerating the importance of time spent in evaluating the performance of an algorithm, leading to SOM and ACO taking the lead.

We can modify our method by choosing an algorithm that takes the minimum amount of time, but generate the longest path, and for every other algorithm, we calculate the deduction of path length per time unit spent. If we are to calculate the efficiency like this, some algorithms will have a efficiency that is negative, in this particular case, we have the efficiency of PSO a negative number.

Because ACO and SOM uses less time to generate a shorter path than PSO, so the efficiencies of them are negative. In this graph, algorithms with negative efficiency should be ranked before algorithms with positive efficiency.

If we have the way to estimate or to calculate the optimal path length in this particular problem, we can also use the formula below to calculate the efficiency of one algorithm should we care little about the time spent and want to minimize the path length, (equation 2) where lo is the optimal path length.

In this problem specifically, if we let lo take the value of the path length generated by SOM, we will have the graph below.

** Note that in the graph above, SOM’s efficiency is not really 0, instead SOM has the highest efficiency that is almost positive infinity, but in order to provide a good comparison between other algorithms, we set the efficiency of SOM to 0.

2. Qualitative analysis

When evaluating the efficacy of different algorithms for bus route planning, we observe that the majority of bus stops are grouped along busy thoroughfares. This gives the algorithms an optimal chance of locating efficient routes through congested urban traffic networks.

PSO Algorithm : As can be seen from the graph, there are some problems with the routes generated by the PSO algorithm. It tends to create paths with large spans where they are not necessary, which leads to poor results in finding the shortest path. Therefore the routes have room for improvement in terms of logical order and smoothness, such as avoiding unnecessary detours or large detours.

ACO, SOM algorithms: although the ACO and SOM algorithms show better performance than PSO, there is still room for improvement on some routes, e.g. the planning of the pink route for ACO and the purple route generated by the SOM algorithm are somewhat counter-intuitive

Genetic Algorithm and SA Algorithm: these two algorithms showed optimal performance in terms of logical ordering and path smoothness. They did not choose any turnback routes and also avoided sharp corners, which is an efficient treatment for path planning problems.

Adaptation & Tuning

Simulated Annealing

Experimental Goal: Determine which parameters in the SA algorithm have a significant effect (good or bad activity) on optimizing school bus routes and adjust these parameters to improve algorithm performance.

Initial parameters: max_iter=100, max_iter_per_temp=100, initial_temp=85, final_temp=0.0001, repetition=10.

Experimental Procedure:
Step 1: Increase the number of repetitions
Increase repetition from 10 to 50.
Run the SA algorithm five times and take the average result to record.
Step 2: Increase the maximum number of iterations
Increase max_iter from 100 to 300.
Keeping other parameters unchanged, re-run the algorithm again for five times and record the average results.
Step 3: Adjust final_temp
Modify final_temp from 0.0001 to 10.
Run the algorithm and record the results

From these results we can see that step 2 (increasing max_iter) produced the best results, producing the shortest total distance traveled. Compared to the original configuration, the total distance traveled for “Step 2” decreased from 41.8368 to 40.5750, a decrease of about 3.02%. This is the only decrease in total distance observed in all the steps, while all the other steps showed an increase, especially “Step 1” where the total distance traveled increased by about 3.14%. In terms of total time, “Step 2” and “Step 3” showed a decrease of 28.74% and 28.87%, respectively, compared to the original setup, whereas “Step 1” showed a significant increase of about 396.77%. Therefore, if total distance and total time are used as performance indicators, the adjustments of “Step 2” is clearly superior, which shows improvements in both indicators. So in our code, I increased the max_iter to 400 to get the best results.

2. ACO

ACO is one of the most complicated algorithms among the algorithms that we choose, in the sense that it has the largest number of parameters to be tuned. Although we can generate a relatively good result without tunning compared to the other algorithms, there is still space to be optimized.

Tuning method

We plan to generate a grid of parameters, to each parameter we assign different granularities. For example, we increase the number of ants from 10 to 100, with the pace of 10. The parameters and the pace are as this table below.

We first record our result before tuning. Under the parameters set that alpha = 1, beta = 1, num_ants = 10, max_iter = 100, n_best = 1, decay = 0.95. The path length is 36.597852 while the running time is 6.672899s.

Then we use loops to explore every combination of parameters in the table above, with the objective to find the parameters that gives the shortest path length without consuming much time. As defined by the scoring system in the last section. Note that because the results are random, we need to take the average to reflect the efficiency of one combination of parameters.

According to our adaptation of parameters, the combination of alpha = 1, beta = 2, num_ants = 10, max_iter = 100, n_best = 3, decay = 0.9 gives us the best result on average, in the sense of path length and running time, with path length = 35.943484 and running time = 6.684948s.

Below is the visualization of the path generated using this set of parameters.

Discoveries

Our results have shown that if we want to minimize our path length, using bigger beta is the first thing to do. Because bigger beta gives us a chance to explore, rather than relying on the pheromone, which may lead to wrong path that is accidentally passed by one of the ants.
Number of ants is not significantly useful in our case, changing this parameter may make the algorithm slower without any improvement.
max_iter is important, if we want to make the result optimal and stable, we should try to increase this parameter.
n_best and decay are the most random parameters, we cannot find obvious relevance between these two parameters and the result.

3. SOM

As a machine learning algorithm, SOM has great advantages in both running time and total path length. Tuning parameters can make its advantages even greater.

Tuning method

We plan to adjust the parameters of learning rate and sigma for MiniSom model. The tuning values of learning rate is 0.3/0.5/0.7, and the tuning values of sigma is 1/5/10.
Learning Rate (learning_rate): This controls how much the weights are adjusted during each iteration. A higher learning rate leads to faster learning but can overshoot optimal weights.

Sigma: This parameter influences the size of the neighborhood in the SOM. A larger sigma means a broader neighborhood. Initially, a larger sigma is helpful to capture the global structure of the data, and then it can be reduced to refine the map.

Results

As can be seen from the figure above, generally, the smaller the learning rate, the shorter the path; When sigma is 5, the effect is better. In all combinations, when sigma is 5 and the learning rate is 0.3, the total path length is the smallest.

Conclusion

In order to find the optimal path for school buses to pick up and drop off students, we used a total of five approaches to this problem: SA, PSO, GA, ACO, and SOM.GA and ACO offer significant advantages in terms of path searching, augmentation mechanisms, and overcoming the local optimization problem, and thus were assumed to be the best possible strategies at the beginning of the project. However the final results tell us that the SOM algorithm performs best in any of these aspects. This is because the SOM algorithm adjusts the neuron weights by constantly self-organizing itself during the iteration process, which effectively maps the school stops into a low-dimensional space and preserves the topological nature of the geographic locations in the process. This property makes SOM particularly applicable in this scenario as it shows high efficiency in performing multidimensional data analysis, especially in spatial data integration and processing.

There are still many things that can be optimized in our approach, for example, the shortest route is not necessarily the fastest one, it may have more “stop” signs or more traffic lights because it is in a neighborhood, so for the sake of rigor a more reasonable approach would be to set a weight for each parameter that may affect the final result. Therefore, for the sake of rigor, a more reasonable approach would be to set a weight for each parameter that may affect the final result and try to get a more accurate result. In addition to this, we did not take into account the time window of the school, i.e., whether the students could arrive at school on time. Therefore a school bus that picks up too many or too far away students and makes them miss the start of the school day is equally unreasonable, which is another important indicator of whether the bus is qualified or not. In addition, we can increase the size of neighborhoods, increase the number of students, create temporary road closures, and other conditions to complicate road conditions and make the scenarios more realistic. We will continue to optimize these scenarios above in the future to create better algorithms and models.

Source Code:

Google Colaboratory

Edit description

colab.research.google.com

References

[1] Park, J. and Kim, B.-I. (2010). The school bus routing problem: A review. European Journal of Operational Research, 202(2), pp.311–319.

[2] Schittekat, Patrick, Marc Sevaux, and Kenneth Sorensen. “A mathematical formulation for a school bus routing problem.” 2006 international conference on service systems and service management. Vol. 2. IEEE, 2006.

[3] Bowerman, Robert, Brent Hall, and Paul Calamai. “A multi-objective optimization approach to urban school bus routing: Formulation and solution method.” Transportation Research Part A: Policy and Practice 29.2 (1995): 107–123.

[4] Ellegood, W.A., Solomon, S., North, J. and Campbell, J.F. (2019). School bus routing problem: Contemporary trends and research directions. Omega, p.102056

[5] Oluwadare, S.A., Oguntuyi, I.P. and Nwaiwu, J.C. (2018). Solving School Bus Routing Problem using Genetic Algorithm-based Model. International Journal of Intelligent Systems and Applications, 10(3), pp.50–58.

[6] Kang, M., Kim, S.-K., Felan, J.T., Choi, H.R. and Cho, M. (2015). Development of a Genetic Algorithm for the School Bus Routing Problem. International Journal of Software Engineering and Its Applications, 9(5), pp.107–126.

[7] Sayda ben Sghaier, Najeh Ben Guedria and Rafaa Mraihi (2013). Solving School Bus Routing Problem with genetic algorithm.

[8] Arias-Rojas, J.S., José Mondéjar Jiménez and Montoya‐Torres, J.R. (2012). Solving of school bus routing problem by ant colony optimization. Revista EIA, 9(17), pp.193–208.

[9] Huo, L., Yan, G., Fan, B., Wang, H. and Gao Wei-tao (2014). School bus routing problem based on ant colony optimization algorithm.

[10] Manumbu, D.M., Mujuni, E., & Kuznetsov, D. (2014). A Simulated Annealing Algorithm for Solving the School Bus Routing Problem: A Case Study of Dar es Salaam. Computer Engineering and Intelligent Systems, 5, 44–58.

[11] Chen, X., Kong, Y., Dang, L., Hou, Y. and Ye, X. (2015). Exact and Metaheuristic Approaches for a Bi-Objective School Bus Scheduling Problem. PLOS ONE, [online] 10(7), p.e0132600.