Using SingleStore as a Geospatial Database

Akmal Chaudhri
16 min read · Oct 26, 2021


Abstract

SingleStore is a multi-model database system. In addition to Relational data, it supports Key-Value, JSON, Full-Text Search, Geospatial and Time Series. A previous article showed SingleStore’s ability to manage Time Series data.

In this article, we’ll explore Geospatial data. We’ll use data for London Boroughs and data for the London Underground. We’ll use these datasets to perform a series of Geospatial queries to test SingleStore’s ability to work with Geospatial data. We’ll also discuss an extended example using the London Underground data for a practical use case: finding the shortest path between two points in a network. Finally, we’ll create London Underground visualisations using Folium and Streamlit.

The SQL scripts, Python code and notebook files used in this article are available on GitHub.

Introduction

In a previous article, we noted the problems of using Polyglot Persistence for managing diverse data and processing requirements. We also discussed how SingleStore would be an excellent solution for Time Series data, providing business and technical benefits. This article will focus on Geospatial data and how SingleStore can offer a unified approach to storing and querying Alphanumeric and Geospatial data.

To begin with, we need to create a free Managed Service account on the SingleStore website.

The data for London Boroughs can be downloaded from the London Datastore. The file we will use is statistical-gis-boundaries-london.zip, which is 27.34 MB in size. We will need to perform some transformations on this data before we can use it with SingleStore. We will discuss these transformations shortly.

The data for the London Underground can be obtained from Wikimedia. It is available in CSV format as Stations, Routes and Line Definitions. This dataset appears to be widely used but lags behind the latest developments on the London Underground. However, it is sufficient for our needs and could be easily updated in the future.

A version of the London Underground dataset can also be found on GitHub, with an extra column time added to Routes. This will help find the shortest path, which we will discuss later.

An updated set of the London Underground CSV files can be downloaded from the GitHub page for this article.

To summarise, we’ll use:

  1. The zip file from the London Datastore.
  2. The three London Underground CSV files from the GitHub page for this article.

London Boroughs Data

Convert London Boroughs Data

The zip file we downloaded needs to be unzipped. Inside, there will be two folders: ESRI and MapInfo. Inside the ESRI folder, we are interested in the files beginning with London_Borough_Excluding_MHW. There will be various file extensions, as shown in Figure 1.

Figure 1. ESRI Folder.

We need to convert the data in these files to the Well-Known Text (WKT) format for SingleStore. To do this, we can follow the advice on the SingleStore website for Loading Geospatial Data into SingleStore.

The first step is to use the MyGeodata Converter tool. We can drag and drop files or browse files to convert, as shown in Figure 2.

Figure 2. Add Files.

All nine files highlighted in Figure 2 have been added, as shown in Figure 3. Next, we’ll click Continue.

Figure 3. Add Files and Continue.

On the next page, we need to check that the output format is WKT (1), that the Coordinate system is WGS 84 (2) and click Convert now! (3) as shown in Figure 4.

Figure 4. Convert Options.

The conversion result can be downloaded, as shown in Figure 5.

Figure 5. Download Conversion Result.

This will download a zip file that contains a CSV file called London_Borough_Excluding_MHW.csv. The file contains a header row and 33 rows of data. The WKT column holds POLYGON data for 30 boroughs and MULTIPOLYGON data for the remaining three. We need to convert the MULTIPOLYGON rows to POLYGON, which we can do very quickly using GeoPandas.

Create the London Boroughs Database Table

In our SingleStore Managed Service account, let’s use the SQL Editor to create a new database. Call this geo_db, as follows:

CREATE DATABASE IF NOT EXISTS geo_db;

We’ll also create a table, as follows:

USE geo_db;

DROP TABLE IF EXISTS london_boroughs;
CREATE ROWSTORE TABLE IF NOT EXISTS london_boroughs (
    name VARCHAR(32),
    hectares FLOAT,
    geometry GEOGRAPHY,
    centroid GEOGRAPHYPOINT,
    INDEX(geometry)
);

SingleStore can store three main Geospatial types: Polygons, Paths and Points. In the above table, GEOGRAPHY can hold Polygon and Path data. GEOGRAPHYPOINT can hold Point data. In our example, the geometry column will contain the shape of each London Borough and centroid will contain the approximate central point of each borough. As shown above, we can store this Geospatial data alongside other data types, such as VARCHAR and FLOAT.
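For intuition, here is how the WKT strings that SingleStore parses for these types can be assembled in plain Python. This is a sketch only, and the coordinates are invented for illustration:

```python
# Build Well-Known Text (WKT) strings for the three geospatial types.
# The coordinates below are illustrative only, not real borough data.

def wkt_polygon(ring):
    # A WKT POLYGON lists lon/lat pairs, closing back on the first point.
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring + [ring[0]])
    return f"POLYGON(({coords}))"

def wkt_path(points):
    # A path (GEOGRAPHY) can be expressed as a LINESTRING.
    coords = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"LINESTRING({coords})"

def wkt_point(lon, lat):
    # A GEOGRAPHYPOINT is a single POINT.
    return f"POINT({lon} {lat})"

print(wkt_point(-0.1278, 51.5074))  # POINT(-0.1278 51.5074)
```

The converter tool above produces exactly this kind of text, which is why the WKT column can be loaded straight into the GEOGRAPHY and GEOGRAPHYPOINT columns.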

Data Loader for London Boroughs

Let’s now create a new notebook. We’ll call it data_loader_for_london_boroughs.

In a new code cell, let’s add the following code to import several libraries:

import pandas as pd
import geopandas as gpd
import warnings

from shapely import wkt

warnings.filterwarnings("ignore")

Now we’ll read our CSV file:

boroughs_csv_url = "https://raw.githubusercontent.com/VeryFatBoy/singlestore-geospatial-example/refs/heads/main/datasets/London_Borough_Excluding_MHW.csv"

boroughs_df = pd.read_csv(boroughs_csv_url)

We’ll rename and drop some of the columns:

boroughs_df["geometry"] = boroughs_df["WKT"].apply(wkt.loads)
boroughs_df.rename(columns = {"NAME": "name", "HECTARES": "hectares"}, inplace = True)
boroughs_df.drop(
    columns = ["WKT", "GSS_CODE", "NONLD_AREA", "ONS_INNER", "SUB_2009", "SUB_2006"],
    inplace = True
)

We need to convert rows with MULTIPOLYGON to POLYGON, so we’ll convert to a GeoDataFrame and use explode(), as follows:

boroughs_geo_df = gpd.GeoDataFrame(boroughs_df, geometry = "geometry")
boroughs_geo_df = boroughs_geo_df.explode(column = "geometry", index_parts = False)
boroughs_geo_df = boroughs_geo_df.set_crs("EPSG:4326", allow_override = True)

If we check the structure of the DataFrame, we should no longer see any rows with MULTIPOLYGON.

We can plot a map of the London Boroughs, as follows:

map = boroughs_geo_df.plot(column = "hectares", cmap = "OrRd", legend = True)
map.set_axis_off()

This should render the image shown in Figure 7.

Figure 7. London Boroughs.

Since we are rendering a map derived from this dataset, the following attribution must be included:

“Contains National Statistics data © Crown copyright and database right [2015]” and “Contains Ordnance Survey data © Crown copyright and database right [2015]”

We can also add a new column that contains the centroid for each borough:

boroughs_geo_df = boroughs_geo_df.assign(centroid = boroughs_geo_df["geometry"].centroid)
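GeoPandas computes an area-weighted centroid for each shape. As a rough planar sketch of what that calculation does (illustrative only; GeoPandas handles the real work, and spherical coordinates change the numbers slightly), the standard polygon centroid formula looks like this:

```python
# Area-weighted centroid of a simple (non-self-intersecting) polygon ring,
# using the shoelace-based formula. This is a planar approximation; it is
# reasonable for small areas such as a borough.

def polygon_centroid(ring):
    a = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    return cx / (6 * a), cy / (6 * a)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(polygon_centroid(square))  # (0.5, 0.5)
```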

Now two columns (geometry and centroid) contain Geospatial data. These need to be converted back to strings using wkt.dumps so that we can write the data correctly into SingleStore:

boroughs_geo_df["geometry_wkt"] = boroughs_geo_df["geometry"].apply(wkt.dumps)
boroughs_geo_df["centroid_wkt"] = boroughs_geo_df["centroid"].apply(wkt.dumps)
boroughs_geo_df.drop(columns = ["geometry", "centroid"], inplace = True)
boroughs_geo_df.rename(columns = {"geometry_wkt": "geometry", "centroid_wkt": "centroid"}, inplace = True)

And now, we set up the connection to SingleStore:

from sqlalchemy import *

db_connection = create_engine(connection_url)

Finally, we are ready to write the DataFrame to SingleStore:

boroughs_geo_df.to_sql(
    "london_boroughs",
    con = db_connection,
    if_exists = "append",
    index = False,
    chunksize = 1000
)

This will write the DataFrame to the table called london_boroughs in the geo_db database. We can check from SingleStore that the table was successfully populated.

London Underground Data

Create the London Underground Database Tables

Now we need to focus on the London Underground data. In our SingleStore Managed Service account, let’s use the SQL Editor to create several database tables, as follows:

USE geo_db;

DROP TABLE IF EXISTS london_connections;
CREATE ROWSTORE TABLE IF NOT EXISTS london_connections (
    station1 INT,
    station2 INT,
    line INT,
    time INT,
    PRIMARY KEY(station1, station2, line)
);

DROP TABLE IF EXISTS london_lines;
CREATE ROWSTORE TABLE IF NOT EXISTS london_lines (
    line INT PRIMARY KEY,
    name VARCHAR(32),
    colour VARCHAR(8),
    stripe VARCHAR(8)
);

DROP TABLE IF EXISTS london_stations;
CREATE ROWSTORE TABLE IF NOT EXISTS london_stations (
    id INT PRIMARY KEY,
    latitude DOUBLE,
    longitude DOUBLE,
    name VARCHAR(32),
    zone FLOAT,
    total_lines INT,
    rail INT,
    geometry AS GEOGRAPHY_POINT(longitude, latitude) PERSISTED GEOGRAPHYPOINT,
    INDEX(geometry)
);

We have three tables. The london_connections table contains pairs of stations that are connected by a particular line. Later, we’ll use the time column to determine the shortest path.

The london_lines table has a unique identifier for each line and contains information such as the line name and colour.

The london_stations table contains information about each station, such as its latitude and longitude. As we upload the data into this table, SingleStore will create and populate the geometry column for us. This is a Geospatial Point consisting of longitude and latitude. This will be very useful when we want to start asking Geospatial queries. We’ll make use of this feature later.
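In plain Python terms, the persisted column applies GEOGRAPHY_POINT(longitude, latitude) to every row, producing a WKT point. The station values below are invented for illustration:

```python
# What the persisted geometry column computes per row: a WKT point
# built from the longitude and latitude columns. Values are illustrative.
rows = [
    {"name": "Morden", "latitude": 51.4022, "longitude": -0.1948},
    {"name": "Wimbledon", "latitude": 51.4214, "longitude": -0.2064},
]

for row in rows:
    row["geometry"] = f"POINT({row['longitude']} {row['latitude']})"

print(rows[0]["geometry"])  # POINT(-0.1948 51.4022)
```

Because SingleStore maintains this column automatically, we never have to construct these strings ourselves when loading the station CSV.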

Data Loader for London Underground

Since we already have the CSV files in the correct format for each of the three tables, loading the data into SingleStore is easy. Let’s now create a new Python notebook. We’ll call it data_loader_for_london_underground.

In a new code cell, let’s add the following code:

import pandas as pd

connections_csv_url = "https://raw.githubusercontent.com/VeryFatBoy/singlestore-geospatial-example/refs/heads/main/datasets/london_connections.csv"

connections_df = pd.read_csv(connections_csv_url)

This will load the connections data. We’ll repeat this for lines:

lines_csv_url = "https://raw.githubusercontent.com/VeryFatBoy/singlestore-geospatial-example/refs/heads/main/datasets/london_lines.csv"

lines_df = pd.read_csv(lines_csv_url)

and stations:

stations_csv_url = "https://raw.githubusercontent.com/VeryFatBoy/singlestore-geospatial-example/refs/heads/main/datasets/london_stations.csv"

stations_df = pd.read_csv(stations_csv_url)

We’ll drop the display_name column as we don’t require it:

stations_df.drop(columns = ["display_name"], inplace = True)

And now, we’ll set up the connection to SingleStore:

from sqlalchemy import *

db_connection = create_engine(connection_url)

Finally, we are ready to write the DataFrames to SingleStore:

connections_df.to_sql(
    "london_connections",
    con = db_connection,
    if_exists = "append",
    index = False,
    chunksize = 1000
)

This will write the DataFrame to the table called london_connections in the geo_db database. We'll repeat this for lines:

lines_df.to_sql(
    "london_lines",
    con = db_connection,
    if_exists = "append",
    index = False,
    chunksize = 1000
)

and stations:

stations_df.to_sql(
    "london_stations",
    con = db_connection,
    if_exists = "append",
    index = False,
    chunksize = 1000
)

We can check from SingleStore that these tables were successfully populated.

Example Queries

Now that we have built our system, we can run some queries. SingleStore supports a range of very useful functions for working with Geospatial data. Figure 8 shows these functions, and we’ll work through each of these with an example.

Figure 8. Geospatial Functions (Source: SingleStore).

Area (GEOGRAPHY_AREA)

This measures the square metre area of a Polygon.

We can find the area of a London Borough in square metres. In this case, we are using Merton:

SELECT ROUND(GEOGRAPHY_AREA(geometry), 0) AS sqm
FROM london_boroughs
WHERE name = "Merton";

The output should be:

+----------+
| sqm |
+----------+
| 37456562 |
+----------+

Since we also store hectares for each borough, we can convert the result (1 hectare = 10,000 square metres) and compare the two figures; the numbers are close. The match is not perfect because the Polygon stores only a limited number of boundary points, so the calculated area differs slightly. Storing more points would improve the accuracy.
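For intuition, the planar version of this calculation is the shoelace formula over coordinates in metres. This is only a sketch: GEOGRAPHY_AREA works directly on spherical longitude/latitude data, so its numbers will not match a flat-plane calculation exactly.

```python
# Planar shoelace area of a polygon ring given in metres. Illustrative
# intuition only -- GEOGRAPHY_AREA computes the area on a sphere.

def shoelace_area(ring):
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0

# A 2 km x 1 km rectangle: expect 2,000,000 square metres.
rect = [(0, 0), (2000, 0), (2000, 1000), (0, 1000)]
print(shoelace_area(rect))  # 2000000.0
```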

Distance (GEOGRAPHY_DISTANCE)

This measures the shortest distance between two Geospatial objects, in metres. The function uses the standard metric for distance on a sphere.

We can find how far each London Borough is from a particular Borough. In this case, we are using Merton:

SELECT b.name AS neighbour, ROUND(GEOGRAPHY_DISTANCE(a.geometry, b.geometry), 0) AS distance_from_border
FROM london_boroughs a, london_boroughs b
WHERE a.name = "Merton"
ORDER BY distance_from_border
LIMIT 10;

The output should be:

+------------------------+----------------------+
| neighbour | distance_from_border |
+------------------------+----------------------+
| Merton | 0 |
| Sutton | 0 |
| Croydon | 0 |
| Wandsworth | 0 |
| Kingston upon Thames | 0 |
| Lambeth | 0 |
| Richmond upon Thames | 552 |
| Hammersmith and Fulham | 2609 |
| Bromley | 3263 |
| Southwark | 3276 |
+------------------------+----------------------+
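The spherical metric behind this function can be sketched with the haversine formula. This is an approximation for intuition; SingleStore's exact results may differ slightly depending on the Earth radius it assumes.

```python
import math

# Great-circle (haversine) distance in metres between two lat/lon points,
# assuming a spherical Earth of radius 6,371 km.

def haversine_m(lat1, lon1, lat2, lon2, r=6_371_000):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Central London to central Paris: roughly 344 km.
print(round(haversine_m(51.5152, -0.1419, 48.8566, 2.3522)))
```

Note that, as in the query above, the distance between two overlapping or touching polygons is simply zero.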

Length (GEOGRAPHY_LENGTH)

This measures the length of a path. The path could also be the total perimeter of a Polygon. Measurement is in metres.

Here we calculate the perimeter for London Boroughs and order the result by the longest first.

SELECT name, ROUND(GEOGRAPHY_LENGTH(geometry), 0) AS perimeter
FROM london_boroughs
ORDER BY perimeter DESC
LIMIT 5;

The output should be:

+----------------------+-----------+
| name | perimeter |
+----------------------+-----------+
| Bromley | 76001 |
| Richmond upon Thames | 65102 |
| Hillingdon | 63756 |
| Havering | 63412 |
| Hounslow | 58861 |
+----------------------+-----------+
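A path or perimeter length is conceptually the sum of the great-circle lengths of its segments. A minimal sketch, assuming a spherical Earth:

```python
import math

# Path length as the sum of great-circle segment lengths, assuming a
# spherical Earth of radius 6,371 km. Sketch only; SingleStore computes
# the perimeter from the stored Polygon ring in the same spirit.

def seg_m(lat1, lon1, lat2, lon2, r=6_371_000):
    p1, p2, dl = map(math.radians, (lat1, lat2, lon2 - lon1))
    dp = math.radians(lat2 - lat1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def path_length_m(points):
    # points is a list of (lat, lon) pairs forming an open path.
    return sum(seg_m(*a, *b) for a, b in zip(points, points[1:]))

# Two one-degree hops along the equator: about 111 km each.
equator = [(0, 0), (0, 1), (0, 2)]
print(round(path_length_m(equator)))
```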

Contains (GEOGRAPHY_CONTAINS)

This determines if one object is entirely within another object.

In this example, we are trying to find all the London Underground stations within Merton:

SELECT b.name
FROM london_boroughs a, london_stations b
WHERE GEOGRAPHY_CONTAINS(a.geometry, b.geometry) AND a.name = "Merton"
ORDER BY name;

The output should be:

+-----------------+
| name |
+-----------------+
| Colliers Wood |
| Morden |
| South Wimbledon |
| Wimbledon |
| Wimbledon Park |
+-----------------+
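The planar idea behind containment is the classic even-odd (ray casting) test: count how many polygon edges a ray from the point crosses. A sketch using an invented rectangular "borough" (GEOGRAPHY_CONTAINS works on a sphere, so this is illustration only):

```python
# Even-odd (ray casting) point-in-polygon test over (lon, lat) pairs.
# The rectangle below is invented for illustration.

def point_in_polygon(lon, lat, ring):
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > lat) != (y1 > lat):
            # Longitude where this edge crosses the point's latitude.
            x_cross = x0 + (lat - y0) * (x1 - x0) / (y1 - y0)
            if lon < x_cross:
                inside = not inside
    return inside

borough = [(-0.25, 51.38), (-0.15, 51.38), (-0.15, 51.44), (-0.25, 51.44)]
print(point_in_polygon(-0.19, 51.40, borough))  # True
print(point_in_polygon(-0.10, 51.40, borough))  # False
```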

Intersects (GEOGRAPHY_INTERSECTS)

This determines whether there is any overlap between two Geospatial objects.

In this example, we are trying to determine which London Borough Morden station intersects:

SELECT a.name
FROM london_boroughs a, london_stations b
WHERE GEOGRAPHY_INTERSECTS(b.geometry, a.geometry) AND b.name = "Morden";

The output should be:

+--------+
| name |
+--------+
| Merton |
+--------+

Approx. Intersects (APPROX_GEOGRAPHY_INTERSECTS)

This is a fast approximation of the previous function.

SELECT a.name
FROM london_boroughs a, london_stations b
WHERE APPROX_GEOGRAPHY_INTERSECTS(b.geometry, a.geometry) AND b.name = "Morden";

The output should be:

+--------+
| name |
+--------+
| Merton |
+--------+

Within Distance (GEOGRAPHY_WITHIN_DISTANCE)

This determines whether two Geospatial objects are within a certain distance of each other. Measurement is in metres.

In the following example, we try to find any London Underground stations within 100 metres of a centroid.

SELECT a.name
FROM london_stations a, london_boroughs b
WHERE GEOGRAPHY_WITHIN_DISTANCE(a.geometry, b.centroid, 100)
ORDER BY name;

The output should be:

+------------------------+
| name |
+------------------------+
| High Street Kensington |
+------------------------+

Visualisation

Map of the London Underground

In our SingleStore database, we have stored Geospatial data. We can use that data to create visualisations. To begin with, let’s create a graph of the London Underground network.

We’ll start by creating a new Python notebook. We’ll call it shortest_path.

In a new code cell, let’s add the following code to import several libraries:

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
import folium

from folium import plugins
from singlestoredb import notebook as nb

And now, we’ll set up the connection to SingleStore:

from sqlalchemy import *

db_connection = create_engine(connection_url)

We’ll read the data from the three London Underground tables into DataFrames:

connections_df = pd.read_sql(
    "SELECT * FROM london_connections",
    con = db_connection
)

lines_df = pd.read_sql(
    "SELECT * FROM london_lines",
    con = db_connection
)

stations_df = pd.read_sql(
    "SELECT * FROM london_stations",
    con = db_connection
)

Next, we’ll build a graph using NetworkX. The following code was inspired by an example on GitHub. The code creates nodes and edges to represent stations and the connections between them:

graph = nx.Graph()

for station_id, station in stations_df.iterrows():
    graph.add_node(
        station["name"],
        lon = station["longitude"],
        lat = station["latitude"],
        s_id = station["id"]
    )

for connection_id, connection in connections_df.iterrows():
    station1_name = stations_df.loc[stations_df["id"] == connection["station1"], "name"].item()
    station2_name = stations_df.loc[stations_df["id"] == connection["station2"], "name"].item()
    graph.add_edge(
        station1_name,
        station2_name,
        time = connection["time"],
        line = connection["line"]
    )

We can check the number of nodes and edges, as follows:

len(graph.nodes()), len(graph.edges())

The output should be:

(302, 349)

Next, we’ll get the node positions. The following code was inspired by an example on DataCamp.

node_positions = {node[0]: (node[1]["lon"], node[1]["lat"]) for node in graph.nodes(data = True)}

And we can check the values:

dict(list(node_positions.items())[0:5])

The output should be similar to:

{'Becontree': (0.127, 51.5403),
'Blackfriars': (-0.1031, 51.512),
'Bounds Green': (-0.1243, 51.6071),
'Caledonian Road': (-0.1188, 51.5481),
'Canary Wharf': (-0.0209, 51.5051)}

We’ll now get the lines that connect stations:

edge_lines = [edge[2]["line"] for edge in graph.edges(data = True)]

And we can check the values:

edge_lines[0:5]

The output should be similar to:

[4, 4, 4, 3, 10]

From this information, we can find the line colour:

edge_colours = [lines_df.loc[lines_df["line"] == line, "colour"].iloc[0] for line in edge_lines]

And we can check the values:

edge_colours[0:5]

The output should be similar to:

['#00782A', '#00782A', '#00782A', '#FFD300', '#003688']

Now we can create a plot, as follows:

plt.figure(figsize = (12, 12))

nx.draw(
    graph,
    pos = node_positions,
    edge_color = edge_colours,
    node_size = 20,
    node_color = "black",
    width = 3
)

plt.title("Map of the London Underground", size = 20)
plt.show()

This produces the image shown in Figure 9.

Figure 9. Map of the London Underground.

We can also represent the graph as a DataFrame. The following code was inspired by an example on GitHub.

data = []

lons, lats = map(nx.get_node_attributes, [graph, graph], ["lon", "lat"])
lines, times = map(nx.get_edge_attributes, [graph, graph], ["line", "time"])

for edge in list(graph.edges()):
    data.append({
        "station_from": edge[0],
        "lon_from": lons.get(edge[0]),
        "lat_from": lats.get(edge[0]),
        "station_to": edge[1],
        "lon_to": lons.get(edge[1]),
        "lat_to": lats.get(edge[1]),
        "line": lines.get(edge),
        "time": times.get(edge)
    })

network_df = pd.DataFrame(data)

If we now merge this DataFrame with the London Underground lines, we get a complete picture of stations, coordinates and the lines between stations.

network_df = pd.merge(network_df, lines_df, how = "left", on = "line")
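This merge is a left join on the line column. In plain Python terms (with toy values rather than the real tables):

```python
# A left join of edges onto line metadata, keyed on "line".
# Toy values for illustration -- the real data comes from the tables above.
edges = [
    {"station_from": "A", "station_to": "B", "line": 1},
    {"station_from": "B", "station_to": "C", "line": 2},
]
lines = {1: {"name": "Bakerloo", "colour": "#B36305"},
         2: {"name": "Central", "colour": "#E32017"}}

# Rows whose line has no metadata simply keep their original columns,
# mirroring how = "left" in pd.merge.
joined = [{**edge, **lines.get(edge["line"], {})} for edge in edges]
print(joined[0]["colour"])  # #B36305
```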

If we wish, this could now be stored back into SingleStore for future use. We can also visualise this using Folium, as follows:

London = [51.509865, -0.118092]

m = folium.Map(
    location = London,
    zoom_start = 12
)

for i in range(0, len(stations_df)):
    folium.Marker(
        location = [stations_df.iloc[i]["latitude"], stations_df.iloc[i]["longitude"]],
        popup = stations_df.iloc[i]["name"],
    ).add_to(m)

for i in range(0, len(network_df)):
    folium.PolyLine(
        locations = [
            (network_df.iloc[i]["lat_from"], network_df.iloc[i]["lon_from"]),
            (network_df.iloc[i]["lat_to"], network_df.iloc[i]["lon_to"])
        ],
        color = network_df.iloc[i]["colour"],
        weight = 3,
        opacity = 1
    ).add_to(m)

plugins.Fullscreen(
    position = "topright",
    title = "Fullscreen",
    title_cancel = "Exit",
    force_separate_button = True
).add_to(m)

html_content = m._repr_html_()

We’ll save the map to Stage and then we can download it locally:

with nb.stage.open("lu_map.html", "w") as st:
    st.write(html_content)

This produces a map, as shown in Figure 10. We can scroll and zoom the map. When clicked, a marker will show the station name and the lines are coloured according to the London Underground scheme.

Figure 10. Map using Folium.

Shortest Path

We can also put the graph to more practical use: for example, finding the shortest path between two stations.

We can use a built-in feature of NetworkX called shortest_path. Here we are looking to travel from Oxford Circus to Canary Wharf:

shortest_path = nx.shortest_path(graph, "Oxford Circus", "Canary Wharf", weight = "time")

We can check the route:

shortest_path

The output should be:

['Oxford Circus',
'Tottenham Court Road',
'Holborn',
'Chancery Lane',
"St. Paul's",
'Bank',
'Shadwell',
'Wapping',
'Rotherhithe',
'Canada Water',
'Canary Wharf']
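With weight = "time", shortest_path runs Dijkstra's algorithm under the hood. A minimal stdlib sketch on an invented toy network (station names and times are made up):

```python
import heapq

# Dijkstra's algorithm over a weighted adjacency dict -- the idea behind
# nx.shortest_path(..., weight = "time"). Toy graph; times are invented.

def dijkstra(graph, start, goal):
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbour, time in graph.get(node, {}).items():
            if neighbour not in seen:
                heapq.heappush(queue, (cost + time, neighbour, path + [neighbour]))
    return float("inf"), []

toy = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
print(dijkstra(toy, "A", "D"))  # (4, ['A', 'B', 'C', 'D'])
```

This is why the time column in london_connections matters: without a weight, the "shortest" path would simply minimise the number of stops.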

To visualise the route, we can convert it to a DataFrame:

shortest_path_df = pd.DataFrame({"name" : shortest_path})

And then merge it with the stations data so that we can pick up the Geospatial columns:

merged_df = pd.merge(shortest_path_df, stations_df, how = "left", on = "name")

We can now create a map using Folium, as follows:

m = folium.Map()

sw = merged_df[["latitude", "longitude"]].min().values.tolist()
ne = merged_df[["latitude", "longitude"]].max().values.tolist()

m.fit_bounds([sw, ne])

for i in range(0, len(merged_df)):
    folium.Marker(
        location = [merged_df.iloc[i]["latitude"], merged_df.iloc[i]["longitude"]],
        popup = merged_df.iloc[i]["name"],
    ).add_to(m)

points = tuple(zip(merged_df.latitude, merged_df.longitude))

folium.PolyLine(
    points,
    color = "red",
    weight = 3,
    opacity = 1
).add_to(m)

plugins.Fullscreen(
    position = "topright",
    title = "Fullscreen",
    title_cancel = "Exit",
    force_separate_button = True
).add_to(m)

html_content = m._repr_html_()

We’ll save the map to Stage and then we can download it locally:

with nb.stage.open("shortest_route_map.html", "w") as st:
    st.write(html_content)

This produces a map, as shown in Figure 11. We can scroll and zoom the map. When clicked, a marker will show the station name.

Figure 11. Shortest Path using Folium.

Bonus: Streamlit Visualisation

We can use Streamlit to create a small application that allows us to select start and end stations for a journey on the London Underground, and the application will find the shortest path.

Install the Required Software

We need to install the following packages:

streamlit
streamlit-folium
folium
mysqlclient
networkx
pandas
sqlalchemy

These are listed in the requirements.txt file on GitHub, which we can install as follows:

pip install -r requirements.txt

Example Application

Here is the complete code listing for streamlit_app.py:

# streamlit_app.py

import streamlit as st
import folium
import networkx as nx
import pandas as pd
import sqlalchemy
from streamlit_folium import st_folium

# Initialise connection.
conn = st.connection("singlestore", type = "sql")

stmt1 = """
SELECT station1, station2, time
FROM london_connections;
"""
connections_df = conn.query(stmt1)

stmt2 = """
SELECT id, name, latitude, longitude
FROM london_stations
ORDER BY name;
"""
stations_df = conn.query(stmt2)
stations_df.set_index("id", inplace = True)

st.subheader("Shortest Path")
from_name = st.sidebar.selectbox("From", stations_df["name"])
to_name = st.sidebar.selectbox("To", stations_df["name"])

graph = nx.Graph()
graph.add_weighted_edges_from(
    [(stations_df.loc[row["station1"], "name"],
      stations_df.loc[row["station2"], "name"],
      row["time"])
     for _, row in connections_df.iterrows()]
)

shortest_path = nx.shortest_path(graph, from_name, to_name, weight = "time")
shortest_path_df = pd.DataFrame({"name": shortest_path})

merged_df = shortest_path_df.join(stations_df.set_index("name"), on = "name")

initial_location = [merged_df.iloc[0]["latitude"], merged_df.iloc[0]["longitude"]]
m = folium.Map(location = initial_location, zoom_start = 13)

sw = merged_df[["latitude", "longitude"]].min().values.tolist()
ne = merged_df[["latitude", "longitude"]].max().values.tolist()
m.fit_bounds([sw, ne])

for i, row in merged_df.iterrows():
    folium.Marker(
        location = [row["latitude"], row["longitude"]],
        popup = row["name"]
    ).add_to(m)

folium.PolyLine(
    locations = merged_df[["latitude", "longitude"]].values.tolist(),
    color = "red",
    weight = 3,
    opacity = 1
).add_to(m)

st_folium(m)

st.sidebar.write("Your Journey", shortest_path_df)

Create a Secrets file

Our local Streamlit application will read secrets from the file .streamlit/secrets.toml in our application's root directory. We need to create this file as follows:

# .streamlit/secrets.toml

[connections.singlestore]
dialect = "mysql"
host = "<host>"
port = 3306
database = "geo_db"
username = "admin"
password = "<password>"

The <host> and <password> should be replaced with the values obtained from the SingleStore Managed Service when creating a cluster.

Run the Code

We can run the Streamlit application as follows:

streamlit run streamlit_app.py

The output in a web browser should look like Figure 12.

Figure 12. Shortest Path.

Feel free to experiment with the code to suit your needs.

Summary

In this article, we have seen a range of very powerful Geospatial functions supported by SingleStore. Through examples, we have seen these functions working over our Geospatial data. We have also seen how we can create graph structures and query those through various libraries. These libraries, combined with SingleStore, enable modelling and querying of graph structures with ease.

There are several improvements that we could make:

  • The data on the London Underground needs updating. New stations and line extensions have been added to the network recently.
  • We could also add additional transport modes, such as the London Tram Network.
  • We could also add additional connection information about the transport network. For example, some stations may not be directly connected but may be within a short walking distance.
  • Our visualisation of the various Underground lines could also be improved since any route served by multiple lines only shows one of the lines.
  • The shortest path is calculated on static data. It would be beneficial to extend our code to include real-time updates to the transport network to allow for delays.

Acknowledgements

This article would not have been possible without the examples provided by other authors and developers.

There is a wonderful quote attributed to Sir Isaac Newton:

If I have seen further it is by standing on the shoulders of Giants.
