System Design Interview: PathFinder, Apple Maps, Google Maps, Navigation
PathFinder is a mobile app that offers navigation using maps, live traffic updates, and step-by-step guidance. Let’s explore how its backend system is designed to support these features efficiently and at scale.
Requirements and Design Scope
Before initiating the design process, let’s outline the scope of the app:
- The app targets 1 billion daily active users (DAUs).
- The app features location tracking, navigation, estimated time of arrival (ETA) calculation, and map display.
- The app supports diverse transportation options such as driving, walking, and public transit.
- For simplicity, features like multi-stop directions and business photos are excluded from the current scope.
- The app relies on terabytes (TBs) of road data collected from diverse sources.
Non-functional Requirements
- Navigation accuracy is critical; users must not be misled with incorrect routing information.
- Map rendering should be seamless to deliver a high-quality navigation experience.
- Considering mobile device constraints, the app should be lightweight and optimized for minimal data usage and power consumption.
- The system should ensure high availability and scalability.
Basic Concepts and Terminology
Geographical Coordinates
The Earth is a rotating sphere, with the North Pole at the top and the South Pole at the bottom.
Latitude (Lat) measures how far north or south a point is from the equator, while Longitude (Long) indicates how far east or west it is.
Map Projection
Map projection is the process of translating the Earth’s curved surface onto a flat plane. Since a 3D sphere cannot be represented on a 2D map without distortion, every projection has limitations.
We will employ the Web Mercator projection — a simplified variant of the Mercator method, also used by Google Maps — which preserves angles and local shapes, making it well-suited for navigation. While it distorts areas near the poles, its computational efficiency makes it ideal for interactive online maps.
Tile-Based Map Rendering
To efficiently render maps, the world is divided into smaller tiles instead of a single large image. Each zoom level corresponds to a specific set of tiles, allowing the map to display the appropriate level of detail while conserving bandwidth. The client loads and assembles only the relevant tiles for the visible area. Lower zoom levels use fewer, larger tiles to avoid unnecessary detail, while higher zoom levels load more detailed tiles for finer granularity.
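For illustration, here is a minimal Python sketch (not part of PathFinder's actual codebase) of the standard Web Mercator tile math: given a latitude, longitude, and zoom level, it returns the x/y index of the 256×256 tile the client should request.

import math

def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    # At zoom level z, the Web Mercator world is a 2^z x 2^z grid of tiles.
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Example: the tile covering central London at zoom level 15.
print(lat_lon_to_tile(51.5074, -0.1278, 15))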
Geocoding
Geocoding maps an address like “221B Baker Street, London” into geographic coordinates — for example, (51.5237° N, 0.1586° W). Conversely, reverse geocoding yields the address corresponding to a given latitude and longitude pair.
Interpolation is a common geocoding technique that estimates the location of addresses lacking precise coordinates by referencing nearby known points. It uses data from sources like Geographic Information Systems (GIS) that model streets and address layouts, enabling faster geocoding with reasonable accuracy.
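As a rough illustration, the sketch below shows how linear interpolation places a house number along a road segment whose endpoint coordinates and address range are known; the GIS segment data used here is hypothetical.

def interpolate_address(house_number, range_start, range_end, start_coord, end_coord):
    # Place the house proportionally along the segment, based on where its
    # number falls within the segment's known address range.
    fraction = (house_number - range_start) / (range_end - range_start)
    lat = start_coord[0] + fraction * (end_coord[0] - start_coord[0])
    lon = start_coord[1] + fraction * (end_coord[1] - start_coord[1])
    return lat, lon

# Hypothetical GIS record: a Baker Street segment covering numbers 201-299.
print(interpolate_address(221, 201, 299, (51.5220, -0.1570), (51.5260, -0.1600)))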
Geohashing
Geohashing encodes geographic coordinates into short alphanumeric strings, each representing a rectangular area on the Earth’s surface. It works by recursively dividing the globe into grids, alternating splits by latitude and longitude, and encoding each step using Base32.
The length of the geohash determines the spatial resolution — longer hashes offer finer precision. A key feature is that nearby points often share prefixes, making geohashing efficient for spatial indexing, proximity searches, and geo-based sharding.
We will use geohashing for map tiling, where each hash acts as a unique tile identifier. This allows the system to load only relevant tiles based on the current viewport and zoom level, and to cluster nearby locations under shared prefixes for faster lookup, caching, and rendering.
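To make the encoding concrete, here is a minimal Python sketch of the classic geohash algorithm; it is a simplified reference implementation, not PathFinder's production code.

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=6):
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, split_lon = [], 0, 0, True
    while len(chars) < precision:
        rng = lon_range if split_lon else lat_range
        value = lon if split_lon else lat
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid
        else:
            bits = bits << 1
            rng[1] = mid
        split_lon = not split_lon          # alternate longitude/latitude splits
        bit_count += 1
        if bit_count == 5:                 # every 5 bits map to one Base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

# Nearby points share a prefix: both central-London hashes start with "gcp".
print(geohash_encode(51.5237, -0.1586))   # near 221B Baker Street
print(geohash_encode(51.5034, -0.1276))   # near 10 Downing Street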
Road Network Modeling
Routing algorithms often build on Dijkstra’s algorithm or A* pathfinding, working on graphs where roads are edges and intersections are nodes. Since performance degrades with graph size, modeling the entire world as one graph is memory-intensive and inefficient.
We will divide the map into smaller routing tiles using geohashing. Each tile contains binary data representing the localized graph of roads and intersections within its bounds. Tiles maintain references to adjacent tiles, allowing the algorithm to dynamically piece together the larger road network during traversal. Loading tiles on demand reduces memory usage and improves performance.
Efficient navigation requires routing data at appropriate levels of detail, so routing tiles are organized hierarchically into three levels: local tiles capture neighborhood streets, regional tiles focus on arterial roads connecting districts, and national tiles cover major highways linking cities and states. This hierarchy enables scalable and efficient routing across varying distances.
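The snippet below sketches one plausible in-memory shape of a routing tile after deserialization; the field names and types are illustrative rather than a prescribed format.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class RoutingTile:
    geohash: str                                                                  # tile identifier, e.g. "gcpv"
    level: str                                                                    # "local", "regional", or "national"
    adjacency: Dict[int, List[Tuple[int, float]]] = field(default_factory=dict)   # node -> [(neighbor, cost), ...]
    neighbor_tiles: List[str] = field(default_factory=list)                       # geohashes of adjacent tiles

# A tiny local tile: two intersections joined by a 12.5-second road segment.
tile = RoutingTile("gcpv", "local", {101: [(102, 12.5)], 102: [(101, 12.5)]}, ["gcpu", "gcpy"])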
Back-of-the-envelope Estimation
With a foundation in place, let us perform a rough estimation that accounts for data usage and battery consumption on mobile devices.
Storage Requirements
We need to account for three main categories of data:
- Global Map Data: A detailed estimation follows below.
- Metadata: Since metadata per tile is minimal, we can safely ignore it in our calculations.
- Road Network Data: Since external sources provide terabytes of road data, the converted routing tiles are also expected to consume terabytes of storage.
Estimating Map Tile Storage Requirements
As discussed earlier, map tiles are generated for each zoom level. Growth in tile count with increasing zoom levels is demonstrated below.
Zoom Level | Tile Count
0 | 1
1 | 4
2 | 16
3 | 64
4 | 256
5 | 1024
6 | 4096
7 | 16384
8 | 65536
9 | 262144
10 | 1048576
11 | 4194304
12 | 16777216
13 | 67108864
14 | 268435456
15 | 1073741824
16 | 4294967296
17 | 17179869184
18 | 68719476736
19 | 274877906944
20 | 1099511627776
21 | 4398046511104

To estimate total storage, we begin with zoom level 21 — the highest and most detailed — which contains approximately 4.4 trillion tiles. Assuming each tile is a 256×256 pixel compressed PNG image averaging 100 KB, this level alone would require about 4.4 trillion × 100 KB = 440 petabytes (PB) of storage.
Given that 90% of the planet consists of oceans, deserts, mountains, and other uninhabited regions — which don’t require detailed imagery and are highly compressible — we can reduce the storage estimate for zoom level 21 by 90%, resulting in about 44 PB. For simplicity, we round this to 50 PB.
Next, we estimate the contribution of lower zoom levels. Each level down reduces tile count by a factor of 4, both north-south and east-west. This forms a geometric series:
50 + 50/4 + 50/16 + 50/64 + …, which converges to approximately 67 PB.
Thus, storing the full set of map tiles would require somewhere between 70 and 100 petabytes, depending on compression strategies and coverage assumptions.
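The arithmetic behind these figures can be double-checked with a few lines of Python (purely a back-of-the-envelope script):

tiles_level_21 = 4 ** 21                        # ~4.4 trillion tiles at the deepest zoom level
raw_pb = tiles_level_21 * 100 / 1e12            # 100 KB per tile, 1 PB = 10^12 KB -> ~440 PB
after_pruning_pb = raw_pb * 0.10                # keep ~10% after skipping oceans, deserts, etc. -> ~44 PB
all_levels_pb = 50 / (1 - 1 / 4)                # geometric series 50 + 50/4 + 50/16 + ... -> ~67 PB
print(raw_pb, after_pruning_pb, all_levels_pb)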
Estimating Server Load for Location Updates
The system will handle two types of client requests:
- Navigation session requests, sent when a user starts a route.
- Location updates, transmitted during navigation to track user movement and enable features like live traffic and ETA calculations.
Assuming 1 billion daily active users, each using navigation for 35 minutes per week, total usage amounts to 35 billion navigation minutes per week — or about 5 billion minutes of active navigation per day.
If each client sent GPS updates every second, that would result in:
5 billion minutes × 60 seconds = 300 billion updates per day.
Dividing this by 86,400 seconds in a day yields approximately 3.47 million QPS (queries per second), which we round to 3 million for simplicity.
To reduce server load, updates can be batched on the client and sent less frequently (e.g., every 15 seconds). This lowers the average QPS to:
3 million ÷ 15 = 200,000 QPS.
Assuming a 5× peak load, the system should be designed to handle up to 1 million QPS (200,000 × 5) during peak usage.
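The same estimate expressed as a short Python back-of-the-envelope script:

daily_nav_minutes = 1_000_000_000 * 35 / 7      # 1B DAUs x 35 min/week -> ~5 billion minutes/day
updates_per_day = daily_nav_minutes * 60        # one GPS sample per second -> 300 billion/day
raw_qps = updates_per_day / 86_400              # ~3.47 million QPS, rounded down to 3 million
batched_qps = 3_000_000 / 15                    # one batched request every 15 seconds -> 200,000 QPS
peak_qps = batched_qps * 5                      # 5x peak factor -> 1 million QPS
print(raw_qps, batched_qps, peak_qps)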
Design Components
Next, let’s focus on the core design components: Routing Tile Processing Service, Location Service, Navigation Service, and Map Rendering. The following sections cover each component in detail.
Routing Tile Processing Service
The initial road dataset, sourced from various authorities, spans terabytes of data and is continuously updated with user location updates. Although it includes detailed road coverage and metadata such as names, counties, and geographic coordinates, it isn’t structured as a graph and cannot be used directly by routing algorithms.
To make it usable, we run an offline processing pipeline — called the routing tile processing service — that converts this raw dataset into routing tiles. The service runs at regular intervals to reflect the latest road changes.
The routing tile processing service produces a set of routing tiles, organized at three different resolutions (as outlined in the Road Network Modeling section). Each tile encodes a subgraph containing nodes and edges that represent intersections and road segments within its geographic area. Tiles also include references to neighboring tiles. Combined, these tiles form an interconnected road network that routing algorithms can traverse incrementally.
Efficient storage of routing tiles is critical. Although graph data is often represented as adjacency lists in memory, retaining all tiles in memory isn’t feasible at this scale. While a database could be used to persist them, it would act merely as a passive storage layer — introducing unnecessary complexity and cost, especially since no database features are required.
A better alternative is to serialize each tile — structured as an adjacency list — into a compact binary format and store it in object storage such as Amazon S3. These tiles are then extensively cached on the machines running the routing service to ensure low-latency access. Organizing them by geohash enables fast, coordinate-based retrieval.
In the Navigation Service section, we describe how the shortest path service consumes these routing tiles to compute routes.
Location Service
The location service records user location updates. Clients collect location data every t seconds (where t is configurable, typically 1 second) and transmit it to the server in batches — for example, every 15 seconds — to reduce network traffic. Communication is efficiently handled using HTTP with keep-alive. Clients send updates via a POST /v1/locations request containing a JSON array of {latitude, longitude, timestamp} objects.
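An example batched request, following the format described above (the coordinate and timestamp values are illustrative):

POST /v1/locations
Content-Type: application/json

[
  {"latitude": 51.5034, "longitude": -0.1276, "timestamp": 1654005040},
  {"latitude": 51.5041, "longitude": -0.1280, "timestamp": 1654005041},
  {"latitude": 51.5049, "longitude": -0.1285, "timestamp": 1654005042}
]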
These periodic updates are useful both in real time, for better ETA predictions and traffic rerouting, and in the long term, for traffic monitoring, detecting road changes, and personalization.
Given the high write volume from millions of users, the User Location Database must be scalable, write-optimized, and highly available under heavy load. We use Cassandra, a NoSQL store that distributes data across multiple nodes to eliminate single points of failure and ensure consistent performance at scale. Cassandra also supports efficient time-series data modeling, allowing each user’s location records to be grouped together and stored in chronological order.
The primary key in our schema is defined as (user_id, timestamp). The user_id is the partition key, which groups all of a user’s location records together so they can be retrieved quickly. Within each partition, the timestamp serves as the clustering column, ensuring records are stored in order by time. This structure allows efficient access to a user’s locations over any time interval.
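One possible table definition reflecting this schema is sketched below using the Python cassandra-driver; the contact point and keyspace names are placeholders, and the column names follow the sample record shown next.

from cassandra.cluster import Cluster

cluster = Cluster(["cassandra-node-1"])          # placeholder contact point
session = cluster.connect("pathfinder")          # placeholder keyspace
session.execute("""
    CREATE TABLE IF NOT EXISTS user_locations (
        user_id         bigint,
        timestamp       bigint,
        lat             double,
        long            double,
        user_mode       text,
        navigation_mode text,
        PRIMARY KEY (user_id, timestamp)         -- partition key + clustering column
    ) WITH CLUSTERING ORDER BY (timestamp ASC)
""")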
The following table demonstrates a sample record for a user.
user_id | timestamp | lat | long | user_mode | navigation_mode
82 | 165400504 | 34.105 | -118.320 | active | driving

User location data plays a crucial role in improving the overall navigation experience. It helps identify changes in the road network, enhances map accuracy, and supports live traffic updates. Therefore, we also publish location updates to the message queue Kafka — a unified, high-throughput, low-latency platform for real-time data streaming.
Various downstream services consume this data stream. The traffic update service processes location feeds to keep the traffic database current, while the previously discussed routing tile processing service analyzes the data to detect road changes and refresh routing tiles. The machine learning service uses them to personalize recommendations and update the personalization database. Finally, the analytics pipeline ingests the stream to populate the analytics database with insights about usage patterns and mobility trends. Other applications can also tap into the stream to power additional features.
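As an illustration of how a downstream service might consume this stream, here is a small sketch using the kafka-python client; the topic name, broker address, and message fields are assumptions, not PathFinder specifics.

import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "location-updates",                                   # assumed topic name
    bootstrap_servers=["kafka-broker:9092"],              # placeholder broker address
    group_id="traffic-update-service",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    update = message.value                                # e.g. {"user_id": ..., "latitude": ..., "longitude": ..., "timestamp": ...}
    # Aggregate per-road-segment speeds here and refresh the live traffic database.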
Navigation Service
The navigation service determines a dependable route from origin to destination. We prioritize reliability over selecting the absolute fastest path.
Users initiate navigation by sending an HTTP request through a load balancer to the navigation backend. The request includes parameters such as origin and destination. For example:
GET /v1/nav?origin=10+downing+street,London&destination=Heathrow+Airport

The service responds with a structured result that includes distance, duration, start and end coordinates, and step-by-step directions. It may also include metadata such as encoded polylines for map rendering and geocoded waypoints for precise location mapping.
Here’s a simplified response example:
{
"distance": {"text": "15.6 mi", "value": 25106},
"duration": {"text": "38 mins", "value": 2280},
"start_location": {"lat": 51.5034, "lng": -0.1276},
"end_location": {"lat": 51.4700, "lng": -0.4543},
"html_instructions": "Take A4 west toward Heathrow Airport",
"polyline": {"points": "sdlkFv`hfMeBfQsDbUcKnU"},
"travel_mode": "DRIVING"
}

Components of the navigation service are demonstrated in the diagram below.
Let’s take a closer look at each component.
Geocoding Service
The geocoding service translates user-provided addresses — whether full street addresses or place names — into precise geographic coordinates in latitude and longitude. This step is essential because the navigation system relies on numerical coordinates rather than free-form text to calculate routes. Before generating directions, the navigation service calls the geocoding service to convert both the origin and destination into standardized coordinate pairs.
The service relies on a dedicated geocoding database to perform these lookups efficiently. Since queries are frequent and updates are rare, we use Redis — a fast, in-memory key-value store — to ensure low-latency access. By resolving location names into coordinates quickly, the geocoding service enables the route planner to compute directions with minimal delay.
Route Planner Service
The route planner service determines the most efficient path to a destination, optimizing for travel time by factoring in live traffic and road status. It also coordinates with additional services, which are described in the following sections.
Shortest Path Service
The shortest path service takes the origin and destination as latitude/longitude pairs and generates the top k shortest paths, ignoring live traffic and road closures. Because this calculation depends only on the underlying road network, which changes infrequently, caching results can be highly effective.
It uses a variation of the A* pathfinding algorithm on the road network graph built from routing tiles. The algorithm begins by converting the origin and destination coordinates into geohashes to identify the relevant tiles to load. It then traverses the graph from the origin tile toward the destination, dynamically loading adjacent or higher-level tiles as needed, either from object storage or the local cache, until it determines the set of optimal routes.
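A highly simplified sketch of this idea is shown below: an A*-style search in Python that loads routing tiles lazily as the frontier expands. The tile and node representations, and the tile_of/load_tile callbacks, are assumptions made for illustration; a production implementation would rely on far more sophisticated preprocessing.

import heapq
import math

def haversine_km(a, b):
    # Great-circle distance between two (lat, lon) points; used as the A* heuristic.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = math.sin((lat2 - lat1) / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(h))

def shortest_path(origin, destination, node_coords, tile_of, load_tile):
    # node_coords: node_id -> (lat, lon)
    # tile_of(node_id) -> geohash of the tile containing that node
    # load_tile(geohash) -> {node_id: [(neighbor_id, cost_km), ...]}, from cache or object storage
    loaded = {}                                            # tiles fetched so far
    def neighbors(node):
        gh = tile_of(node)
        if gh not in loaded:
            loaded[gh] = load_tile(gh)                     # fetch the tile only when first needed
        return loaded[gh].get(node, [])

    frontier = [(0.0, 0.0, origin, [origin])]              # (f = g + h, g, node, path)
    best_g = {origin: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == destination:
            return path, g
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = haversine_km(node_coords[nxt], node_coords[destination])
                heapq.heappush(frontier, (new_g + h, new_g, nxt, path + [nxt]))
    return None, float("inf")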
ETA Service
When the route planner obtains the set of candidate shortest paths, it queries the ETA service to estimate travel times for each route. The ETA service uses machine learning models that combine live traffic conditions with historical trends to predict arrival times.
A significant challenge is not only assessing real-time traffic but also projecting how it will evolve over the next 10-20 minutes as the journey unfolds. Addressing these predictive challenges requires sophisticated algorithms, which are beyond the scope of this discussion.
Ranker Service
Once the route planner receives the ETA predictions, it forwards this information to the ranker service, which applies any user-defined filters — such as avoiding toll roads or freeways, preferring scenic or eco-friendly routes, or bypassing congestion zones. The ranker then orders the routes from fastest to slowest and forwards the top k options to the navigation service.
Updater Services
As mentioned earlier, the location service publishes user location updates to Kafka. The traffic update service and the routing tile processing service consume this stream to asynchronously refresh key data stores, including the traffic database and the routing tiles in object storage.
The routing tile processing service updates routing tiles with recent road changes, allowing the shortest path service to calculate routes that reflect the latest conditions. Meanwhile, the traffic update service captures real-time traffic patterns in the live traffic database, enabling the ETA service to deliver accurate arrival predictions.
Adaptive ETA and Rerouting
To support adaptive ETA updates and rerouting, the system must continuously monitor active navigation sessions and respond promptly to traffic changes. A key challenge is efficiently identifying which users are affected when congestion or incidents occur on specific road segments.
A simple but unscalable approach is to store each user’s full route as a list of routing tiles and scan them to detect overlap with problem areas.
A more efficient strategy is to model each active user’s route as a hierarchy of nested routing tiles instead of a flat list. For every user, we track their current routing tile, the higher-level tile that contains it, and continue recursively until their destination is also included. To determine if a user is impacted by a traffic event, we check whether the problematic tile lies within the outermost tile stored for that user. If it doesn’t, the user is unaffected.
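Below is a simplified sketch of this containment check, exploiting the fact that geohash prefixes encode tile nesting; the session data is hypothetical.

def affected_users(problem_tile, active_sessions):
    # active_sessions: user_id -> nested routing-tile geohashes for that user's
    # route, ordered from the current local tile up to the outermost tile that
    # also contains the destination, e.g. ["gcpvj", "gcpv", "gcp"].
    impacted = []
    for user_id, tile_hierarchy in active_sessions.items():
        outermost = tile_hierarchy[-1]
        # The event can only affect this user if the problematic tile lies
        # within the user's outermost tile, i.e. shares its geohash prefix.
        if problem_tile.startswith(outermost):
            impacted.append(user_id)
    return impacted

# Congestion reported in tile "gcpvj3": only the first user's route overlaps it.
sessions = {"user_a": ["gcpvj", "gcpv", "gcp"], "user_b": ["u10hb", "u10h", "u10"]}
print(affected_users("gcpvj3", sessions))   # -> ["user_a"]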
To ensure users follow the most efficient route, the system continuously monitors all viable paths, reassesses traffic conditions for blockages or clearances, recalculates ETAs, and notifies users accordingly.
Communication Protocol
As navigation conditions evolve — due to traffic incidents, road closures, or ETA changes — the system must deliver real-time updates to clients. To achieve this, we will use WebSockets, which offer a lightweight, persistent, bidirectional connection ensuring timely communication and a responsive, dynamic navigation experience.
Map Rendering
The map rendering service is responsible for delivering detailed, responsive map imagery to users at any location and zoom level. Storing all possible tiles locally would require over a hundred petabytes of data — far exceeding the capacity of client devices. Tiles must therefore be fetched on demand from servers, based on the user’s location and zoom level. The client loads new tiles whenever the user zooms, pans to explore nearby areas, or moves into a different tile during navigation. Given the sheer volume of data, serving these tiles efficiently is critical.
One option is for the server to generate map tiles in real time. However, the endless combinations of positions and zoom levels impose heavy server load and make caching largely ineffective.
A more scalable strategy is to precompute tiles for all zoom levels and store them as static images. Each tile corresponds to a fixed rectangular grid cell identified by a geohash. To provide the appropriate level of detail for each viewport, we precompute tiles across zoom levels 0 through 21, similar to Google Maps. Level 0 is the most zoomed-out view, where the entire world fits in a single 256×256 pixel tile. With each increase in zoom level, the number of tiles doubles in both dimensions, yielding four times as many tiles overall, each still 256×256 pixels. For example, zoom level 1 uses a 2×2 grid of tiles for a combined resolution of 512×512 pixels, and zoom level 2 expands to a 4×4 grid totaling 1024×1024 pixels. Each increment therefore quadruples the total pixel count, providing progressively finer detail and letting the client render the map at the appropriate granularity for every zoom level.
When a client needs a tile, it sends its location and zoom level to the map tile service, which returns URLs for the required tiles, including adjacent ones for smoother panning. The request passes through a load balancer, and the client downloads the tiles directly from a CDN backed by Amazon S3 cloud storage.
Map tiles are distributed via a content delivery network (CDN) — a globally distributed system of servers that cache and deliver content closer to users. The first time a tile is requested, the CDN retrieves it from the origin server and stores a copy locally. Subsequent requests, regardless of the user, are served directly from the cache, reducing latency and minimizing load on the origin. Because tiles are delivered from the nearest point of presence (POP), this approach achieves excellent scalability and performance.
Since minimizing mobile data usage is a priority, let’s estimate how much data the client consumes during a typical navigation session. Keep in mind that this calculation does not factor in client-side caching, which would significantly reduce data usage over time if users frequently travel the same routes.
Data Usage
Assume a user is moving at 30 km/h and viewing the map at a zoom level where each tile image represents a 200 m × 200 m area on the ground. Each tile is a 256×256 pixel image with an average file size of 100 KB. To cover a 1 km × 1 km section, the client needs 5 tiles across (since 1 km = 1000 m = 5 × 200 m) and 5 tiles down, totaling 25 tiles. Loading this area requires about 2.5 megabytes of data (25 tiles × 100 KB).
Because the user travels 30 km/h and each kilometer requires 2.5 MB of tiles, the total data usage over an hour is 75 MB (30 km × 2.5 MB/km), which is approximately 1.25 MB per minute (75 MB ÷ 60 minutes ≈ 1.25 MB/min).
Now that we’ve estimated the data consumption for a single user, let’s scale up to understand the aggregate load handled by the CDN.
Traffic Through CDN
As previously mentioned, we serve approximately 5 billion minutes of navigation per day. At an estimated 1.25 MB of map data per minute, this results in 6.25 billion megabytes of data delivered daily (5 billion × 1.25 MB).
Distributed over 86,400 seconds in a day, this amounts to about 72,300 MB of map data served every second (6,250,000,000 MB ÷ 86,400 ≈ 72,300 MB/s).
With map images delivered via 200 CDN Points of Presence (POPs) around the world, each POP would handle approximately 360 MB per second (72,300 MB ÷ 200).
Improving Map Delivery with Vector Tiles
While precomputed raster tiles are widely used for their simplicity and CDN cacheability, an alternative approach is to deliver vector tiles to clients. Instead of transmitting images, the system sends compact vector data — such as paths, points, and styling metadata — that the client renders dynamically using technologies like WebGL.
This approach offers several benefits. Vector data compresses far more efficiently than raster images, reducing bandwidth usage and improving load times, especially on mobile networks. It also enables smoother zooming and panning, as map elements can scale seamlessly without pixelation.
Although rendering vector tiles requires additional processing on the client side, the tradeoff is worthwhile: visual quality improves, and CDN delivery costs are significantly reduced.
Design Recap
This blog walked through the backend design of PathFinder, a navigation system built for global scale and real-time responsiveness. We explored how core features — such as route planning, ETA calculation, adaptive rerouting, and efficient map rendering — are implemented using modular services. Behind the scenes, components like the routing tile processing service and the traffic update service continuously ingest user location data to keep routing tiles and traffic conditions up to date.
We also estimated system load for a billion users, analyzed CDN-backed map delivery, and explored optimizations like vector tiles to improve performance and bandwidth efficiency. These design elements work together to provide a fast, adaptive, and reliable navigation experience.
While production systems like Google Maps are significantly more complex, the concepts outlined here offer a strong foundation for building large-scale map-based applications.