Turning a Metro System into Nodes and Edges

Pengtong Yang
INST414: Data Science Techniques
Feb 25, 2022

After learning about networks and graphs, I wondered how I could apply them to a real-world problem. One way I came up with is to look at how the structure of a metro system, in terms of the number of stations and paths, has changed from year to year. Specifically, I compare the metro system in the form of networks with nodes and edges. The data sources are sites about metro systems and about the Athens metro system in 2020. In this assignment, I used NetworkX to facilitate my network analysis, and I compare the numbers of nodes and edges of the Athens metro system in 2008 and 2020. The libraries I used are networkx and pandas. To create the main graphics, I used the draw_shell function in the NetworkX library along with g.add_node and g.add_edge.
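Here is a minimal sketch of the build-and-draw workflow described above. The four stations and their connections are made up for illustration and are not the real Athens data; the helper names build_toy_metro and draw_metro and the output file name are also my own assumptions.

```python
import networkx as nx


def build_toy_metro():
    """Build a small undirected graph: stations are nodes, paths are edges."""
    g = nx.Graph()
    # Hypothetical stations 0..3 (not real Athens stations)
    for station in range(4):
        g.add_node(station)
    # Hypothetical paths: a triangle (0-1-2) plus one branch (2-3)
    g.add_edge(0, 1)
    g.add_edge(1, 2)
    g.add_edge(2, 0)
    g.add_edge(2, 3)
    return g


def draw_metro(g, out_path):
    """Draw the graph with a shell layout and save it to an image file."""
    # matplotlib is imported here so the graph code works without it
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no window needed
    import matplotlib.pyplot as plt

    nx.draw_shell(g, with_labels=True)
    plt.savefig(out_path)
    plt.close()


g = build_toy_metro()
print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```

Calling draw_metro(g, "toy_metro.png") would then write the shell graph to a file, assuming matplotlib is installed.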

Athens is the capital of Greece. In 2008, its metro system appeared to have a triangle shape in the middle. The Athens metro system in 2008 had a total of 9 nodes and 9 edges:

Shell graph of Athens metro 2008

In 2020, the Athens metro system had a total of 12 nodes and 13 edges:
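The comparison of the two years can be sketched roughly as follows. The two graphs here are placeholders that only match the node and edge counts reported in this post (9 and 9 for 2008, 12 and 13 for 2020); their shapes are not the real metro layout, and the helpers summarize and growth are hypothetical names.

```python
import networkx as nx


def summarize(g):
    """Return the node and edge counts of a graph."""
    return {"nodes": g.number_of_nodes(), "edges": g.number_of_edges()}


def growth(old, new):
    """Return how many nodes and edges were added between two graphs."""
    before, after = summarize(old), summarize(new)
    return {key: after[key] - before[key] for key in before}


# Placeholder graphs: only the COUNTS match the post, not the topology
g_2008 = nx.cycle_graph(9)    # 9 nodes, 9 edges
g_2020 = nx.cycle_graph(12)   # 12 nodes, 12 edges
g_2020.add_edge(0, 6)         # one extra edge brings the total to 13

print("2008:", summarize(g_2008))
print("2020:", summarize(g_2020))
print("growth:", growth(g_2008, g_2020))
```

With the real station graphs in place of the placeholders, the same two helpers would report the change from 2008 to 2020.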

Shell graph of Athens metro 2020

I labeled the important nodes as follows:

Nodes:
Node: (0, {'label': 'Main_1'})
Node: (1, {})
Node: (2, {})
Node: (3, {})
Node: (4, {})
Node: (5, {'label': 'Main_2'})
Node: (6, {})
Node: (7, {})
Node: (8, {'label': 'Main_3'})
Node: (9, {'label': 'Main_4'})
Node: (10, {})
Node: (11, {})
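A listing like the one above can be produced along these lines. This is a sketch rather than my exact script: the node ids and labels are copied from the listing, while the variable names main_stations and lines are assumptions.

```python
import networkx as nx

g = nx.Graph()
g.add_nodes_from(range(12))  # nodes 0..11, as in the listing

# Labels for the important nodes, copied from the listing
main_stations = {0: "Main_1", 5: "Main_2", 8: "Main_3", 9: "Main_4"}
for node, name in main_stations.items():
    g.nodes[node]["label"] = name  # store the label as a node attribute

# data=True yields (node, attribute_dict) pairs
lines = [f"Node: {item}" for item in g.nodes(data=True)]
print("Nodes:")
print("\n".join(lines))
```

Nodes without a label print with an empty attribute dictionary, which is why most rows show {}.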

Table for the nodes and edges:

Table interpreting nodes as stations and edges as paths:

Bugs encountered: The first bug was that g.nodes didn't print anything. After several attempts, I found that I had missed a step: I hadn't assigned the nodes dictionary to the node attributes. After assigning it correctly, I was able to print out the nodes and the edges, with labels on the important nodes. The second bug was a miscount of nodes and edges: instead of 9 nodes and 9 edges, I ended up counting 10 nodes and 10 edges. I figured out that the count starts from 0, so the last node or edge is numbered 8 instead of 9.
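Both bugs can be reproduced in a small sketch. I am assuming here that the fix for the first bug looks like a call to nx.set_node_attributes; the labels dictionary is a hypothetical stand-in for my nodes dictionary.

```python
import networkx as nx

g = nx.Graph()
g.add_nodes_from(range(9))  # node ids 0..8, so 9 nodes in total

# Bug 1: a labels dictionary on its own does nothing until it is
# assigned to the node attributes.
labels = {0: "Main_1", 5: "Main_2"}  # hypothetical labels
labelled_before = [n for n, d in g.nodes(data=True) if "label" in d]
nx.set_node_attributes(g, labels, name="label")
labelled_after = [n for n, d in g.nodes(data=True) if "label" in d]
print(labelled_before, labelled_after)

# Bug 2: ids start at 0, so the last id is one less than the count.
last_id = max(g.nodes)
count = g.number_of_nodes()
print("last id:", last_id, "count:", count)
```

Letting number_of_nodes and number_of_edges do the counting, instead of counting by hand from the ids, avoids the off-by-one mistake.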

One obvious limitation is the number of data sources that had to be collected. When I searched the internet, the data was very difficult to find in one place, so I had to look across multiple sites, which took a tremendous amount of time. Another limitation is my coding skill: my Python level is between beginner and intermediate. I had plenty of ideas but couldn't figure out ways to implement them.

My takeaways are the bugs that I encountered and being able to fix them, along with a deeper understanding of nodes and edges, which I can turn into useful information to show. I also learned ways of summarizing a network and drawing the graph using networkx. I have become more knowledgeable about networks, nodes, and edges after completing a full network analysis.
