The concept of six degrees of separation / bacon numbers is a very interesting idea, but I believe the notation of “degrees” to measure distance in social networks is not the best approach. I was first introduced to the small-world phenomenon by Derek Muller in his YouTube video:
If you don’t know what “degrees of separation” are, I suggest you watch the video, because it explains the concept very well.
When analysing the network of users of an online social network like ePotato, there are no “friends” like in an actual social network. Instead, there are people a user follows, but of course this does not mean that said user is also being followed by that person. So “friendships” here are directed. Because of this, there are four different kinds of separation:
- A → B: “How many steps do you have to take to get from user A to user B when only following the direction of the arrows?” (With arrows pointing from a user to his/her contacts)
- B → A
- Loose Separation: Similar, but the directions of the arrows are ignored, so when “walking” from A to B, you can walk with or against the direction of the arrows. This also means that B → A will be the same as A → B.
- Tight Separation: Also similar, but now all connections that are not mutual are completely ignored, so only mutual friends count. Of course this again means that A → B = B → A.
To loosen a network means to convert every connection into a bidirectional connection (remove the arrow-heads) and to tighten a network means to remove all non-bidirectional connections, so only mutual connections remain.
Degrees of separation are nice, but they don’t tell you the whole picture. If user A is directly connected to user B, but they have no friends in common, the separation is the same as if they had 100 friends in common. If you really want to know how connected two users are, this information is important, because obviously the users that have 100 friends in common are more connected.
This kind of connection can be calculated by looking at the network as an electrical network. If every connection in the network is defined to be an electrical resistor of 1Ω, the “connectedness” between two users is the electrical resistance between those two nodes. In graph theory, this property is called resistance distance, which is a great name, because it rhymes.
Resistance distance can also be a directed value, however because it is very difficult to simplify a large network, which is in essence the process of calculating the resistance between two nodes, most directed resistances calculated are really only “semi-directed”, i.e. during the calculation the network will be loosened at some point, unless it is an especially simple network.
If you didn’t come here from the website, click here to test it out and to see the implementation in python.