IXPDBI — Business Intelligence on IXP open databases
At the moment, there are over 600 Internet Exchange Points (IXPs) and more than 16,700 Autonomous Systems (ASNs).
But how does this network of connections develop? Which ASNs have more IXP connections than the average?
Which ASNs are the most “interesting” for the interconnection strategy of an Internet Exchange?
To get a clearer picture, a data-driven project was built on the APIs provided by the two main open interconnection databases: IXPDB and PeeringDB.
This analysis can either guide TOP-IX in its strategic interconnection choices (thanks to a better understanding of the situation) or be used to compare TOP-IX’s network with other Internet Exchanges. Finally, this analysis can be performed with particular attention to other Italian IXPs and expanded to players on the European scene.
The preliminary analysis below laid the groundwork for a full data-driven Business Intelligence project.
THE HYPER-CONNECTED ASNs: HOW DOES THE ASN NETWORK DEVELOP BY NUMBER OF CONNECTIONS, AND WHO ARE THE “HYPER-CONNECTED”?
A list of approximately 16,500 ASNs, each with its number of connections to IXPs, can be obtained by combining the information from PeeringDB and IXPDB. Once the list is sorted in descending order, it becomes possible to define “Hyper-connected”. In the typical case, an ASN is connected to only one IXP, and most ASNs deviate from the average by only a few units.
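The combination step can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes each source has already been downloaded and reduced to (ASN, IXP identifier) pairs, and that both sources use the same IXP identifiers. The function name and the sample records are hypothetical.

```python
from collections import defaultdict


def connections_per_asn(*sources):
    """Merge (asn, ixp_id) pairs from several sources and count distinct IXPs per ASN."""
    ixps_by_asn = defaultdict(set)
    for pairs in sources:
        for asn, ixp_id in pairs:
            # A set keeps each IXP once, even if both sources report the same link
            ixps_by_asn[asn].add(ixp_id)
    counts = {asn: len(ixps) for asn, ixps in ixps_by_asn.items()}
    # Descending by number of connections; ties are broken by ASN for a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample records (documentation ASNs), not real data
peeringdb_pairs = [(64500, 1), (64500, 2), (64501, 1)]
ixpdb_pairs = [(64500, 2), (64500, 3), (64502, 3)]

ranking = connections_per_asn(peeringdb_pairs, ixpdb_pairs)
print(ranking)
```

In practice the two databases do not necessarily share identifiers, so a mapping step between them would be needed before merging.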
The graph below (Figure 1) shows that ASNs are concentrated in the range from 0 to 40 connections.
Based on the analysis, every Autonomous System (ASN) connected to at least 7 IXPs can be considered “Hyper-connected”. Under this condition, there are 515 such ASNs worldwide and 275 in Europe (out of 6,548 ASNs).
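The threshold rule can be expressed in a few lines. The sketch below assumes the connection counts are already available as a mapping from ASN to number of IXPs. The sample values are hypothetical, and only the threshold of 7 comes from the analysis.

```python
from collections import Counter
from statistics import mean

HYPER_CONNECTED_THRESHOLD = 7  # minimum number of IXPs, as stated in the text


def hyper_connected(counts, threshold=HYPER_CONNECTED_THRESHOLD):
    """Return the ASNs connected to at least `threshold` IXPs, sorted by ASN."""
    return sorted(asn for asn, n in counts.items() if n >= threshold)


# Hypothetical connection counts (ASN -> number of IXPs), not real data
counts = {64500: 1, 64501: 1, 64502: 2, 64503: 7, 64504: 12, 64505: 1}

# The most frequent value shows what a "typical" ASN looks like
most_common_count, _ = Counter(counts.values()).most_common(1)[0]
print("most common number of connections:", most_common_count)
print("average number of connections:", mean(counts.values()))
print("hyper-connected ASNs:", hyper_connected(counts))
```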
WHAT ABOUT THE TYPE FIELD?
Another interesting metric is the type of service associated with each ASN (the “type” field in IXPDB and PeeringDB). It is useful for describing how ASNs are distributed, where they are located and how they interconnect with the different IXPs.
The ASNs have been divided into three main groups: worldwide connections, worldwide “Hyper-connected” and European “Hyper-connected”.
The same analysis is summarised in the table below, which shows that there are no Route Collector or Government ASNs among the “Hyper-connected”, and no ASNs of type “network services” among the European “Hyper-connected”.
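A per-group type count of this kind can be sketched as below. The record layout is an assumption made for illustration: the `type`, `ixp_count` and `region` keys and all sample values are hypothetical, and empty types are grouped under a "Not Disclosed" label.

```python
from collections import Counter


def type_distribution(networks):
    """Count networks per 'type' value; missing or empty types become 'Not Disclosed'."""
    return Counter(net.get("type") or "Not Disclosed" for net in networks)


# Hypothetical records, not real data
networks = [
    {"asn": 64500, "type": "Content", "ixp_count": 9, "region": "Europe"},
    {"asn": 64501, "type": "NSP", "ixp_count": 8, "region": "North America"},
    {"asn": 64502, "type": "Cable/DSL/ISP", "ixp_count": 1, "region": "Europe"},
    {"asn": 64503, "type": "", "ixp_count": 2, "region": "Europe"},
    {"asn": 64504, "type": "Content", "ixp_count": 7, "region": "Europe"},
]

# The three groups used in the analysis
hyper = [n for n in networks if n["ixp_count"] >= 7]
hyper_europe = [n for n in hyper if n["region"] == "Europe"]

groups = {
    "worldwide": networks,
    "hyper-connected": hyper,
    "hyper-connected (Europe)": hyper_europe,
}
for name, group in groups.items():
    print(name, dict(type_distribution(group)))
```

A type that never occurs in a group simply has a count of zero, which is how gaps such as a missing category in one group show up.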
WHERE DOES TOP-IX STAND IN THIS SCENARIO?
TOP-IX is connected to 116 ASNs, of which 37 are “Hyper-connected”; 36 of these are located in Europe (data as of 06/14/2021, based on public PeeringDB information; the actual number of connected networks is higher).
As far as the “type” field is concerned, most of the connections to TOP-IX are of the Cable / DSL / ISP (25) and Content (16) types, while 15 have no type specified.
The following count plot shows the group of worldwide “Hyper-connected” ASNs compared to the “Hyper-connected” ones already connected to TOP-IX. They were divided by type and then analyzed (see Figure 4).
A further comparison can be made across different Internet Exchanges. In this analysis, TOP-IX was compared with MIX, NaMeX, AMS-IX and DE-CIX.
The comparative analysis shows that:
- Only 11 ASNs are shared by all 5 IXPs.
- TOP-IX has the lowest number of connections (116), compared with 1,060 for DE-CIX, 875 for AMS-IX, 345 for MIX and 162 for NaMeX.
The “type” distribution (Figure 5) shows that most Italian Internet Exchanges are made up predominantly of Cable / DSL / ISP networks, while AMS-IX and DE-CIX are more oriented towards the NSP type.
By comparing the connections, it is possible to identify new entities to interconnect with, as a kind of evidence-based suggestion.
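Both the shared-ASN count and the list of suggested new connections reduce to set operations. The sketch below assumes the member ASNs of each exchange are available as sets. The exchange names, the function names and the data are placeholders, not real membership lists.

```python
def shared_asns(members_by_ixp):
    """ASNs present at every IXP in the mapping."""
    member_sets = list(members_by_ixp.values())
    return set.intersection(*member_sets) if member_sets else set()


def candidate_asns(members_by_ixp, target):
    """ASNs connected to at least one other IXP but not yet to `target`."""
    others = set().union(
        *(members for name, members in members_by_ixp.items() if name != target)
    )
    return others - members_by_ixp[target]


# Placeholder exchanges and documentation ASNs, not real data
members_by_ixp = {
    "IXP-A": {64500, 64501},
    "IXP-B": {64500, 64501, 64502},
    "IXP-C": {64500, 64502, 64503},
}

print("shared by all:", sorted(shared_asns(members_by_ixp)))
print("candidates for IXP-A:", sorted(candidate_asns(members_by_ixp, "IXP-A")))
```

The candidate list could then be narrowed further, for example by keeping only the “Hyper-connected” ASNs or a given type.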
HOW DOES THE NETWORK OF CONNECTIONS DEVELOP FROM THE GEOGRAPHICAL POINT OF VIEW?
At the European level, the cities with the highest number of physical interchange points (“locations”) are as follows:
- Frankfurt (88)
- Amsterdam (78)
- Moscow (56)
- St. Petersburg (45)
- London (36)
On the other hand, the list of IXPs with the highest number of locations is led by NetIX with 139 locations, followed by DATAIX (124), NL-ix (80), AMS-IX (25) and DE-CIX Frankfurt (22).
WHAT IS THE ITALIAN SITUATION?
Currently, there are 19 locations in Italy, divided among three reference IXPs:
- MIX (9)
- TOP-IX (7)
- NaMeX (3)
(*) Here too, the data are based on public PeeringDB information; for instance, TOP-IX actually has more than 20 locations.
These locations are mainly concentrated in Lombardia, but they are also located in Piemonte, Lazio and Sicilia.
HOW TO ENHANCE THE ANALYSIS
In the analytical phase, several datasets were created by merging the data obtained from the APIs and subsequently analyzed with specific Data Science tools such as Jupyter Notebook and Google Colaboratory.
The final step of the research explored an automated process for generating the data frames by importing them from a database created with phpMyAdmin, in order to engineer future queries.
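The idea of loading analysis-ready rows straight from a database can be illustrated with a small sketch. It is not the project's implementation: Python's built-in `sqlite3` module and an in-memory database stand in for the real database, and the `connections` table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def load_connection_counts(conn):
    """Return (asn, ixp_count) rows, most connected first."""
    query = """
        SELECT asn, COUNT(DISTINCT ixp_id) AS ixp_count
        FROM connections
        GROUP BY asn
        ORDER BY ixp_count DESC, asn ASC
    """
    return conn.execute(query).fetchall()


# In-memory stand-in for the project database, filled with hypothetical rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE connections (asn INTEGER, ixp_id INTEGER)")
conn.executemany(
    "INSERT INTO connections VALUES (?, ?)",
    [(64500, 1), (64500, 2), (64501, 1)],
)

rows = load_connection_counts(conn)
print(rows)
conn.close()
```

With pandas installed, the same query could be passed to `pandas.read_sql_query` together with the connection to obtain a data frame directly.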
A web dashboard for querying participants and Internet Exchanges was also developed, together with new APIs, using HTML, CSS and JavaScript for the front end and Python with the Flask micro-framework for the back end (see Figures 8, 9 and 10).
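As an illustration of what a Flask back end for such a dashboard might look like, here is a minimal sketch of one JSON endpoint. The route path, the response shape and the in-memory data are all assumptions made for the example; the source does not describe the actual API.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory data standing in for the project database
IXP_MEMBERS = {
    "ixp-a": [64500, 64501],
    "ixp-b": [64500, 64502],
}


@app.route("/api/ixps/<name>/members")
def ixp_members(name):
    """Return the member ASNs of one exchange as JSON, or a 404 if it is unknown."""
    members = IXP_MEMBERS.get(name.lower())
    if members is None:
        return jsonify({"error": "unknown IXP"}), 404
    return jsonify({"ixp": name.lower(), "members": members})


# Exercise the endpoint without starting a server
with app.test_client() as client:
    response = client.get("/api/ixps/ixp-a/members")
    print(response.status_code, response.get_json())
```

A JavaScript front end could call such an endpoint with `fetch` and render the result in a table or chart.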
The research project was developed from February to June 2021 thanks to the collaboration with the ENGIM “IFTS-Database design and management techniques 2020/21” course and the TOP-IX team.
The work was divided into two phases.
- The first phase took place during the school-work alternation period (February-April). The group of five students focused on data exploration and related analysis, supported by the TOP-IX mentors of the course.
- The second phase, carried out during the full-time internship at the company (May-June), focused instead on data engineering and resulted in a prototype of the web dashboard described above.
The working group
ENGIM Team: Veronica Cantoro, Nathan Essofack, Simona Pandinu.
(Luca Santoro and Gabriel Pignatiello also took part in the first phase, during the school-work alternation period.)
TOP-IX Team: Stefania Delprete (Data Scientist), Christian Racca (Senior Engineer), Andrea Beccaris (Full Stack Developer), Massimo Santoli (Developer) and Laura Pippinato (Designer).