IXPDBI — Business Intelligence on IXP open databases

TOP-IX Craftsman LAB
topixlab
Jun 28, 2021

At the moment, there are over 600 Internet Exchange Points (IXPs) and more than 16,700 Autonomous Systems (ASNs).

But how does this network of connections develop? What are the ASNs that have a higher number of IXP connections when compared to the average?

What are the most “interesting” ASNs in the interconnection strategy of an Internet Exchange?

To get a clearer picture, a data-driven project was built on the APIs provided by the two main open interconnection databases: IXPDB and PeeringDB.

This analysis can either guide TOP-IX in its strategic interconnection choices (thanks to a better understanding of the situation) or be used to compare TOP-IX’s network with other Internet Exchanges. Finally, this analysis can be performed with particular attention to other Italian IXPs and expanded to players on the European scene.

The preliminary analysis below laid the groundwork for a full data-driven Business Intelligence project.

THE HYPER-CONNECTED ASNs: HOW DOES THE ASN NETWORK DEVELOP BY NUMBER OF CONNECTIONS, AND WHO ARE THE “HYPER-CONNECTED”?

The list of approximately 16,500 ASNs, each with its number of connections to IXPs, can be obtained by combining the information from PeeringDB and IXPDB. Once the list is sorted in descending order, it becomes possible to define what “Hyper-connected” means. Typically, an ASN is connected to only one IXP, and most of the distribution deviates from the average by only a few units.
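The combination and sorting step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records are hand-made (using ASNs from the range reserved for documentation), the key names "asn" and "ixp" are hypothetical, and matching IXPs across the two databases by name is a simplification.

```python
from collections import defaultdict

# Hypothetical records standing in for the two API responses.
# Each record links one ASN to one IXP; the key names are assumptions.
peeringdb_links = [
    {"asn": 64500, "ixp": "TOP-IX"},
    {"asn": 64500, "ixp": "MIX"},
    {"asn": 64501, "ixp": "TOP-IX"},
    {"asn": 64502, "ixp": "AMS-IX"},
]
ixpdb_links = [
    {"asn": 64500, "ixp": "MIX"},     # same link as in the first source
    {"asn": 64500, "ixp": "DE-CIX"},
    {"asn": 64502, "ixp": "AMS-IX"},  # same link as in the first source
]


def count_ixps_per_asn(*sources):
    """Return {asn: number of distinct IXPs} across all sources."""
    ixps_by_asn = defaultdict(set)
    for source in sources:
        for link in source:
            # A set ignores links that appear in both databases.
            ixps_by_asn[link["asn"]].add(link["ixp"])
    return {asn: len(ixps) for asn, ixps in ixps_by_asn.items()}


def rank_descending(counts):
    """Sort (asn, count) pairs by count, highest first; ties by ASN."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_descending(count_ixps_per_asn(peeringdb_links, ixpdb_links))
print(ranking)
```

Collecting the IXPs of each ASN in a set means that a connection listed in both databases is counted once.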

The graph below (Figure 1) shows that the ASNs are concentrated in the range from 0 to 40 connections.

Figure 1 - Boxplot reporting the distribution of the number of connections to IXP per ASN.

Based on the analysis, every Autonomous System (ASN) connected to at least 7 IXPs can be considered “Hyper-connected”. Under this condition, there are 515 “Hyper-connected” ASNs worldwide and 275 in Europe (out of 6,548 ASNs).

Figure 2 - Comparative graph (orange indicates the “Hyper-connected” ASNs versus the total number of ASNs by category). ASNs without connections are not included in this representation.
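As a rough sketch, the “at least 7 IXPs” rule can be applied to a table of connection counts as follows. The counts are invented for illustration. The article does not state how the threshold of 7 was derived from the boxplot, so the upper fence computed here (Q3 + 1.5 × IQR, a common boxplot convention) is shown only for context and is not claimed to be the method used in the project.

```python
from statistics import quantiles

HYPER_CONNECTED_MIN_IXPS = 7  # threshold stated in the article

# Hypothetical number of IXP connections per ASN (documentation ASNs).
ixp_counts = {
    64500: 1, 64501: 1, 64502: 1, 64503: 2, 64504: 1,
    64505: 3, 64506: 7, 64507: 1, 64508: 12, 64509: 2,
}


def hyper_connected(counts, minimum=HYPER_CONNECTED_MIN_IXPS):
    """Return the ASNs connected to at least `minimum` IXPs, sorted."""
    return sorted(asn for asn, n in counts.items() if n >= minimum)


def boxplot_summary(values):
    """Quartiles plus the conventional upper fence Q3 + 1.5 * IQR."""
    q1, q2, q3 = quantiles(values, n=4)
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "upper_fence": q3 + 1.5 * (q3 - q1),
    }


summary = boxplot_summary(list(ixp_counts.values()))
flagged = hyper_connected(ixp_counts)
share = len(flagged) / len(ixp_counts)
print(summary)
print(flagged, f"{share:.0%} of the sample")
```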

WHAT ABOUT THE TYPE FIELD?

Another interesting metric is the type of service associated with the various ASNs (field “type” on IXPDB and PeeringDB). This could be relevant in order to represent their distribution, geographical arrangement and interconnection to the different IXPs.

The ASNs have been divided into three main groups: worldwide connections, worldwide “Hyper-connected” and European “Hyper-connected”.

Figure 3 - Graphic showing the distribution of the “type” field.

The same analysis is summarized in the table below. Among the “Hyper-connected” there are no Route Collector or Government ASNs, and among the European “Hyper-connected” there are also no ASNs of type “Network Services”.

Table 1 - Numeric values about the “type” field.
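A breakdown like the one in Table 1 could be produced along these lines. The sketch assumes each ASN record carries "type", "region" and "ixp_count" keys; these names and all the sample values are hypothetical and chosen only to show the mechanics of counting types per group and spotting the types that are absent from a group.

```python
from collections import Counter

# Hypothetical ASN records; key names and values are assumptions.
asns = [
    {"asn": 64500, "type": "NSP", "region": "Europe", "ixp_count": 9},
    {"asn": 64501, "type": "Content", "region": "Europe", "ixp_count": 8},
    {"asn": 64502, "type": "Cable/DSL/ISP", "region": "Europe", "ixp_count": 1},
    {"asn": 64503, "type": "Network Services", "region": "Asia", "ixp_count": 7},
    {"asn": 64504, "type": "Route Collector", "region": "Europe", "ixp_count": 2},
    {"asn": 64505, "type": "Government", "region": "Africa", "ixp_count": 1},
]

# The three groups used in the article.
hyper = [a for a in asns if a["ixp_count"] >= 7]
groups = {
    "worldwide": asns,
    "hyper_worldwide": hyper,
    "hyper_europe": [a for a in hyper if a["region"] == "Europe"],
}

# One Counter of types per group; a missing type counts as 0.
type_counts = {
    name: Counter(a["type"] for a in members)
    for name, members in groups.items()
}
all_types = sorted(type_counts["worldwide"])

# Types that occur worldwide but never inside a given group.
missing = {
    name: [t for t in all_types if counts[t] == 0]
    for name, counts in type_counts.items()
}

for name in groups:
    print(name, dict(type_counts[name]), "missing:", missing[name])
```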

WHERE DOES TOP-IX STAND IN THIS SCENARIO?

TOP-IX is connected to 116 ASNs, of which 37 are “Hyper-connected”; 36 of these are located in Europe (data as of 06/14/2021, based on public PeeringDB information; the number of networks actually connected is higher).

As far as the “type” field is concerned, most of the connections to TOP-IX are of the Cable / DSL / ISP (25) and Content (16) types, while 15 have no type specified.

The following count plot shows the group of worldwide “Hyper-connected” ASNs compared to the “Hyper-connected” ones already connected to TOP-IX. They were divided by type and then analyzed (see Figure 4).

Figure 4 - “Hyper-connected” in TOP-IX VS worldwide “Hyper-connected” divided by “type”.

A further comparison can be drawn between different Internet Exchanges. In this analysis, TOP-IX was compared with MIX, NaMeX, AMS-IX and DE-CIX.

The comparative analysis shows that:

  • Only 11 ASNs are shared by all 5 IXPs.
  • TOP-IX has the lowest number of connections (116): DE-CIX has 1,060, AMS-IX 875, MIX 345 and NaMeX 162.
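The two findings above come down to a set intersection and a size comparison. The following sketch uses invented membership sets (with ASNs from the documentation range), so the numbers it prints are not the real ones.

```python
# Hypothetical membership sets keyed by IXP name.
members = {
    "TOP-IX": {64500, 64501, 64502},
    "MIX": {64500, 64501, 64503, 64504},
    "NaMeX": {64500, 64502, 64503, 64504},
    "AMS-IX": {64500, 64501, 64503, 64504, 64505},
    "DE-CIX": {64500, 64501, 64503, 64504, 64505, 64506},
}

# ASNs present at every IXP in the comparison.
shared_by_all = set.intersection(*members.values())

# IXPs ordered from the fewest to the most connected ASNs.
sizes = sorted(
    ((name, len(asns)) for name, asns in members.items()),
    key=lambda item: item[1],
)
smallest_ixp = sizes[0][0]

print("shared by all:", sorted(shared_by_all))
print("sizes:", sizes)
```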

The “type” distribution (Figure 5) shows that most Italian Internet Exchanges are weighted towards the Cable / DSL / ISP type, while AMS-IX and DE-CIX are more oriented towards the NSP type.

Figure 5 - Comparison of IXPs by “type” field.

By comparing the connections, it is possible to identify (as a kind of evidence-based suggestion) new entities to interconnect with.

Figure 6 - Comparison between the ASNs connected to the IXPs in analysis.
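One possible way to turn this comparison into suggestions is to list the ASNs that are absent from a reference IXP and rank them by how many of the other IXPs already host them. The ranking criterion is an assumption made for this sketch, not necessarily the one used in the project, and the membership sets are invented.

```python
from collections import Counter

# Hypothetical membership sets keyed by IXP name.
members = {
    "TOP-IX": {64500, 64501},
    "MIX": {64500, 64502, 64503},
    "AMS-IX": {64500, 64501, 64503, 64504},
    "DE-CIX": {64500, 64503, 64504, 64505},
}


def interconnection_candidates(reference, membership):
    """Rank ASNs absent from `reference` by how many other IXPs host them."""
    present = membership[reference]
    seen_elsewhere = Counter(
        asn
        for name, asns in membership.items()
        if name != reference
        for asn in asns
        if asn not in present
    )
    # Most widely present first; ties broken by ASN for a stable order.
    return sorted(seen_elsewhere.items(), key=lambda item: (-item[1], item[0]))


candidates = interconnection_candidates("TOP-IX", members)
print(candidates)
```

An ASN that appears at several comparable exchanges but not at the reference one ends up at the top of the list.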

HOW DOES THE NETWORK OF CONNECTIONS DEVELOP FROM THE GEOGRAPHICAL POINT OF VIEW?

At the European level, the cities with the highest number of physical interchange points (“locations”) are as follows:

  1. Frankfurt (88)
  2. Amsterdam (78)
  3. Moscow (56)
  4. St. Petersburg (45)
  5. London (36)

Ranking the IXPs by number of locations instead, NetIX comes first with 139 locations, followed by DATAIX (124), NL-ix (80), AMS-IX (25) and DE-CIX Frankfurt (22).

Figure 7 - Map of the locations for the top five IXPs with the highest number of locations.
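Both rankings above are group-and-count operations over a table of locations. The sketch below assumes one record per IXP and facility, with hypothetical key names and invented values. It counts distinct facilities per group, so a building used by two IXPs is counted once for its city. Whether the article counts shared buildings this way is not stated.

```python
from collections import defaultdict

# Hypothetical location records; key names and values are assumptions.
locations = [
    {"ixp": "IXP-A", "city": "Frankfurt", "facility": "FRA-1"},
    {"ixp": "IXP-A", "city": "Frankfurt", "facility": "FRA-2"},
    {"ixp": "IXP-B", "city": "Frankfurt", "facility": "FRA-1"},  # shared building
    {"ixp": "IXP-B", "city": "Amsterdam", "facility": "AMS-1"},
    {"ixp": "IXP-A", "city": "Amsterdam", "facility": "AMS-2"},
    {"ixp": "IXP-A", "city": "Amsterdam", "facility": "AMS-3"},
    {"ixp": "IXP-A", "city": "London", "facility": "LON-1"},
]


def count_distinct(records, group_key, value_key):
    """Count distinct `value_key` values per `group_key`, highest first."""
    groups = defaultdict(set)
    for record in records:
        groups[record[group_key]].add(record[value_key])
    ranked = [(group, len(values)) for group, values in groups.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


by_city = count_distinct(locations, "city", "facility")
by_ixp = count_distinct(locations, "ixp", "facility")
print(by_city)
print(by_ixp)
```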

WHAT IS THE ITALIAN SITUATION?

Currently, in Italy, there are 19 locations (*) divided among three reference IXPs:

  1. MIX (9)
  2. TOP-IX (7)
  3. NaMeX (3)

(*) Here too, the data are based on public PeeringDB information; for instance, TOP-IX actually has more than 20 locations.

These locations are mainly concentrated in Lombardia, but they are also located in Piemonte, Lazio and Sicilia.

Table 2 - Italian locations per IXP.

HOW TO ENHANCE THE ANALYSIS

In the analytical phase, several datasets were created by merging the data obtained from the APIs and subsequently analyzed with specific Data Science tools such as Jupyter Notebook and Google Colaboratory.

The final step of the research explored an automated process that generates the data frames by importing the data from a database created with phpMyAdmin, so that future queries can be engineered on top of it.
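The idea of building the data frames from a database rather than from ad hoc API calls can be illustrated as follows. This is a self-contained sketch and not the project's code: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real database, and the table and column names are hypothetical. The result is a column-oriented dictionary, a shape that pandas.DataFrame accepts directly.

```python
import sqlite3


def load_frame(conn, query, params=()):
    """Run a query and return the result as {column name: list of values}."""
    cursor = conn.execute(query, params)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


# In-memory stand-in for the project database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asn_ixp (asn INTEGER, ixp TEXT)")
conn.executemany(
    "INSERT INTO asn_ixp VALUES (?, ?)",
    [
        (64500, "TOP-IX"),
        (64500, "MIX"),
        (64500, "MIX"),  # duplicate row, ignored by COUNT(DISTINCT ...)
        (64501, "TOP-IX"),
    ],
)

frame = load_frame(
    conn,
    "SELECT asn, COUNT(DISTINCT ixp) AS ixp_count "
    "FROM asn_ixp GROUP BY asn ORDER BY ixp_count DESC, asn",
)
conn.close()
print(frame)
```

Keeping the query in one place means the same function can refresh every data frame whenever the database is updated.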

A web dashboard for querying participants and Internet Exchanges was developed, together with new APIs, using CSS, HTML and JavaScript for the front end and Python with the Flask micro-framework for the back end (see Figures 8, 9 and 10).

Figure 8 - The homepage of the dashboard with one of its interactive sections which allow comparing IXPs and ASNs.
Figure 9 - The ASNs page of the dashboard with its comparative graphs. Values in the charts change according to what is selected in the first table.
Figure 10 - The IXPs page. It consists of a list of IXPs and an interactive map of the locations connected to the selection table.
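To give an idea of what a Flask back end for such a dashboard might look like, here is a minimal sketch with two JSON endpoints. The routes, field names and in-memory data are all hypothetical; the actual APIs of the project are not described in this article.

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)

# Hypothetical in-memory data; a real back end would query the database.
IXP_MEMBERS = {
    "TOP-IX": [64500, 64501],
    "MIX": [64500, 64502, 64503],
}


@app.route("/api/ixps")
def list_ixps():
    """Return every IXP with its number of connected ASNs."""
    return jsonify([
        {"name": name, "asn_count": len(asns)}
        for name, asns in sorted(IXP_MEMBERS.items())
    ])


@app.route("/api/ixps/<name>/asns")
def ixp_asns(name):
    """Return the ASNs connected to one IXP, or 404 if it is unknown."""
    if name not in IXP_MEMBERS:
        abort(404)
    return jsonify({"name": name, "asns": IXP_MEMBERS[name]})
```

Saved as app.py, the sketch can be served locally with the "flask run" command, and the front-end JavaScript would then fetch these endpoints to fill the tables and charts.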

The research project was developed from February to June 2021 thanks to the collaboration with the ENGIM “IFTS-Database design and management techniques 2020/21” course and the TOP-IX team.

The work was divided into two phases.

  1. The first phase took place during the school-work alternation period (February-April). The group of five students focused on data exploration and related analysis, supported by the TOP-IX mentors of the course.
  2. The second phase, carried out during the full-time company internship (May-June), focused instead on data engineering and resulted in a prototype of the web dashboard described above.

The working group

ENGIM Team: Veronica Cantoro, Nathan Essofack, Simona Pandinu.

(Luca Santoro and Gabriel Pignatiello also took part in the first phase, during the school-work alternation period.)

TOP-IX Team: Stefania Delprete (Data Scientist), Christian Racca (Senior Engineer), Andrea Beccaris (Full Stack Developer), Massimo Santoli (Developer) and Laura Pippinato (Designer).
