HTTP Traffic Visualization

Sefa Pehlivan
Published in hepsiburadatech · Jun 20, 2022

Click here for the Turkish version.

Like most teams, we rely on metrics (Grafana, Prometheus) and logs (Elasticsearch) to understand which domains our applications are accessing.
These tools are useful for seeing what is happening in the network, but they are not enough to see the whole picture and extract all of the dependencies.
To meet that need, we built a visualization of our HTTP traffic, and in this article I will share how it works.

Log flow schema

To summarize the flow in the image: every Envoy in our environment continuously streams its access logs over gRPC to a log-collector application we wrote.
The collector publishes these logs to a Kafka topic, and from there they are written to Elasticsearch via Kafka Connect.

BigBang-ALS

  • Accepts access logs from Envoy over a gRPC stream.
  • Extracts the fields we selected from the incoming logs.
  • Queries the Valse application with the source IP address to find out which Kubernetes cluster it belongs to.
  • Finally, forwards the logs to the Kafka topic (a minimal sketch of this flow follows below).
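
Below is a minimal sketch of what such a collector can look like. It assumes a Go implementation built on the go-control-plane ALS bindings and segmentio/kafka-go; the port, topic name, and extracted fields are illustrative, and the Valse lookup is only hinted at in a comment, so this is not BigBang-ALS's actual code.

// Minimal sketch of an Envoy ALS receiver that forwards selected fields to Kafka.
package main

import (
	"context"
	"encoding/json"
	"log"
	"net"

	alsdata "github.com/envoyproxy/go-control-plane/envoy/data/accesslog/v3"
	alsgrpc "github.com/envoyproxy/go-control-plane/envoy/service/accesslog/v3"
	"github.com/segmentio/kafka-go"
	"google.golang.org/grpc"
)

type alsServer struct {
	alsgrpc.UnimplementedAccessLogServiceServer
	writer *kafka.Writer
}

// StreamAccessLogs receives batches of access-log entries pushed by Envoy,
// extracts the fields we care about, and forwards them to the Kafka topic.
func (s *alsServer) StreamAccessLogs(stream alsgrpc.AccessLogService_StreamAccessLogsServer) error {
	for {
		msg, err := stream.Recv()
		if err != nil {
			return err
		}
		for _, entry := range msg.GetHttpLogs().GetLogEntry() {
			s.publish(stream.Context(), entry)
		}
	}
}

func (s *alsServer) publish(ctx context.Context, e *alsdata.HTTPAccessLogEntry) {
	doc := map[string]interface{}{
		"authority":     e.GetRequest().GetAuthority(),
		"path":          e.GetRequest().GetPath(),
		"response_code": e.GetResponse().GetResponseCode().GetValue(),
		"req_time":      e.GetCommonProperties().GetStartTime().AsTime(),
		// Here the source IP would also be resolved to a Kubernetes cluster
		// via the Valse application before producing the message.
	}
	b, _ := json.Marshal(doc)
	if err := s.writer.WriteMessages(ctx, kafka.Message{Value: b}); err != nil {
		log.Printf("kafka write failed: %v", err)
	}
}

func main() {
	lis, err := net.Listen("tcp", ":8080") // illustrative port
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer()
	alsgrpc.RegisterAccessLogServiceServer(srv, &alsServer{
		writer: &kafka.Writer{Addr: kafka.TCP("kafka:9092"), Topic: "als"},
	})
	log.Fatal(srv.Serve(lis))
}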

Kafka

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

Elasticsearch

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.

Babba

  • It acts as a bridge between Elasticsearch and the frontend.
  • It sends search requests to Elasticsearch according to the filters in the POST requests coming from the frontend.
  • It formats the responses returned from Elasticsearch into the structure VisJS expects.

Log Collector & Producer

We defined the BigBang-ALS application in Envoy as an ALS (Access Log Service) cluster; a configuration sketch follows below.
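
For reference, a minimal Envoy configuration sketch for this kind of setup could look like the following; the cluster name, log name, address, and port are illustrative assumptions, not our production values.

# On the HTTP connection manager: send access logs to the ALS cluster over gRPC
access_log:
  - name: envoy.access_loggers.http_grpc
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.grpc.v3.HttpGrpcAccessLogConfig
      common_config:
        log_name: als
        transport_api_version: V3
        grpc_service:
          envoy_grpc:
            cluster_name: bigbang_als

# Static cluster pointing at the log collector (HTTP/2 for gRPC)
clusters:
  - name: bigbang_als
    type: STRICT_DNS
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: bigbang_als
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: bigbang-als.observability.svc
                    port_value: 8080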

The second phase of this project is to detect dependencies at the application level. For that, each application will need to write its name into a common header on every HTTP request it makes. With this header we will also be able to group source IP addresses and extract dependencies at the application level; a rough sketch of the idea follows below.
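
As an illustration only, here is a small Go sketch of such header propagation; the header name X-App-Name and the package name are assumptions, since this phase has not been implemented yet.

// Hypothetical sketch: a RoundTripper that stamps every outgoing request
// with the application's name so dependencies can be grouped per application.
package httpclient

import "net/http"

type appNameTransport struct {
	appName string
	next    http.RoundTripper
}

func (t *appNameTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// Clone before mutating: RoundTrippers must not modify the caller's request.
	r := req.Clone(req.Context())
	r.Header.Set("X-App-Name", t.appName) // header name is an assumption
	return t.next.RoundTrip(r)
}

// NewClient returns an *http.Client that adds the common header to all requests.
func NewClient(appName string) *http.Client {
	return &http.Client{Transport: &appNameTransport{appName: appName, next: http.DefaultTransport}}
}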

Kafka & Elastic

Every log is precious to us. In an environment where around 25k logs flow per second, we use Kafka so that no logs are lost and so that logs can accumulate during possible failure scenarios.
In addition, we ship the logs collected on the Kafka topic to our Elasticsearch cluster with the Kafka Connect Elasticsearch sink connector, configured as follows.

{"connector.class":"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector","type.name":"_doc","topics":"als","transforms":"ExtractTimestamp","linger.ms": 2,"max.in.flight.requests": 15,"batch.size": 350,"key.ignore":true,"max.buffered.records": 5250,"schema.ignore":true,"transforms.ExtractTimestamp.type":"org.apache.kafka.connect.transforms.InsertField$Value","value.converter.schemas.enable":false,"connection.url":"https://1.1.1.1:9200,https://1.1.1.2:9200,https://1.1.1.3:9200,https://1.1.1.4:9200,https://1.1.1.5:9200,https://1.1.1.6:9200","connection.username": "********","connection.password": "********","elastic.security.protocol": "SSL","elastic.https.ssl.protocol": "TLS","elastic.https.ssl.keystore.location": "/mnt/kafka/kafka/config/keystore/keystore.jks","elastic.https.ssl.keystore.password": "********","elastic.https.ssl.key.password": "********","elastic.https.ssl.keystore.type": "JKS","elastic.https.ssl.truststore.location": "/mnt/kafka/kafka/config/truststore/truststore.jks","elastic.https.ssl.truststore.password": "********","elastic.https.ssl.truststore.type": "JKS","value.converter":"org.apache.kafka.connect.json.JsonConverter","key.converter":"org.apache.kafka.connect.json.JsonConverter","transforms.ExtractTimestamp.timestamp.field":"req_time"}

Preparing the Data

In the process so far we have collected, extracted, and stored the data. To feed the JavaScript library we use on the frontend, we wrote an application (Babba) positioned between Elasticsearch and the frontend.
Based on the filters requested by the frontend, it queries Elasticsearch on the fly and returns the data in the structure the frontend expects; a rough sketch of this shaping step follows below.
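
The sketch below shows one way this shaping can look, assuming a Go backend that has already aggregated (source, domain, error rate) rows out of Elasticsearch; the type and function names are illustrative, not Babba's actual API.

// Illustrative shaping of aggregated rows into the nodes/edges document VisJs consumes.
package babba

// Row is one aggregated result assumed to come back from Elasticsearch:
// a source calling a destination domain, with an error rate for the
// selected time interval.
type Row struct {
	Source    string
	Domain    string
	ErrorRate float64
}

type Node struct {
	ID    string `json:"id"`
	Label string `json:"label"`
	Group string `json:"group"` // "source", "domain_green" or "domain_red"
}

type Edge struct {
	From string `json:"from"`
	To   string `json:"to"`
}

type Graph struct {
	Nodes []Node `json:"nodes"`
	Edges []Edge `json:"edges"`
}

// BuildGraph turns the aggregated rows into the structure the frontend
// expects. Destinations with more than 5% errors go into the "domain_red"
// group so they are drawn in red.
func BuildGraph(rows []Row) Graph {
	g := Graph{}
	seen := map[string]bool{}
	add := func(n Node) {
		if !seen[n.ID] {
			seen[n.ID] = true
			g.Nodes = append(g.Nodes, n)
		}
	}
	for _, r := range rows {
		group := "domain_green"
		if r.ErrorRate > 0.05 {
			group = "domain_red"
		}
		add(Node{ID: r.Source, Label: r.Source, Group: "source"})
		add(Node{ID: r.Domain, Label: r.Domain, Group: group})
		g.Edges = append(g.Edges, Edge{From: r.Source, To: r.Domain})
	}
	return g
}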

React & VisJs

HTTP Traffic Visualization (by Group)
Filtered Visualization

We love open-source solutions, and VisJs is one of them; thanks to the community that develops it. On the frontend we visualize the incoming data with React and the VisJs library.
VisJs offers several physics solvers, and it took a lot of experimenting to find the one that behaves best for our number of nodes and dependencies. If more than 5 percent of the requests in the selected time interval return errors, the corresponding edges are drawn in red. The values we use are below.

{
  height: '100%',
  width: '100%',
  autoResize: false,
  layout: {
    improvedLayout: false,
  },
  groups: {
    domain_green: {
      icon: {
        face: "FontAwesome",
        code: "\uf0ac",
        color: "green",
        size: 30
      },
      font: {
        size: 18,
        color: "green",
        face: "courier",
        strokeWidth: 3,
        strokeColor: "#ffffff"
      }
    },
    source: {
      icon: {
        face: "FontAwesome",
        code: "\uf233",
        color: "black",
        size: 30
      },
      font: {
        size: 18,
        color: "black",
        face: "courier",
        strokeWidth: 3,
        strokeColor: "#ffffff"
      }
    },
    domain_red: {
      icon: {
        face: "FontAwesome",
        code: "\uf0ac",
        color: "red",
        size: 40
      },
      font: {
        size: 18,
        color: "red",
        face: "courier",
        strokeWidth: 3,
        strokeColor: "#ffffff"
      }
    }
  },
  physics: {
    enabled: true,
    timestep: 0.5,
    maxVelocity: 10,
    minVelocity: 4,
    solver: "forceAtlas2Based",
    forceAtlas2Based: {
      theta: 0.5,
      gravitationalConstant: -190,
      centralGravity: 0.003,
      springConstant: 0.09,
      springLength: 340,
      avoidOverlap: 1,
      damping: 0.4,
    },
    stabilization: {
      enabled: true,
      fit: true,
      iterations: 1000,
      updateInterval: 25
    },
  },
  nodes: {
    shadow: {
      enabled: true,
    },
    mass: 1,
    color: "#7CFC00",
    shape: 'icon',
    icon: {
      size: 30,
    },
  },
  interaction: {
    hideEdgesOnDrag: true,
    hoverConnectedEdges: true,
    hideEdgesOnZoom: true,
    hover: true,
    navigationButtons: true,
    zoomSpeed: 0.37,
    tooltipDelay: 100,
  },
  edges: {
    hoverWidth: 5,
    selectionWidth: 5,
    selfReference: {
      renderBehindTheNode: true,
      angle: 22,
      size: 130,
    },
    smooth: {
      enabled: true,
      type: "continuous",
      roundness: 0.5,
    }
  }
};
HTTP Traffic Visualization (by Single IP)

Conclusion

In the end, we built a solution that lets us see the network structure and, when something goes wrong, which applications are generating errors and which ones are affected.
Thank you for your time, and see you in the next article.
