Notice: The original article comes from https://blog.mygraphql.com/en/posts/low-tec/trace/trace-istio/trace-istio-part4/ . If the pictures are unclear, please refer to the original article.
There is a Chinese version too.
Why
I can’t believe that I actually persisted in writing Part 4. Believe it or not, the “Why” section of each part is the hardest thing to write :) . If you’re reading this series for the first time, don’t worry: each part is relatively independent.
Most likely you have seen Envoy’s features described elsewhere like this:
- Written in C++, native and close to the lower level, no ‘GC stop the world’, so excellent performance
- Asynchronous, event-driven, multiplexed I/O: a perfect solution to the C10k problem
- Because a single thread is responsible for multiple connections, the memory overhead of a large number of threads and the CPU overhead of context switching are reduced when there are many connections.
These descriptions are reasonable, of course. But many things that look beautiful from a distance reveal a lot of interesting and valuable detail once magnified. I believe that if we dive deep, it is always possible to make some meaningful optimizations for our actual operating environment and traffic characteristics. It may be just a change to the configuration of Envoy or the kernel, or a one-line change to Envoy’s code, or a change to your app’s behavior, such as the size of the buffer passed to each socket write.
All of these require an understanding of the implementation details, unless you feel lucky or are experienced enough to guess right.
[BPF tracing Istio/Envoy] series
Here is a preview before we begin. The [BPF tracing Istio/Envoy] series includes (or will include):
- Part 1: Getting Started (Chinese)
- Part 2: Booting, listening, and load-balance of threads (Chinese)
- Part 3: Downstream connection accept, TLS handshake, and filter chain selection (Chinese)
- Part 4: Upstream/Downstream Event-Driven Collaboration of Envoy@Istio (This article)
- Part 5: L3/4 Network Filter interaction
- Part 6: HTTP filter
- Part 7: HTTP router
- Part 8: cluster/connection pool and outbound load balance
In this series, I’ll show you how to use bpftrace to "read" the object data in the memory of a running Envoy process, which is written in C++11. In order not to scare people away, I try to show more pictures and less code, although some diagrams are a little complicated. Uncle Programmer is about to start telling stories. 🚜
High-level process of HTTP reverse proxy
The overall process of socket event-driven HTTP reverse proxy:
As you can see in the diagram, four types of events drive the entire process. The next few sections analyze them one by one.
To avoid getting lost in the details of each step at once, let’s take a look at the overall flow of all the steps:
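The four event types can also be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Envoy code: two `socket.socketpair()` pairs stand in for the downstream and upstream TCP connections, the handler names are made up for illustration, and the upstream server is faked inline. It only shows how each event arms the next one.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
log = []      # order in which the four proxy events fire
state = {}    # request/response bytes carried between events

# Socket pairs stand in for the real connections (assumption of this sketch).
client, downstream = socket.socketpair()   # client <-> proxy
upstream, server = socket.socketpair()     # proxy <-> upstream server
for s in (downstream, upstream):
    s.setblocking(False)

def on_downstream_readable():
    state["request"] = downstream.recv(4096)
    log.append("downstream readable")
    sel.unregister(downstream)             # stop reading from downstream
    sel.register(upstream, selectors.EVENT_WRITE, on_upstream_writable)

def on_upstream_writable():
    upstream.sendall(state["request"])
    log.append("upstream writable")
    sel.modify(upstream, selectors.EVENT_READ, on_upstream_readable)
    # Fake upstream server: consume the request and answer immediately.
    server.recv(4096)
    server.sendall(b"HTTP/1.1 200 OK\r\ncontent-length: 0\r\n\r\n")

def on_upstream_readable():
    state["response"] = upstream.recv(4096)
    log.append("upstream readable")
    sel.unregister(upstream)
    sel.register(downstream, selectors.EVENT_WRITE, on_downstream_writable)

def on_downstream_writable():
    downstream.sendall(state["response"])
    log.append("downstream writable")
    sel.unregister(downstream)

sel.register(downstream, selectors.EVENT_READ, on_downstream_readable)
client.sendall(b"GET / HTTP/1.1\r\nhost: example\r\n\r\n")

# Single-threaded event loop: run until no socket is registered any more.
while sel.get_map():
    for key, _ in sel.select(timeout=1):
        key.data()

reply = client.recv(4096)
print(log)
```

Each handler registers interest only in the next event it needs, which is the collaboration pattern the following sections walk through in detail.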
Downstream Read Request collaboration
Explain the process at a high level:
- Downstream socket readable callback. Http::ConnectionManagerImpl reads data from the downstream socket and incrementally feeds it into Http1::ConnectionImpl.
- Http1::ConnectionImpl calls nghttp2 to incrementally parse the HTTP request.
- If nghttp2 believes that the HTTP request has been read completely, it calls Http::ServerConnection::onMessageCompleteBase().
- Http::ServerConnection::onMessageCompleteBase() stops downstream ReadReady listening.
- Http::ServerConnection::onMessageCompleteBase() calls Http::FilterManager to initiate the decodeHeaders iteration of the http filter chain.
- In general, the last http filter of the http filter chain is Router::Filter, and finally, Router::Filter::decodeHeaders() is called.
- The logic of Router::Filter::decodeHeaders() is shown in the next figure.
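The read path above can be imitated with a small Python model. It is a sketch under assumptions, not Envoy's implementation: `RequestParser` is a toy stand-in for the real HTTP codec, `run_decode_headers` and the two filter functions are hypothetical names, and a socket pair replaces the downstream connection. It shows the three ideas in the list: incremental parsing, stopping read-ready listening once the message is complete, and walking a filter chain that ends in a router.

```python
import selectors
import socket

class RequestParser:
    """Toy incremental parser: the request is complete once the blank line arrives."""
    def __init__(self):
        self.buffer = b""
        self.headers = None

    def feed(self, chunk):
        self.buffer += chunk
        if b"\r\n\r\n" not in self.buffer:
            return False                   # need more data
        head = self.buffer.split(b"\r\n\r\n", 1)[0].decode("ascii")
        lines = head.split("\r\n")
        self.headers = {":request-line": lines[0]}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            self.headers[name.strip().lower()] = value.strip()
        return True

def run_decode_headers(filters, headers):
    """Call each filter in order; a filter returning False stops the iteration."""
    for http_filter in filters:
        if not http_filter(headers):
            break

calls = []

def logging_filter(headers):               # hypothetical intermediate filter
    calls.append("logging")
    return True

def router_filter(headers):                # last filter in the chain
    calls.append("router:" + headers["host"])
    return True

sel = selectors.DefaultSelector()
client, downstream = socket.socketpair()
downstream.setblocking(False)
parser = RequestParser()
stats = {"reads": 0}

def on_readable():
    stats["reads"] += 1
    if parser.feed(downstream.recv(4096)):
        sel.unregister(downstream)         # message complete: stop read-ready listening
        run_decode_headers([logging_filter, router_filter], parser.headers)

sel.register(downstream, selectors.EVENT_READ, on_readable)

# The request arrives in two pieces, so the readable callback fires twice.
for piece in (b"GET /a HTTP/1.1\r\nhost: svc", b"-b\r\n\r\n"):
    client.sendall(piece)
    for key, _ in sel.select(timeout=1):
        key.data()

print(stats["reads"], calls)
```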
Explain the process:
- Router::Filter::decodeHeaders() of Router::Filter is called.
- Select a cluster according to the configured Router rules.
- If the Cluster connection pool object does not exist, create a new one.
- Create a new Envoy::Router::UpstreamRequest object.
- Call Envoy::Router::UpstreamRequest::encodeHeaders(bool end_stream) to encode the HTTP headers.
- After a series of load balancing algorithms, an upstream host (endpoint) is selected.
- If the connection pool has no available connection to the selected upstream host, then:
  - Open a new socket fd (not connected).
  - Register the WriteReady/Connected event for the upstream socket FD. Prepare to write the upstream request when the event callback occurs.
  - Initiate an asynchronous connection to the upstream host with the socket fd.
- Associate the downstream fd and the upstream fd.
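The connection-pool branch of these steps can be sketched with plain non-blocking sockets. Everything named here is an assumption for illustration: `idle_connections`, `fd_pairs`, and `get_upstream` are hypothetical, a loopback listener plays the upstream host, and the downstream fd is just a number. The point is the order of operations: create the fd, register for write-ready, start the connect without waiting, then record which downstream the upstream fd belongs to.

```python
import errno
import selectors
import socket

sel = selectors.DefaultSelector()

# Stand-in upstream host: a listener on an ephemeral loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)
upstream_host = listener.getsockname()

idle_connections = {}   # hypothetical pool: host -> list of idle connected sockets
fd_pairs = {}           # upstream fd -> downstream fd

def on_upstream_writable(sock):
    """Placeholder callback; the write path is a separate step."""

def get_upstream(host, downstream_fd):
    idle = idle_connections.setdefault(host, [])   # create the pool on first use
    if idle:
        sock, reused = idle.pop(), True
    else:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # new, unconnected fd
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_WRITE, on_upstream_writable)
        rc = sock.connect_ex(host)         # returns at once, connect continues in the kernel
        if rc not in (0, errno.EINPROGRESS):
            raise OSError(rc, "connect failed")
        reused = False
    fd_pairs[sock.fileno()] = downstream_fd        # associate downstream and upstream
    return sock, reused

downstream_fd = 42      # made-up fd number of an accepted downstream socket
sock, reused = get_upstream(upstream_host, downstream_fd)
print(reused, fd_pairs)
```

The call returns before the TCP handshake is known to have finished; the result only shows up later as a write-ready event on the registered fd.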
Upstream Write Request Collaboration
Explain the process:
- First upstream socket write-ready callback.
  - Detect that the callback event type is a successful connection, then associate the upstream socket with ConnectionPool::ActiveClient.
- Second upstream socket write-ready callback.
  - Detect that the callback event type is that the connection is writable, then write the upstream HTTP request.
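The two write-ready callbacks can be reproduced with a non-blocking connect in Python. This is a sketch, not Envoy's code: it relies on the `selectors` module being level-triggered (a still-writable socket is reported again on the next poll), `active_clients` is a hypothetical stand-in for the pool's client list, and a loopback listener plays the upstream host.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)

upstream = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
upstream.setblocking(False)
upstream.connect_ex(listener.getsockname())   # asynchronous connect
sel.register(upstream, selectors.EVENT_WRITE)

active_clients = []     # hypothetical list of connected pool clients
request = b"GET / HTTP/1.1\r\nhost: upstream\r\n\r\n"
steps = []
connected = False

while "request written" not in steps:
    events = sel.select(timeout=1)
    if not events:
        raise TimeoutError("no write-ready event")
    for key, _ in events:
        if not connected:
            # First write-ready: the connect attempt finished, check how it ended.
            err = upstream.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                raise OSError(err, "connect failed")
            connected = True
            active_clients.append(upstream)
            steps.append("connected")
        else:
            # Second write-ready: the connection is writable, send the request.
            upstream.sendall(request)
            steps.append("request written")
            sel.unregister(upstream)

peer, _ = listener.accept()
received = peer.recv(4096)
print(steps)
```

Checking `SO_ERROR` on the first event is what distinguishes "connected" from "connect failed"; only after that does a write-ready event mean "you may send data".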
Upstream Read Response Collaboration
Downstream Write Response Collaboration
bpftrace output
The figures above are based not only on the source code, but also on the output of the bpftrace script and its tracepoints. The principle of the bpftrace script is:
- Record the downstream FD (the file descriptor of the socket), which can be thought of as the socket id in the process.
- Add kernel tracepoint and application uprobe probes. Record the input, output and stack of each probe.
- Associate the downstream FD with the upstream FD.
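The FD association idea can be illustrated by post-processing trace records in Python. The record layout below is entirely hypothetical (it is not the output format of the real script): each record is `(time_us, probe, fields)`, and the `connect` record is assumed to carry the downstream fd it was made for. The sketch groups all probe hits of one request under its downstream fd.

```python
# Hypothetical trace records: (time_us, probe, fields).
events = [
    (100, "accept", {"fd": 40}),
    (120, "read", {"fd": 40, "bytes": 78}),
    (150, "connect", {"fd": 41, "downstream_fd": 40}),
    (180, "write", {"fd": 41, "bytes": 96}),
    (300, "read", {"fd": 41, "bytes": 210}),
    (320, "write", {"fd": 40, "bytes": 198}),
    (330, "read", {"fd": 7, "bytes": 12}),    # unrelated fd, ignored
]

def build_timelines(events):
    downstream_of = {}   # upstream fd -> downstream fd
    timelines = {}       # downstream fd -> ordered list of (side, probe)
    for _, probe, fields in sorted(events, key=lambda e: e[0]):
        fd = fields["fd"]
        if probe == "accept":
            timelines[fd] = []                # start tracking a new downstream fd
            continue
        if probe == "connect" and fields.get("downstream_fd") in timelines:
            downstream_of[fd] = fields["downstream_fd"]
        if fd in timelines:
            timelines[fd].append(("downstream", probe))
        elif fd in downstream_of:
            timelines[downstream_of[fd]].append(("upstream", probe))
    return timelines

timelines = build_timelines(events)
for fd, timeline in timelines.items():
    print(fd, timeline)
```

With the sample records, every event on fd 40 and fd 41 ends up in one timeline keyed by 40, while the read on fd 7 is dropped because it belongs to no tracked downstream.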
Of course, there are many details, but I am not going to go through them one by one. Anyone who wants to know more can contact me to discuss. The bpftrace script in the next section has more detail.
bpftrace script
End
This part studied the main process of Envoy as a reverse proxy from a socket event-driven perspective. I think I learned something. How about you?
This is an old photo taken a few years ago.