Using Istio to Manage Retry and Timeout for External Calls Over TLS

Istio lets us set timeouts and retries for calls to an external API without writing code or changing the existing system.

Hamed Abdollahpour
Trendyol Tech
4 min read · Sep 12, 2022


In this article, I'll show you how to manage traffic in the Istio service mesh and apply timeout and retry patterns to API calls, including calls to external services over a TLS-encrypted connection.

Why do we need timeouts and retries?

In distributed systems, retrying is an important pattern for recovering from transient errors. Most of these errors are self-healing, for example network failures or an overloaded service. Timeouts matter for a similar reason. Any service we use, internal or external, could go into an unknown state and stop responding, or become so slow that it eventually causes a timeout somewhere in the chain of calls the request passes through.
If our system has external dependencies, such as third-party APIs, this becomes even more important: those services are outside our control, and we have no monitoring on them.

How is accessing an API over TLS different from plain HTTP?

You can keep TLS inside your service and use Istio (the Envoy proxy) only to route the traffic to its destination, but then you lose many of Istio's powerful features. When an encrypted message passes through the proxy, the proxy has no access to the content of the HTTP request or response, so it cannot inspect or change it. Falling back to plain HTTP end to end is not an option either, because of the security concerns. One solution is to originate TLS on the Envoy proxy rather than in your service: your service talks to the Envoy proxy over plain HTTP, and Envoy handles all the TLS encryption for you.

But does this introduce any vulnerabilities?
The answer is no. The containers (your service and the proxy) run inside the same Pod and communicate over the loopback network interface, which is implemented entirely within the operating system's networking software and passes no packets to any network interface controller. Your service and the Envoy proxy communicate within this isolated network, and from the outside the traffic is still on TLS. The only difference is that TLS origination and termination happen on the Envoy proxy rather than in your service.

Set up external access on Istio

Let's set up this model. The first step is to define a service entry, which adds an entry to the service registry that Istio maintains internally. Configuring service entries lets you manage traffic for services running outside the mesh as if they were inside it, just like one of your own services. Here is an example:
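A minimal sketch of such a service entry could look like the following. The resource name `external-api` is a placeholder, `your-public-domain.com` stands in for the real external host, and the `apiVersion` may need adjusting to match your Istio release. The `targetPort` on the HTTP port is what lets plain HTTP traffic sent to port 80 be forwarded to port 443 on the external host.

```yaml
# Sketch only: resource name and host are placeholders.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-api            # hypothetical name
spec:
  hosts:
  - your-public-domain.com      # the external API host
  location: MESH_EXTERNAL       # the service runs outside the mesh
  resolution: DNS               # resolve the host through DNS
  ports:
  - number: 80
    name: http
    protocol: HTTP
    targetPort: 443             # forward port 80 traffic to 443 on the host
  - number: 443
    name: https
    protocol: HTTPS
```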

Now we have set up Istio to open access to this external service on port 80 and on port 443 for TLS.

Next, we want to define a traffic policy that tells Istio to send all traffic for the target domain over TLS on port 443.
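One way to express this policy is a DestinationRule that turns on TLS origination for the HTTP port. The sketch below assumes the service entry exposes port 80 with a `targetPort` of 443; the resource name `external-api` is again a placeholder.

```yaml
# Sketch only: assumes port 80 of the service entry targets port 443.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: external-api            # hypothetical name
spec:
  host: your-public-domain.com
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 80              # traffic the application sends as plain HTTP
      tls:
        mode: SIMPLE            # Envoy originates TLS toward the external host
```

With `mode: SIMPLE`, the sidecar opens a regular (non-mutual) TLS connection to the upstream host, so the application itself never has to handle certificates.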

Now everything is in place to treat the external service as part of your mesh. You can define a VirtualService to manage retries and timeouts on the target host. For example:
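A sketch of such a VirtualService is shown here. The three attempts and the 10-second per-try timeout follow the values discussed in this article; the overall `timeout` of 30s and the `retryOn` conditions are illustrative assumptions, and the resource name is a placeholder.

```yaml
# Sketch only: the overall timeout and retryOn values are illustrative.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: external-api            # hypothetical name
spec:
  hosts:
  - your-public-domain.com
  http:
  - timeout: 30s                # assumed overall budget for the request, retries included
    retries:
      attempts: 3               # retry up to 3 times
      perTryTimeout: 10s        # each attempt may take at most 10 seconds
      retryOn: gateway-error,connect-failure,refused-stream
    route:
    - destination:
        host: your-public-domain.com
        port:
          number: 80            # the plain HTTP port; TLS is originated by Envoy
```

The overall `timeout` caps the total time spent on the request, so it is worth choosing it together with `attempts` and `perTryTimeout` rather than in isolation.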

This means that every request sent to port 80 for the host your-public-domain.com is encrypted and forwarded to port 443 over TLS.

The request is retried up to 3 times, with a timeout of 10 seconds per attempt. You can define the conditions under which a retry happens using retryOn. Now that the external service is part of the mesh, we can do much more with Istio's traffic management features. Check out the official Istio website for more.
