Traffic Management for Knative Services

Zhimin Wen
Dec 15, 2019 · 5 min read

With the Knative Service CRD object, we can split traffic among service revisions and thereby implement deployment and release strategies such as canary and blue-green deployment easily. However, we may still have more specific traffic management requirements, such as A/B testing based on an HTTP header, URL, or query parameter.

This paper examines the different options for traffic management, including the default features provided by Knative and some special requirements that can be met with Istio.

The experiment in this paper is based on Knative 0.6 and Istio 1.1.7 on top of OpenShift 3.11.

Traffic splitting with Knative

Using a Golang hello world application, create the following Knative service,

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: app
  namespace: traffic
spec:
  template:
    metadata:
      name: app-v1
    spec:
      containers:
      - image: docker-registry.default.svc:5000/traffic/traffic
        env:
        - name: APP_VERSION
          value: V1
  traffic:
  - tag: one
    revisionName: app-v1
    percent: 100

In this Knative service definition, notice the template.metadata.name field. When we define a name prefixed with the service name, it is used as the revision’s name for the Knative service. I also defined a tag name under traffic.tag.

After that, in the OpenShift environment, the following routes are created by the Knative operator.

oc get route -n istio-system | grep traffic | awk '{print $2}'
app.traffic.apps.ocp.fyre.io.cpak
app-one.traffic.apps.ocp.fyre.io.cpak
one.app.traffic.apps.ocp.fyre.io.cpak

The route hostnames follow these patterns.

  • {{ .appName }}.{{ .namespace }}.{{ .ocpDomainName }}
  • {{ .appName }}-{{ .tagName }}.{{ .namespace }}.{{ .ocpDomainName }}
  • {{ .tagName }}.{{ .appName }}.{{ .namespace }}.{{ .ocpDomainName }}

The first route is the public Knative service URL, where the traffic is managed by the “traffic” stanza in the Knative service definition. The last two routes are private routes, where the traffic is directed to the specific revision Pod.
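The three hostname patterns above are written in Go template syntax, so they can be rendered directly with Go's text/template package. The following is only a sketch for illustration; routePatterns and routeHosts are hypothetical names of mine and are not part of Knative or OpenShift.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// routePatterns holds the three hostname patterns listed in the article.
var routePatterns = []string{
	"{{ .appName }}.{{ .namespace }}.{{ .ocpDomainName }}",
	"{{ .appName }}-{{ .tagName }}.{{ .namespace }}.{{ .ocpDomainName }}",
	"{{ .tagName }}.{{ .appName }}.{{ .namespace }}.{{ .ocpDomainName }}",
}

// routeHosts renders every pattern for the given service, tag, namespace and domain.
func routeHosts(appName, tagName, namespace, ocpDomainName string) ([]string, error) {
	data := map[string]string{
		"appName":       appName,
		"tagName":       tagName,
		"namespace":     namespace,
		"ocpDomainName": ocpDomainName,
	}
	hosts := make([]string, 0, len(routePatterns))
	for _, pattern := range routePatterns {
		tmpl, err := template.New("host").Parse(pattern)
		if err != nil {
			return nil, err
		}
		var sb strings.Builder
		if err := tmpl.Execute(&sb, data); err != nil {
			return nil, err
		}
		hosts = append(hosts, sb.String())
	}
	return hosts, nil
}

func main() {
	hosts, err := routeHosts("app", "one", "traffic", "apps.ocp.fyre.io.cpak")
	if err != nil {
		panic(err)
	}
	for _, h := range hosts {
		fmt.Println(h)
	}
	// Prints:
	// app.traffic.apps.ocp.fyre.io.cpak
	// app-one.traffic.apps.ocp.fyre.io.cpak
	// one.app.traffic.apps.ocp.fyre.io.cpak
}
```

With the tag “one”, the rendered hosts are the same three routes returned by the oc get route command.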

Assume we want to do a canary deployment for the v2 app. The Knative service can be updated with the YAML below,

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: app
  namespace: traffic
spec:
  template:
    metadata:
      name: app-v2
    spec:
      containers:
      - image: docker-registry.default.svc:5000/traffic/traffic
        env:
        - name: APP_VERSION
          value: V2
  traffic:
  - tag: one
    revisionName: app-v1
    percent: 80
  - tag: two
    revisionName: app-v2
    percent: 20

The new revision is named app-v2 and tagged “two”. If we access the public URL, 80% of the traffic goes to revision 1 (app-v1) and 20% goes to revision 2 (app-v2). Meanwhile, we can access each revision directly through its tag URL, such as two.app.traffic.apps.ocp.fyre.io.cpak.
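One way to picture the percentage split is as a weighted pick per request. The sketch below is purely illustrative and is not how Knative or Istio implement the split internally; trafficTarget and pickRevision are hypothetical names, and the roll value stands in for a random number between 0 and 99.

```go
package main

import "fmt"

// trafficTarget mirrors one entry of the "traffic" stanza in the Knative service.
type trafficTarget struct {
	Tag          string
	RevisionName string
	Percent      int
}

// pickRevision returns the revision that owns the given roll (0-99),
// walking the targets in order and accumulating their percentages.
func pickRevision(targets []trafficTarget, roll int) string {
	cumulative := 0
	for _, t := range targets {
		cumulative += t.Percent
		if roll < cumulative {
			return t.RevisionName
		}
	}
	return "" // roll is outside the configured percentages
}

func main() {
	targets := []trafficTarget{
		{Tag: "one", RevisionName: "app-v1", Percent: 80},
		{Tag: "two", RevisionName: "app-v2", Percent: 20},
	}
	counts := map[string]int{}
	// Try every possible roll once instead of drawing random numbers,
	// so the result is deterministic.
	for roll := 0; roll < 100; roll++ {
		counts[pickRevision(targets, roll)]++
	}
	fmt.Println("app-v1:", counts["app-v1"]) // app-v1: 80
	fmt.Println("app-v2:", counts["app-v2"]) // app-v2: 20
}
```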

Route traffic based on HTTP headers

Now let’s say we want to do an A/B test for some target users based on an HTTP header. Though Knative doesn’t provide this feature, we can use Istio’s traffic management capability to achieve it.

  1. Istio Virtual Service

Create an Istio VirtualService, where we filter based on the HTTP headers. If the rule matches, then rewrite the Authority/Host field, re-route the traffic back to the Istio ingress gateway and let it redirect again to the right Pods.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: 00-traffic-vs
  namespace: knative-serving
spec:
  gateways:
  - knative-ingress-gateway
  - mesh
  hosts:
  - pilot.traffic.apps.ocp.fyre.io.cpak
  http:
  - name: pilot-user
    match:
    - headers:
        user:
          exact: pilot
      authority:
        regex: ^pilot\.traffic\.apps\.ocp\.fyre\.io\.cpak$
    rewrite:
      authority: two.app.traffic.apps.ocp.fyre.io.cpak
    retries:
      attempts: 3
      perTryTimeout: 10m0s
    route:
    - destination:
        host: istio-ingressgateway.istio-system.svc.cluster.local
        port:
          number: 80
      weight: 100
    timeout: 10m0s
    websocketUpgrade: true
  - name: non-pilot-user
    match:
    - authority:
        regex: ^pilot\.traffic\.apps\.ocp\.fyre\.io\.cpak$
    rewrite:
      authority: one.app.traffic.apps.ocp.fyre.io.cpak
    retries:
      attempts: 3
      perTryTimeout: 10m0s
    route:
    - destination:
        host: istio-ingressgateway.istio-system.svc.cluster.local
        port:
          number: 80
      weight: 100
    timeout: 10m0s
    websocketUpgrade: true

The Host field in the HTTP header must match pilot.traffic.apps.ocp.fyre.io.cpak for this virtual service to apply. We then define two HTTP route rules. The first is named “pilot-user”, and it has the following two match criteria.

  1. HTTP header must have the key-value pair of user=pilot
  2. HTTP authority/host field must match ^pilot\.traffic\.apps\.ocp\.fyre\.io\.cpak$

If both criteria match (AND logic applies when the criteria sit in the same dictionary/hash, that is, the same list item under match), we rewrite the Authority header to the V2 of the app in the tag format, two.app.traffic.apps.ocp.fyre.io.cpak, then route the traffic back to the Istio ingress gateway at istio-ingressgateway.istio-system.svc.cluster.local and let it re-evaluate which Pod the traffic should be forwarded to.

The 2nd routing rule, named “non-pilot-user”, is a catch-all rule. If the first routing rule doesn’t match, then as long as the Host matches we redirect the traffic to V1 of the app.

Tips:

1. The virtual service has to be in the namespace where the gateway is defined. For Knative, that is the knative-serving namespace.

2. The Pod log of networking-istio in the knative-serving namespace tracks the reconcile result of the definition. It is a good place to debug VirtualService YAML syntax errors.

  2. OpenShift Route

As we are defining a new host in the VirtualService, OpenShift needs to be informed of it. Create a new OpenShift Route in the istio-system namespace, linking the host pilot.traffic.apps.ocp.fyre.io.cpak to the istio-ingressgateway service.

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: traffic-app-pilot-route
  namespace: istio-system
spec:
  host: pilot.traffic.apps.ocp.fyre.io.cpak
  port:
    targetPort: 80
  to:
    kind: Service
    name: istio-ingressgateway
    weight: 100
  wildcardPolicy: None
status:
  ingress: []

Now test the result.

curl http://pilot.traffic.apps.ocp.fyre.io.cpak
Hello World. V1

and if we use the specific header, the result shows V2.

curl -H "user: pilot" http://pilot.traffic.apps.ocp.fyre.io.cpak
Hello World. V2

Route traffic based on HTTP headers with the same Host

In the above example, we created the VirtualService using a different Host. Can we achieve the same A/B test requirement using the original Host created by the Knative service?

The answer is yes. Just follow the same idea as shown in the above section. Define the VirtualService,

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: 01-traffic-vs
  namespace: knative-serving
spec:
  gateways:
  - knative-ingress-gateway
  - mesh
  hosts:
  - app.traffic.apps.ocp.fyre.io.cpak
  http:
  - name: pilot-user
    match:
    - headers:
        user:
          exact: pilot
      authority:
        regex: ^app\.traffic\.(?:.*)$
    rewrite:
      authority: two.app.traffic.apps.ocp.fyre.io.cpak
    retries:
      attempts: 3
      perTryTimeout: 10m0s
    route:
    - destination:
        host: istio-ingressgateway.istio-system.svc.cluster.local
        port:
          number: 80
      weight: 100
    timeout: 10m0s
    websocketUpgrade: true
  - name: non-pilot-user
    match:
    - authority:
        regex: ^app\.traffic\.(?:.*)$
    rewrite:
      authority: one.app.traffic.apps.ocp.fyre.io.cpak
    retries:
      attempts: 3
      perTryTimeout: 10m0s
    route:
    - destination:
        host: istio-ingressgateway.istio-system.svc.cluster.local
        port:
          number: 80
      weight: 100
    timeout: 10m0s
    websocketUpgrade: true

But the VirtualService order is important. As the VirtualService created by Knative and our newly created VirtualService operate on the same Host, the one created earlier takes precedence.

In order for our rules to take precedence over the Knative ones, we have to deploy our VirtualService before Knative creates its own. Redeploy the Knative service in the following order,

  1. Delete the Knative services
  2. Create our new VirtualService
  3. Re-create the Knative services.

Now we can test it again.

curl http://app.traffic.apps.ocp.fyre.io.cpak
Hello World. V1
curl -H "user: pilot" http://app.traffic.apps.ocp.fyre.io.cpak
Hello World. V2
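One detail worth checking in the 01-traffic-vs definition is that the authority regex matches the public host but not the tag hosts we rewrite to. If it matched those too, the request sent back to the ingress gateway would presumably be caught by our rules again. The following Go sketch checks this with the regexp package. Go's regex engine is not the one used by the gateway, so treat the result as an approximation; matchesPublicHost is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// authorityRegex is copied from the match criteria of the 01-traffic-vs VirtualService.
var authorityRegex = regexp.MustCompile(`^app\.traffic\.(?:.*)$`)

// matchesPublicHost reports whether our rules would apply to the given Host.
func matchesPublicHost(host string) bool {
	return authorityRegex.MatchString(host)
}

func main() {
	hosts := []string{
		"app.traffic.apps.ocp.fyre.io.cpak",
		"one.app.traffic.apps.ocp.fyre.io.cpak",
		"two.app.traffic.apps.ocp.fyre.io.cpak",
	}
	for _, host := range hosts {
		fmt.Println(host, matchesPublicHost(host))
	}
}
```

Only the first host matches. The two tag hosts do not, because the regex is anchored at the start of the Host value.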

The same technique applies whenever we want to bring Istio’s traffic management capabilities to Knative services, including matching on the HTTP URL, HTTP headers, query parameters, and so on.
