Table of Contents
- Configuration
- Walkthrough
Description
This example shows how to configure circuit breaking in Istio, how to test it, and how to trigger the configured limits.
Based on
This example is based on the circuit breaking example from the official Istio documentation.
Fortio
Fortio is a load-testing tool that lets you send requests while keeping some control over them, such as the number of connections, the request rate, and the total number of requests.
This is useful here because we will try to reach the limits set in the DestinationRule configuration.
Configuration
Service
The service listens on port 8080 and forwards incoming traffic to port 80 of the deployment, which hosts an HTTP service.
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  labels:
    app: helloworld
    service: helloworld
spec:
  ports:
    - port: 8080
      name: http
      targetPort: 80
      protocol: TCP
      appProtocol: http
  selector:
    app: helloworld
Deployment
The deployment listens on ports 80 and 443, hosting an HTTP and an HTTPS service respectively.
Note:
For more information about the image used (oriolfilter/https-nginx-demo), refer to its documentation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
  labels:
    app: helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: helloworld
          image: oriolfilter/https-nginx-demo
          resources:
            requests:
              cpu: "100m"
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
            - containerPort: 443
Destination rule
This destination rule sets limits on the traffic whose destination host is helloworld.default.svc.cluster.local.
- connectionPool.tcp.maxConnections: Maximum number of simultaneous connections to the destination host. It is set to 1 here.
- connectionPool.http.http1MaxPendingRequests: Maximum number of requests that can be queued while waiting for a ready connection.
- connectionPool.http.maxRequestsPerConnection: Maximum number of requests sent over a single connection to the backend. Setting it to 1 (our scenario) disables keep-alive.
- outlierDetection.consecutive5xxErrors: Number of consecutive 5xx status codes required before a host is ejected from the connection pool.
- outlierDetection.interval: Time between each ejection analysis.
- outlierDetection.baseEjectionTime: Minimum time that a host stays ejected from the connection pool.
- outlierDetection.maxEjectionPercent: Maximum percentage of hosts that can be ejected. As it is set to 100% and the deployment has only 1 replica, whenever the rule is triggered the only host can be removed from the connection pool, leaving every host ejected.
Note:
For more information regarding DestinationRules and their configuration fields, refer to the official Istio documentation regarding DestinationRules.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: helloworld.default.svc.cluster.local
spec:
  host: helloworld.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
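Once the rule is applied, it can help to read the limits back from the cluster to confirm they match the manifest. The following is a minimal sketch, assuming the rule lives in the default namespace; the helper name show_limits is hypothetical and not part of the original example.

```shell
# Hypothetical helper: print the circuit-breaking settings of a DestinationRule.
# Defaults to the rule name used in this example and to the "default" namespace
# (the namespace is an assumption).
show_limits() {
  rule="${1:-helloworld.default.svc.cluster.local}"
  ns="${2:-default}"
  # Print the connectionPool and outlierDetection sections, one per line.
  kubectl get destinationrule "$rule" -n "$ns" \
    -o jsonpath='{.spec.trafficPolicy.connectionPool}{"\n"}{.spec.trafficPolicy.outlierDetection}{"\n"}'
}
```

Calling show_limits without arguments queries the rule from this example; the exact formatting of the printed sections depends on the kubectl version.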
Walkthrough
Deploy resources
kubectl apply -f ./
service/helloworld created
deployment.apps/helloworld created
service/fortio created
deployment.apps/fortio-deploy created
destinationrule.networking.istio.io/helloworld.default.svc.cluster.local created
Test deployments
helloworld.default.svc.cluster.local
Check connectivity from fortio to helloworld
We will use the binary /usr/bin/fortio to send a curl request to the helloworld service.
If it doesn't work, ensure that the URL and IP are correct and that no AuthorizationPolicy is limiting the traffic. Also ensure that the deployments are ready (kubectl get deployments -w -n default -owide).
kubectl exec -n default "$(kubectl get pod -n default -l app=fortio -o jsonpath={.items..metadata.name})" -- /usr/bin/fortio curl -quiet helloworld.default.svc.cluster.local:8080
HTTP/1.1 200 OK
server: envoy
date: Sat, 29 Apr 2023 04:12:37 GMT
content-type: text/html
content-length: 15
last-modified: Tue, 25 Apr 2023 00:47:17 GMT
etag: "64472315-f"
strict-transport-security: max-age=7200
accept-ranges: bytes
x-envoy-upstream-service-time: 100
<h2>Howdy</h2>
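The readiness check and the pod lookup used above can be wrapped in two small helpers, so that later commands do not have to repeat them. This is only a sketch: the function names wait_ready and fortio_pod and the 120s timeout are assumptions, while the deployment names and the app=fortio label come from the resources deployed earlier.

```shell
# Hypothetical helper: block until both deployments report a completed rollout,
# or fail once the (assumed) 120s timeout expires.
wait_ready() {
  for deploy in helloworld fortio-deploy; do
    kubectl rollout status "deployment/$deploy" -n default --timeout=120s || return 1
  done
}

# Hypothetical helper: resolve the name of the first Fortio pod.
fortio_pod() {
  kubectl get pod -n default -l app=fortio -o jsonpath='{.items[0].metadata.name}'
}
```

With these in place, the connectivity check could be written as: wait_ready && kubectl exec -n default "$(fortio_pod)" -- /usr/bin/fortio curl -quiet helloworld.default.svc.cluster.local:8080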
Perform a stress test of the resources deployed
Through the Fortio container, we will execute the command fortio load -c 2 -qps 0 -n 20:
- fortio: The binary used.
- load: Runs a load test and gathers statistics.
- -c 2: Number of simultaneous connections.
- -qps 0: Queries per second. A value of 0 means there is no limit, so requests are sent as fast as possible.
- -n 20: Send 20 queries.
Note:
For more information regarding the available command options, refer to the Fortio documentation on GitHub.
kubectl exec -n default "$(kubectl get pod -n default -l app=fortio -o jsonpath={.items..metadata.name})" -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning helloworld.default.svc.cluster.local:8080
04:16:28 I logger.go:183> Log level is now 3 Warning (was 2 Info)
Fortio 1.54.2 running at 0 queries per second, 8->8 procs, for 20 calls: helloworld.default.svc.cluster.local:8080
04:16:28 W http_client.go:170> Assuming http:// on missing scheme for 'helloworld.default.svc.cluster.local:8080'
Starting at max qps with 2 thread(s) [gomax 8] for exactly 20 calls (10 per thread + 0)
04:16:28 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
04:16:28 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
Ended after 259.87612ms : 20 calls. qps=76.96
Aggregated Function Time : count 20 avg 0.02460594 +/- 0.02677 min 0.0010981 max 0.090700113 sum 0.492118798
# range, mid point, percentile, count
>= 0.0010981 <= 0.002 , 0.00154905 , 15.00, 3
> 0.002 <= 0.003 , 0.0025 , 20.00, 1
> 0.003 <= 0.004 , 0.0035 , 25.00, 1
> 0.005 <= 0.006 , 0.0055 , 30.00, 1
> 0.007 <= 0.008 , 0.0075 , 40.00, 2
> 0.011 <= 0.012 , 0.0115 , 45.00, 1
> 0.012 <= 0.014 , 0.013 , 55.00, 2
> 0.016 <= 0.018 , 0.017 , 60.00, 1
> 0.018 <= 0.02 , 0.019 , 65.00, 1
> 0.025 <= 0.03 , 0.0275 , 75.00, 2
> 0.03 <= 0.035 , 0.0325 , 80.00, 1
> 0.05 <= 0.06 , 0.055 , 85.00, 1
> 0.07 <= 0.08 , 0.075 , 95.00, 2
> 0.09 <= 0.0907001 , 0.0903501 , 100.00, 1
# target 50% 0.013
# target 75% 0.03
# target 90% 0.075
# target 99% 0.0905601
# target 99.9% 0.0906861
Error cases : count 8 avg 0.012249498 +/- 0.01797 min 0.0010981 max 0.054279663 sum 0.097995986
# range, mid point, percentile, count
>= 0.0010981 <= 0.002 , 0.00154905 , 37.50, 3
> 0.002 <= 0.003 , 0.0025 , 50.00, 1
> 0.003 <= 0.004 , 0.0035 , 62.50, 1
> 0.005 <= 0.006 , 0.0055 , 75.00, 1
> 0.025 <= 0.03 , 0.0275 , 87.50, 1
> 0.05 <= 0.0542797 , 0.0521398 , 100.00, 1
# target 50% 0.003
# target 75% 0.006
# target 90% 0.0508559
# target 99% 0.0539373
# target 99.9% 0.0542454
# Socket and IP used for each connection:
[0] 6 socket used, resolved to 10.99.49.188:8080, connection timing : count 6 avg 0.00037803967 +/- 0.0001574 min 0.000231869 max 0.00069415 sum 0.002268238
[1] 4 socket used, resolved to 10.99.49.188:8080, connection timing : count 4 avg 0.00045579175 +/- 9.155e-05 min 0.0003847 max 0.000612777 sum 0.001823167
Connection time (s) : count 10 avg 0.0004091405 +/- 0.0001403 min 0.000231869 max 0.00069415 sum 0.004091405
Sockets used: 10 (for perfect keepalive, would be 2)
Uniform: false, Jitter: false, Catchup allowed: true
IP addresses distribution:
10.99.49.188:8080: 10
Code 200 : 12 (60.0 %)
Code 503 : 8 (40.0 %)
Response Header Sizes : count 20 avg 167.7 +/- 136.9 min 0 max 280 sum 3354
Response Body/Total Sizes : count 20 avg 273.1 +/- 26.21 min 241 max 295 sum 5462
All done 20 calls (plus 0 warmup) 24.606 ms avg, 77.0 qps
Information to highlight
From the output received, I would like to focus on the following entry, which states that 60% of the requests were successful (returning status code 200), while 40% failed (returning status code 503).
Code 200 : 12 (60.0 %)
Code 503 : 8 (40.0 %)
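When repeating the test with different settings, the share of rejected requests can be extracted from the Fortio output instead of being read by eye. The sketch below assumes the "Code NNN : count (percent %)" line format shown above; the function name summarize_codes is hypothetical, and the sample lines are copied from the run in this section.

```shell
# Sample lines copied from the Fortio output of this run.
fortio_output='Code 200 : 12 (60.0 %)
Code 503 : 8 (40.0 %)'

# Hypothetical helper: sum the request counts per status code and
# report how many of them were 503 responses.
summarize_codes() {
  awk '
    $1 == "Code" && $3 == ":" {
      total += $4
      if ($2 == "503") rejected += $4
    }
    END {
      if (total > 0)
        printf "503 responses: %d of %d (%.1f%%)\n", rejected, total, 100 * rejected / total
    }
  '
}

printf '%s\n' "$fortio_output" | summarize_codes
# prints: 503 responses: 8 of 20 (40.0%)
```

The same function can read the live output by piping the kubectl exec command into it, since lines that do not start with "Code" are ignored.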
Check Fortio istio-proxy stats (pilot-agent request GET stats)
Query the statistics of the istio-proxy sidecar running in the Fortio pod:
kubectl exec "$(kubectl get pod -n default -l app=fortio -o jsonpath={.items..metadata.name})" -c istio-proxy -- pilot-agent request GET stats | grep helloworld | grep pending
cluster.outbound|8080||helloworld.default.svc.cluster.local.circuit_breakers.default.remaining_pending: 1
cluster.outbound|8080||helloworld.default.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
cluster.outbound|8080||helloworld.default.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_overflow: 3
cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_total: 18
The field upstream_rq_pending_overflow has a value of 3, which means that 3 requests were flagged for circuit breaking.
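To pick a single counter out of the stats instead of scanning the whole list, the "name: value" lines can be filtered with awk. This is a minimal sketch: the function name stat_value is hypothetical, and the sample lines are copied from the output above.

```shell
# Sample lines copied from the pilot-agent stats output above.
stats='cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_overflow: 3
cluster.outbound|8080||helloworld.default.svc.cluster.local.upstream_rq_pending_total: 18'

# Hypothetical helper: print the value of every stat whose name ends with ".<name>".
stat_value() {
  awk -v name="$1" -F': ' '
    substr($1, length($1) - length(name)) == ("." name) { print $2 }
  '
}

printf '%s\n' "$stats" | stat_value upstream_rq_pending_overflow
# prints: 3
```

The live output of pilot-agent request GET stats can be piped into stat_value in the same way.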
helloworld.default
Test destination URL helloworld.default
This is the same procedure as the step Perform a stress test of the resources deployed, but instead of the destination URL helloworld.default.svc.cluster.local we will use the URL helloworld.default, to confirm whether the destination rule is still applied when the full URL doesn't match.
kubectl exec -n default "$(kubectl get pod -n default -l app=fortio -o jsonpath={.items..metadata.name})" -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning helloworld.default:8080
20:01:10 I logger.go:183> Log level is now 3 Warning (was 2 Info)
Fortio 1.54.2 running at 0 queries per second, 8->8 procs, for 20 calls: helloworld.default:8080
20:01:10 W http_client.go:170> Assuming http:// on missing scheme for 'helloworld.default:8080'
Starting at max qps with 2 thread(s) [gomax 8] for exactly 20 calls (10 per thread + 0)
20:01:10 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [1] Non ok http code 503 (HTTP/1.1 503)
20:01:10 W http_client.go:1058> [0] Non ok http code 503 (HTTP/1.1 503)
Ended after 162.18683ms : 20 calls. qps=123.31
Aggregated Function Time : count 20 avg 0.014704745 +/- 0.01561 min 0.001096933 max 0.044131073 sum 0.294094897
# range, mid point, percentile, count
>= 0.00109693 <= 0.002 , 0.00154847 , 30.00, 6
> 0.003 <= 0.004 , 0.0035 , 35.00, 1
> 0.004 <= 0.005 , 0.0045 , 45.00, 2
> 0.005 <= 0.006 , 0.0055 , 50.00, 1
> 0.006 <= 0.007 , 0.0065 , 55.00, 1
> 0.011 <= 0.012 , 0.0115 , 60.00, 1
> 0.014 <= 0.016 , 0.015 , 65.00, 1
> 0.016 <= 0.018 , 0.017 , 70.00, 1
> 0.018 <= 0.02 , 0.019 , 75.00, 1
> 0.035 <= 0.04 , 0.0375 , 90.00, 3
> 0.04 <= 0.0441311 , 0.0420655 , 100.00, 2
# target 50% 0.006
# target 75% 0.02
# target 90% 0.04
# target 99% 0.043718
# target 99.9% 0.0440898
Error cases : count 11 avg 0.0029070851 +/- 0.001827 min 0.001096933 max 0.006039404 sum 0.031977936
# range, mid point, percentile, count
>= 0.00109693 <= 0.002 , 0.00154847 , 54.55, 6
> 0.003 <= 0.004 , 0.0035 , 63.64, 1
> 0.004 <= 0.005 , 0.0045 , 81.82, 2
> 0.005 <= 0.006 , 0.0055 , 90.91, 1
> 0.006 <= 0.0060394 , 0.0060197 , 100.00, 1
# target 50% 0.00190969
# target 75% 0.004625
# target 90% 0.0059
# target 99% 0.00603507
# target 99.9% 0.00603897
# Socket and IP used for each connection:
[0] 6 socket used, resolved to 10.98.152.137:8080, connection timing : count 6 avg 0.00047558633 +/- 0.0001364 min 0.000294869 max 0.000739941 sum 0.002853518
[1] 7 socket used, resolved to 10.98.152.137:8080, connection timing : count 7 avg 0.000457311 +/- 0.0001076 min 0.000320826 max 0.000596445 sum 0.003201177
Connection time (s) : count 13 avg 0.00046574577 +/- 0.0001221 min 0.000294869 max 0.000739941 sum 0.006054695
Sockets used: 13 (for perfect keepalive, would be 2)
Uniform: false, Jitter: false, Catchup allowed: true
IP addresses distribution:
10.98.152.137:8080: 13
Code 200 : 9 (45.0 %)
Code 503 : 11 (55.0 %)
Response Header Sizes : count 20 avg 125.9 +/- 139.2 min 0 max 280 sum 2518
Response Body/Total Sizes : count 20 avg 265.2 +/- 26.76 min 241 max 295 sum 5304
All done 20 calls (plus 0 warmup) 14.705 ms avg, 123.3 qps
As we can see, the rules are still being applied, this time with 45% of the requests receiving a successful status code (200), while 55% received a failure status code (503).
This confirms that the DestinationRule is still enforced even when the URL used differs from the host configured in the DestinationRule.
Cleanup
kubectl delete -f ./
deployment.apps "helloworld" deleted
destinationrule.networking.istio.io "helloworld.default.svc.cluster.local" deleted
service "fortio" deleted
deployment.apps "fortio-deploy" deleted
service "helloworld" deleted