Its data source is the Kubernetes Metrics API, which also powers the kubectl top command, and which is in turn backed by data from the metrics-server component. This component runs in your cluster and is installed by default on GKE, AKS, Civo, and k3s clusters, but it needs to be installed manually on many others, such as DigitalOcean, EKS, and Linode.
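If you need to install metrics-server yourself, a common approach is to apply the project's release manifest and then verify that the Metrics API responds; this is a sketch based on the metrics-server project's documented install path, so check the manifest URL against the version you want:

```shell
# Install metrics-server from its official release manifest.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Verify it is serving data; both commands read from the Metrics API.
kubectl top nodes
kubectl top pods -A
```

If kubectl top returns numbers instead of an error, the HPA has the data it needs.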
The HPA resource is reasonably well documented in the Kubernetes documentation. Some confusion arises from blog posts out there showcasing different Kubernetes API versions: keep in mind that autoscaling/v2 is not backwards compatible with autoscaling/v1!
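To make the incompatibility concrete, here is a minimal sketch of the same 50% CPU target expressed in both API versions (the web-app names are placeholders): v1 has a single CPU-only field, while v2 moves metrics into a list.

```yaml
# autoscaling/v1: CPU is the only supported metric, set via one field.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
---
# autoscaling/v2: metrics became a list, so multiple metric types are possible.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```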
In KEDA, you create a ScaledObject custom resource with the necessary information about the deployment you want to scale, then define the trigger event, which can be based on CPU and memory usage or on custom metrics. KEDA has premade triggers for almost anything you may want to scale on, with a YAML structure that, in our view, is what the Kubernetes API could have looked like in the first place.
To autoscale your application with KEDA, you need to define a ScaledObject resource.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-based-scaledobject
  namespace: default
spec:
  minReplicaCount: 1
  maxReplicaCount: 10
  scaleTargetRef:
    kind: Deployment
    name: test-app-deployment
  triggers:
  - type: cpu
    metricType: Utilization
    metadata:
      value: "50"
scaleTargetRef is where you refer to your deployment, and triggers is where you define the metrics and thresholds that will trigger the scaling.
In this sample we trigger based on CPU usage: the ScaledObject will manage the number of replicas automatically for you and maintain a maximum of 50% CPU usage per pod.
As usual with Kubernetes custom resources, you can kubectl get and kubectl describe the resource once you have deployed it on the cluster.
$ kubectl get scaledobject
NAME                     SCALETARGETKIND      SCALETARGETNAME       MIN   MAX   TRIGGERS   READY   ACTIVE
cpu-based-scaledobject   apps/v1.Deployment   test-app-deployment   1     10    cpu        True    True
To get an in-depth understanding of what is happening in the background, you can check the logs of the KEDA operator pod, and you can also kubectl describe the HPA resource that KEDA created.
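As a sketch, assuming KEDA was installed into the keda namespace (the default for the official Helm chart) and the ScaledObject from above:

```shell
# Tail the KEDA operator logs.
kubectl logs -n keda deployment/keda-operator

# KEDA names the generated HPA keda-hpa-<scaledobject-name>.
kubectl describe hpa keda-hpa-cpu-based-scaledobject
```

The describe output shows the current metric values and the scaling events the HPA has taken, which is usually the fastest way to see why a scale-up did or did not happen.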
To use custom metrics, you need to make changes to the triggers section.
triggers:
- type: prometheus
  metadata:
    serverAddress: http://<prometheus-host>:9090
    metricName: http_requests_total # name to identify the metric; the generated value would be `prometheus-http_requests_total`
    query: sum(rate(http_requests_total{deployment="my-deployment"}[2m])) # query must return a single-element vector/scalar response
    threshold: '100.50'
    activationThreshold: '5.5'
triggers:
- type: rabbitmq
  metadata:
    host: amqp://localhost:5672/vhost
    mode: QueueLength # QueueLength or MessageRate
    value: "100" # message backlog or publish/sec target per instance
    queueName: testqueue
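In practice you usually don't want the AMQP connection string (with its credentials) inlined in the ScaledObject. KEDA's TriggerAuthentication resource lets the trigger read it from a Secret instead; the resource names below are placeholders for illustration:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-conn
data:
  host: <base64-encoded amqp://user:pass@rabbitmq:5672/vhost>
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
spec:
  secretTargetRef:
  - parameter: host       # feeds the scaler's host parameter
    name: rabbitmq-conn   # Secret name
    key: host             # key within the Secret
```

The trigger then drops host from its metadata and references the authentication with authenticationRef: name: rabbitmq-trigger-auth.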
Originally by Youcef Guichi and Laszlo Fogas at .