Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler (HPA) is a Kubernetes component that automatically scales your service based on metrics such as CPU utilization, as exposed through the Kubernetes metrics server. The HPA scales the pods in a deployment or replica set, and it is implemented as both a Kubernetes API resource and a controller. The controller manager queries resource utilization against the metrics specified in each HorizontalPodAutoscaler definition, obtaining them from either the resource metrics API (for per-pod resource metrics) or the custom metrics API (for all other metrics).
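Since the HPA is an ordinary API resource, it can also be defined declaratively instead of through kubectl autoscale. A minimal sketch, where the php-apache target name matches the demo deployment we create below:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50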
We will use Helm to deploy the metrics server, so let's install Helm first:

curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > helm.sh
chmod +x helm.sh
./helm.sh
Now we are going to set up the server-side portion of Helm, called Tiller. This requires a service account:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
Save this as tiller.yml and apply it:

kubectl apply -f tiller.yml
Now run helm init using the Tiller service account we have just created:

helm init --service-account tiller
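Before moving on, it is worth confirming that both the Helm client and Tiller respond; the name=tiller label below is the one Tiller's deployment ships with:

helm version
kubectl get pods -n kube-system -l name=tiller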
With Helm installed, we can now deploy the metrics server. The metrics server is a cluster-wide aggregator of resource usage data: metrics are collected by the kubelet on each worker node and are used to dictate the scaling behavior of deployments. So let's go ahead and install it now:

helm install stable/metrics-server --name metrics-server --version 2.0.4 --namespace metrics
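The metrics API can take a minute or two to come up. Assuming the release landed in the metrics namespace as requested above, you can verify it like so:

kubectl get pods -n metrics
kubectl top node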
Next, we need something to scale. Let's deploy a sample php-apache application:

kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80

The --requests=cpu=200m flag requests that 200 millicores be allocated to the pod.
Now, let's autoscale our deployment:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
kubectl get hpa
Review the Targets column: if it says unknown/50%, the current CPU consumption registers as 0% because we are not yet sending any requests to the server. It takes a couple of minutes to show the correct value, so let's grab a cup of coffee and come back when we have some data. Rerun the last command and confirm that the Targets column now reads 0%/50%.

Now, let's generate some load to trigger scaling by running the following:

kubectl run -i --tty load-generator --image=busybox /bin/sh
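Once the busybox shell comes up, a simple request loop, as in the upstream HPA walkthrough, will keep php-apache busy:

while true; do wget -q -O- http://php-apache; done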
In a separate terminal, watch the HPA pick up the load and scale the deployment:

kubectl get hpa -w
Cluster Autoscaler (CA)

The Cluster Autoscaler scales the worker nodes themselves when pods fail to schedule. On EKS, start by creating a node group whose Auto Scaling group the autoscaler is allowed to manage:

eksctl create nodegroup --cluster <CLUSTER_NAME> --node-zones <AVAILABILITY_ZONE> --name <NODEGROUP_NAME> --asg-access --nodes-min 1 --nodes 5 --nodes-max 10 --managed
The --asg-access flag grants the node group an IAM policy with the Auto Scaling permissions the Cluster Autoscaler needs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
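If you are not using --asg-access and manage IAM yourself, the same policy can be attached inline to the node instance role; a sketch, where the role and file names are placeholders:

aws iam put-role-policy \
  --role-name <NODE_INSTANCE_ROLE> \
  --policy-name ClusterAutoscalerPolicy \
  --policy-document file://cluster-autoscaler-policy.json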
Next, download the Cluster Autoscaler deployment manifest:

wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

In the cluster-autoscaler container's command, set your cluster name in the auto-discovery flag:

- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>

Then apply the manifest:

kubectl apply -f cluster-autoscaler-autodiscover.yaml
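The manifest creates a cluster-autoscaler deployment in kube-system, so its logs are the quickest way to confirm the node group was discovered:

kubectl -n kube-system get pods -l app=cluster-autoscaler
kubectl -n kube-system logs -f deployment/cluster-autoscaler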
Of course, we should wait for the pods to finish creating. Once they are up, we can scale our cluster out. We will use a simple nginx application with the following YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-scale
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 500m
            memory: 512Mi
Save this as nginx.yaml and apply it:

kubectl apply -f nginx.yaml

Confirm the deployment is running:

kubectl get deployment/nginx-scale

Now scale it to ten replicas to put pressure on the current nodes:

kubectl scale --replicas=10 deployment/nginx-scale

Watch the pods as they are scheduled; once pending pods appear, the Cluster Autoscaler will add nodes to make room:

kubectl get pods -o wide --watch
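In a second terminal, you can watch the new nodes register as the Auto Scaling group grows:

kubectl get nodes --watch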