poncellat's Blog coping data from brain to disk :D

Kubernetes | Autoscaling | Horizontal Pod Autoscaling

This article discusses Horizontal Pod Autoscaling (HPA), one of the types of autoscaling in Kubernetes.

What is Horizontal Autoscaling

Horizontal autoscaling scales the number of pods up or down automatically based on load, such as CPU and memory usage. The application gets enough capacity during high-traffic hours and releases it during low-traffic hours, so customers have a seamless experience.

For this purpose, you configure the minimum and maximum number of pods, along with a target for metrics like CPU and memory.

In contrast, vertical autoscaling gives existing pods more resources (CPU, memory) instead of adding more pods. Adding nodes to increase the infrastructure capacity in terms of CPU, RAM, storage, etc. is a separate concept called cluster autoscaling.

How in Kubernetes

Deploying sample application using Kubernetes Deployment and Service

Consider the following Deployment and Service definition in the YAML file php-apache.yaml.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

Assuming you already have a Kubernetes cluster set up, create the Deployment and Service resources:

$ kubectl apply -f php-apache.yaml

This creates a Deployment and a Service, as shown below. The Deployment creates a ReplicaSet, and the ReplicaSet creates the pods, so we end up with a Service, a Deployment, a ReplicaSet, and a pod.

$ kubectl get all
NAME                              READY   STATUS    RESTARTS   AGE
pod/php-apache-7495ff8f5b-gt8r8   1/1     Running   0          4m29s

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/php-apache   ClusterIP   10.100.126.40   <none>        80/TCP    4m29s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/php-apache   1/1     1            1           4m29s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/php-apache-7495ff8f5b   1         1         1       4m29s

The service is of type ClusterIP, so it is reachable only from inside the cluster. To check that the sample application is working, we can either expose the service as a NodePort so that it is accessible from the host machine, or SSH into the cluster node and access it via the cluster IP.

Testing Application Method 1 (optional)

Expose the service php-apache as a new service of type NodePort so that it is accessible from your local machine, outside the Kubernetes cluster. You can then access the app from the local machine using the IP address of a cluster node followed by the node port assigned to the service (30708 in the output below).

$ kubectl expose service php-apache --port=80 --target-port=80 --name=php-apache-np --type=NodePort
service/php-apache-np exposed
$ kubectl get svc
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
php-apache      ClusterIP   10.100.126.40   <none>        80/TCP         29m
php-apache-np   NodePort    10.102.48.181   <none>        80:30708/TCP   4s
$ minikube ip
192.168.49.2
$ curl http://192.168.49.2:30708
OK!

Testing Application Method 2 (optional)

Alternatively, SSH into the cluster node and check the PHP application from there. The example below assumes the cluster was created using Minikube.

$ minikube ssh
docker@minikube:~$

To check the application, use curl with the service's cluster IP address. It is listed in the CLUSTER-IP column of the output of kubectl get svc php-apache.

$ minikube ssh
docker@minikube:~$ curl http://10.100.126.40
OK!

Create Metrics Server

Download the latest release of the metrics server.

curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

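Add the lines hostNetwork: true and - --kubelet-insecure-tls to the metrics-server Deployment in components.yaml, as shown below. The --kubelet-insecure-tls flag skips verification of the kubelet's serving certificate, which is acceptable on a local test cluster but should not be used in production.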

  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:
      - args:
        - --kubelet-insecure-tls
        - --cert-dir=/tmp
        - --secure-port=4443

Apply the components.yaml.

kubectl apply -f components.yaml

Check for the pods in the namespace kube-system.

$ kubectl get pod  -n kube-system
NAME                               READY   STATUS    RESTARTS      AGE
...
metrics-server-77d9b56856-qjxh7    1/1     Running   0             3m32s
...

Check that it is installed properly:

$ kubectl top nodes
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
minikube   389m         19%    901Mi           11%     

$ kubectl top pods
NAME                          CPU(cores)   MEMORY(bytes)   
php-apache-7495ff8f5b-gt8r8   1m           31Mi

If it is installed properly, you will see CPU and memory metrics for the nodes and pods, as shown above.

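Create the Horizontal Pod Autoscaler

Create a Horizontal Pod Autoscaler that targets an average CPU utilization of 50% across the pods, with a minimum of 1 and a maximum of 10 pods. The percentage is relative to each pod's CPU request (200m in the Deployment above), so 50% corresponds to an average of 100m per pod.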

$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
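
The same autoscaler can also be written declaratively. The manifest below is a minimal sketch of what the command above expresses, assuming the cluster serves the autoscaling/v2 API; the file name hpa.yaml used afterwards is a hypothetical choice, not something from the original setup.

```yaml
# Declarative sketch of the HPA created by `kubectl autoscale` above.
# Assumes the autoscaling/v2 API is available on the cluster.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  # The workload whose replica count the HPA controls.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        # Percentage of the pods' CPU request, averaged over all pods.
        type: Utilization
        averageUtilization: 50
```

If saved as hpa.yaml, it would be applied with kubectl apply -f hpa.yaml instead of running kubectl autoscale. Keeping the HPA in a file makes it easier to version alongside php-apache.yaml.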

View the HPA resource:

$ kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          65m

As the application currently has no load, the current CPU utilization is 0% against the 50% target. The HPA gets this information from the metrics server installed previously.

Increase the Load

To increase the load, we run a pod from the busybox image that continuously requests the application in a loop. In a new terminal:

$ kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

In the other terminal, after a while, you can see that the CPU utilization under TARGETS rises above the target. The HPA then scales out the Deployment, which raises the desired count on the ReplicaSet, which in turn creates more pods to handle the load.

$ kubectl get all
NAME                              READY   STATUS    RESTARTS      AGE
pod/load-generator                1/1     Running   0             105s
pod/php-apache-7495ff8f5b-gfzhb   1/1     Running   0             68s
pod/php-apache-7495ff8f5b-gt8r8   1/1     Running   1 (72m ago)   4h
pod/php-apache-7495ff8f5b-j8fb9   1/1     Running   0             69s
pod/php-apache-7495ff8f5b-p26g7   1/1     Running   0             53s
pod/php-apache-7495ff8f5b-vhxt8   0/1     Pending   0             8s
pod/php-apache-7495ff8f5b-wzlr4   1/1     Running   0             84s

NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP        4h1m
service/php-apache      ClusterIP   10.100.126.40   <none>        80/TCP         4h
service/php-apache-np   NodePort    10.102.48.181   <none>        80:30708/TCP   3h31m

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/php-apache   5/6     6            5           4h

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/php-apache-7495ff8f5b   6         6         5       4h

NAME                                             REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/php-apache   Deployment/php-apache   56%/50%   1         10        5          70m
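
The numbers in this snapshot are consistent with the replica formula given in the Kubernetes HPA documentation. As a worked example, assuming the controller evaluated exactly the values shown above (5 current replicas at 56% average utilization against a 50% target):

```latex
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
  = \left\lceil 5 \times \frac{56}{50} \right\rceil
  = \lceil 5.6 \rceil
  = 6
\]
```

This matches the DESIRED count of 6 on the ReplicaSet, with the sixth pod still Pending. The ratio 56/50 = 1.12 is also outside the controller's default tolerance of 10%, which is why a scaling action is triggered at all.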

Now stop the load by pressing Ctrl+C in the terminal where busybox is running. After a few minutes, the HPA scales the Deployment back down, reducing the number of pods because there is little or no traffic to the application.

$ kubectl get all
NAME                              READY   STATUS    RESTARTS      AGE
pod/php-apache-7495ff8f5b-gt8r8   1/1     Running   1 (78m ago)   4h7m

NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP        4h7m
service/php-apache      ClusterIP   10.100.126.40   <none>        80/TCP         4h7m
service/php-apache-np   NodePort    10.102.48.181   <none>        80:30708/TCP   3h37m

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/php-apache   1/1     1            1           4h7m

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/php-apache-7495ff8f5b   1         1         1       4h7m

NAME                                             REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/php-apache   Deployment/php-apache   0%/50%    1         10        1          76m
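
The delay of a few minutes before scaling down comes from the HPA's scale-down stabilization window, which defaults to 300 seconds. The fragment below is a sketch of how that window could be shortened for a demo, assuming the HPA is defined as an autoscaling/v2 manifest; the value of 60 is an arbitrary illustration, not a recommendation.

```yaml
# Fragment of an autoscaling/v2 HorizontalPodAutoscaler; it belongs under
# the HPA's existing spec alongside scaleTargetRef, minReplicas, etc.
spec:
  behavior:
    scaleDown:
      # How long the HPA looks back at past recommendations before
      # removing pods. Default: 300 seconds. 60 is a demo value only.
      stabilizationWindowSeconds: 60
```

A shorter window makes the scale-down visible sooner, at the cost of more replica churn when the load fluctuates.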

Cleanup

Remove the application deployment and service

kubectl delete -f php-apache.yaml

Optionally remove the NodePort service, if created

kubectl delete svc php-apache-np

Remove the metrics server

kubectl delete -f components.yaml

Remove HPAs

kubectl delete hpa php-apache
