Kubernetes Series Part 7: Scaling Your Applications with Horizontal Pod Autoscaler (HPA)
In our previous post, we discussed how to manage configuration data and secrets within Kubernetes. Now, let’s explore how to scale your applications dynamically using the Horizontal Pod Autoscaler (HPA).
Understanding Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pods in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization, memory usage, or custom metrics. This lets your application handle varying load without manual intervention.
How HPA Works
The HPA controller periodically checks the specified metrics for your application (every 15 seconds by default) and adjusts the number of replicas to keep each metric close to its target value. For CPU and memory, it reads resource usage from the Kubernetes Metrics Server and compares the average across all Pods with the target you set in the HPA spec.
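The scaling decision follows a simple ratio described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped between the minimum and maximum replica counts. The shell function below is a minimal sketch of that arithmetic only. The function name and the sample numbers are illustrative assumptions, and the sketch ignores the controller's tolerance band and stabilization windows.

```shell
#!/bin/sh
# Illustrative sketch of the HPA replica calculation (not the controller's code).
# desired = ceil(current_replicas * current_util / target_util), clamped to [min, max].
desired_replicas() {
  current_replicas=$1   # replicas currently running
  current_util=$2       # observed average CPU utilization, in percent
  target_util=$3        # target average utilization, in percent
  min_replicas=$4
  max_replicas=$5

  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min_replicas" ]; then desired=$min_replicas; fi
  if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi
  echo "$desired"
}

desired_replicas 1 35 10 1 3   # ceil(3.5) = 4, capped at the maximum: prints 3
desired_replicas 3 2 10 1 3    # ceil(0.6) = 1: prints 1
```

With a low target such as 10%, even modest traffic pushes the ratio above 1, which is why a small test load is enough to trigger scaling.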
Hands-on Labs
Prerequisites
Ensure that the Metrics Server is installed in your cluster, because the HPA cannot read CPU or memory usage without it. On Minikube, you can enable it with the following command:
minikube addons enable metrics-server
Lab 1: Create a Deployment
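First, create a deployment for your application. For this example, we’ll use an Nginx deployment, together with a ClusterIP Service that the load test will target later. The CPU request matters here: the HPA measures CPU utilization as a percentage of the requested CPU.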
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      containers:
      - name: nginx-hpa
        image: nginx:latest
        resources:
          requests:
            cpu: 100m
          limits:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-hpa-service
spec:
  selector:
    app: nginx-hpa
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    name: http
  type: ClusterIP
Save the manifest as nginx-deployment.yaml and apply it:
kubectl apply -f nginx-deployment.yaml
Lab 2: Create an HPA Resource
Create an HPA resource to scale the Nginx deployment based on CPU utilization:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10
Apply the HPA resource:
kubectl apply -f nginx-hpa.yaml
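The averageUtilization value is a percentage of each Pod's CPU request, not of its limit. As a rough sketch of what that means for this lab, the function below converts the percentage into millicores using the 100m request from the Deployment and the target of 10 from the HPA. The function name is a hypothetical helper for illustration only.

```shell
#!/bin/sh
# Illustrative helper: convert an HPA utilization target into millicores per Pod.
target_millicores() {
  request_m=$1    # CPU request per Pod, in millicores (100m in the Deployment)
  target_pct=$2   # averageUtilization from the HPA spec, in percent
  echo $(( request_m * target_pct / 100 ))
}

target_millicores 100 10   # prints 10
```

In other words, the HPA starts adding replicas once the Pods average more than roughly 10m of CPU each. That threshold is deliberately low so that the lab scales quickly.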
Testing HPA
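To test the HPA, generate load on the Nginx deployment from a temporary Pod started with kubectl run. The --rm and --restart=Never flags create a single Pod that is deleted when you exit the shell:
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh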
Inside the load-generator Pod, run the following loop to send a continuous stream of requests to the Service, which drives up CPU usage on the Nginx Pods:
while true; do wget -q -O- http://nginx-hpa-service.default.svc.cluster.local; done
The domain nginx-hpa-service.default.svc.cluster.local is a DNS name that Kubernetes uses to allow Pods to communicate with each other within the cluster. Here’s a breakdown of the components:
nginx-hpa-service: This is the name of the Kubernetes Service you created.
default: This is the namespace where the Service is located. If you didn’t specify a namespace, it defaults to default.
svc: This indicates that the DNS name is for a Service.
cluster.local: This is the default domain for services within the cluster.
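To make the naming pattern concrete, the sketch below assembles a Service DNS name from its parts. The function name is a hypothetical helper, and the default values assume the default namespace and the cluster.local domain described above; a cluster configured with a different domain would need a different third argument.

```shell
#!/bin/sh
# Illustrative helper: build <service>.<namespace>.svc.<cluster-domain>.
service_fqdn() {
  service=$1
  namespace=${2:-default}             # assumed namespace when none is given
  cluster_domain=${3:-cluster.local}  # assumed cluster domain
  echo "${service}.${namespace}.svc.${cluster_domain}"
}

service_fqdn nginx-hpa-service           # prints nginx-hpa-service.default.svc.cluster.local
service_fqdn nginx-hpa-service staging   # prints nginx-hpa-service.staging.svc.cluster.local
```

Pods in the same namespace as the Service can also use the short name nginx-hpa-service on its own, because the cluster DNS search path fills in the rest.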
Kubernetes runs an internal DNS service (CoreDNS in most clusters, including Minikube) that resolves these names to the IP addresses of the corresponding Services. This allows Pods to reach Services by name instead of by IP address.
While the load generator is running, open a second terminal and monitor the HPA status:
kubectl get hpa
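After a minute or so, the TARGETS column should show CPU utilization above the 10% target, and the REPLICAS column should climb toward the maximum of 3. Once you stop the load generator, the HPA scales back down, although by default it waits about five minutes before removing replicas.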
Conclusion
The Horizontal Pod Autoscaler is a powerful feature in Kubernetes that helps you maintain optimal performance and resource utilization for your applications. By automatically adjusting the number of Pods based on real-time metrics, HPA ensures that your applications can handle varying loads efficiently.