Scaling Kubernetes Applications With Replicas

Kubernetes is a container orchestration system for deploying and managing containers. It can deploy application components across a cluster of servers and scale them, though it is not the only tool available for these needs.

This blog post shows how to use replicas to scale applications in Kubernetes and control how many copies of each application run.


A Deployment is the Kubernetes object that manages an application on a cluster and lets you run multiple instances (or replicas) of it at once.

Each replica is an identical pod that Kubernetes creates and manages for you. A Deployment rolls your application out across the cluster, scales it up or down when the replica count changes, and replaces pods that fail so that the number you asked for keeps running. Scaling automatically in response to traffic requires a separate autoscaler on top of the Deployment.

In the following example, I have a Deployment that deploys an application into a Kubernetes cluster. The manifest sets replicas to 3, which means that Kubernetes keeps three pods of the application running at any given time.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: nginx
      restartPolicy: Always

I will run the following command to check that the application was deployed with three pods.

kubectl get pods
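
If the cluster runs other workloads, the plain listing can be noisy. As a rough sketch, the output can be narrowed to this application with the app=web-app label from the manifest and then counted. The count_running helper below is a hypothetical name introduced here for illustration; it assumes the default column layout of kubectl get pods, where STATUS is the third column.

```shell
#!/bin/sh
# count_running: read `kubectl get pods --no-headers` output on stdin and
# print how many pods report the Running status.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
count_running() {
  awk '$3 == "Running" { n++ } END { print n + 0 }'
}

# Usage (needs access to a cluster, so it is left commented out):
# kubectl get pods -l app=web-app --no-headers | count_running
```

For the Deployment in this post, the count should settle at the replicas value once every pod has started.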


To scale it from 3 pods to 5, I need to change the replicas value in the manifest to 5 and apply the Deployment again.

kubectl apply -f deployment.yaml
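
Editing the file by hand works, but the change can also be scripted. The sketch below assumes the manifest is saved as deployment.yaml (the filename used in the command above) and contains a single replicas line. The set_replicas helper is a hypothetical name introduced here for illustration, not part of kubectl.

```shell
#!/bin/sh
# set_replicas FILE COUNT: rewrite the "replicas:" line of a manifest.
# Hypothetical helper; assumes FILE holds one Deployment with a single
# "replicas:" line.
set_replicas() {
  file="$1"
  count="$2"
  # Keep the original indentation and replace only the number.
  sed "s/^\([[:space:]]*replicas:\)[[:space:]]*[0-9][0-9]*/\1 $count/" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Usage (needs access to a cluster, so it is left commented out):
# set_replicas deployment.yaml 5
# kubectl apply -f deployment.yaml
# kubectl rollout status deployment/web-deploy
# kubectl get pods -l app=web-app
```

For a quick one-off change, kubectl scale deployment web-deploy --replicas=5 has the same effect without touching the file. Keep in mind that the next kubectl apply of the manifest resets the count to whatever the file says, so updating the file is the safer habit when the manifest is kept in version control.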