
Zero downtime upgrade strategies

Revision | Date       | Description
1.0      | 26.08.2024 | Initial document

Blue/Green deployment

The blue-green deployment strategy involves running two versions of an application simultaneously to ensure availability and minimize risks during deployment:

  • Blue Environment: This environment runs the existing version of the application, representing the current production environment.

  • Green Environment: This environment runs the new version of the application, representing the updated or enhanced version.

With the blue-green deployment approach, you launch the green environment and direct live traffic to assess its performance. You can effortlessly switch the traffic back to the blue environment if any issues arise. On the other hand, if everything functions as intended, the green version becomes the new production environment.
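On Kubernetes, the traffic switch can be illustrated with a Service whose selector points at either the blue or the green Deployment. This is a minimal sketch, not our actual setup; the names myapp, blue, and green are placeholders:

```yaml
# Hypothetical Service that routes production traffic.
# Changing the "version" selector from "blue" to "green" cuts all
# traffic over to the new Deployment; changing it back is the rollback.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue   # change to "green" to cut over
  ports:
    - port: 80
      targetPort: 8080
```

Because only the Service selector changes, the cutover and the rollback are both near-instantaneous and do not touch the running pods.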

For blue/green deployments, AWS recommends Argo CD, a continuous delivery tool designed for Kubernetes that follows the GitOps approach.

Argo CD requirements

To use Argo CD on the existing EKS cluster, we will have to:

  • Request a public certificate with AWS Certificate Manager

  • Create a public hosted zone in Route 53

  • Create a password in AWS Secrets Manager for the Argo CD UI

  • RBAC and Access Control: Argo CD requires appropriate RBAC (Role-Based Access Control) configurations to control access and permissions for managing deployments. You may need to review and update your current RBAC policies and roles to accommodate Argo CD's requirements and ensure proper access controls.

  • Modify the deployment YAML manifests of the current applications in the Git repository so they are fully consistent with the new Argo CD setup
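Once the prerequisites above are in place, each application is described by an Argo CD Application resource that points at its manifests in Git. A minimal sketch follows; the repository URL, path, and namespace values are placeholders, not our real configuration:

```yaml
# Hypothetical Argo CD Application: the Git repository is the
# GitOps source of truth, and Argo CD keeps the cluster in sync with it.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/org/myapp-manifests.git
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert manual drift back to the Git state
```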

Pros and cons of Blue/Green deployment

  • No maintenance window - we don’t need to plan maintenance windows and downtime to deploy an update.

  • Safety - blue-green patterns tend to be much safer because the updated instance doesn’t come into contact with any currently running instances. This isolation protects the working live version from breaking and reduces the likelihood of a rollback.

  • High Infrastructure Costs - With two environments running and multiple instances of any services used, blue-green patterns tend to have much higher infrastructure costs than other deployment methods.

  • Multiple database management - Blue-green patterns require more careful consideration when it comes to database management. Managing your database or databases is more complicated when you have parallel production environments. Any changes made in an update must appear in both environments for the pattern to work correctly.

Rolling update

A rolling update in Amazon Elastic Kubernetes Service (EKS) refers to a deployment strategy that allows for the seamless and controlled update of containers or pods within a Kubernetes cluster. This update strategy ensures that the application remains available to users throughout the update process.

In a rolling update, new instances of the updated containers or pods are gradually introduced into the cluster while the old instances are gradually terminated. This approach minimizes disruptions by ensuring a smooth transition from the old version to the new version of the application.

During a rolling update in EKS, a configurable number of pods or containers are updated at a time, based on the defined update parameters. This gradual approach helps in detecting any potential issues or errors in the updated version before it is fully deployed, allowing for quick rollback if necessary.
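The "configurable number of pods" is set through the Deployment's update strategy. The sketch below assumes a standard Kubernetes Deployment with placeholder names and image:

```yaml
# Hypothetical Deployment using the RollingUpdate strategy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during the update
      maxUnavailable: 0    # never drop below the desired replica count
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v2
```

With maxSurge: 1 and maxUnavailable: 0, Kubernetes creates one new pod at a time and only terminates an old pod once its replacement is ready, which is what keeps the application available throughout the update.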

Pros and cons of rolling update

  • Monitoring and Rollback - we can perform frequent health checks during an update. Health checking allows us to monitor the rollout as it happens. Since a rolling pattern is gradual, we can monitor the new environments while the old ones are still updating, and quickly stop the rollout and roll back to a previous version if we encounter an issue.

  • Low service use - By encapsulating rolling patterns within a single environment, the total number of services operating during development is reduced. For example, if our application relies on five services to maintain online functionality, a rolling deployment would ensure that only those five services are utilized, with each service functioning as a single instance.

  • Speed - rolling updates take more time to complete than other update strategies, as each instance is updated individually.

  • Limited Rollout Control - While rolling updates provide control over the update process, they may have limited control over the order in which instances are updated across multiple availability zones or regions.
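The health checking that gates a rolling update is typically expressed as probes on the pod template: Kubernetes only terminates old pods once the new ones report ready. A sketch of a container fragment follows; the /healthz endpoint and port are assumptions for illustration:

```yaml
# Hypothetical container spec fragment: the rolling update waits for
# this readiness probe to pass before terminating the old pods.
containers:
  - name: myapp
    image: registry.example.com/myapp:v2
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 3
```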

Canary deployment

Canary deployment in Amazon Elastic Kubernetes Service (EKS) is a strategy for safely rolling out updates or new versions of applications by gradually directing a subset of traffic to the updated version while keeping the majority of traffic on the stable version.

In a canary deployment upgrade in EKS, a small portion of user traffic is routed to the new version of the application, known as the "canary" version, while the remaining traffic continues to be served by the stable version. This allows for real-time monitoring and evaluation of the new version's performance and stability before exposing it to the entire user base.

Canary deployments in EKS provide a controlled and incremental approach to updates, minimizing the impact of potential issues or failures. They allow for thorough testing and evaluation of new versions before fully exposing them to the user base, ultimately ensuring a smoother and safer deployment process.
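EKS itself does not split traffic between versions, so canary deployments need a controller on top of it. One common option, used here purely as an illustration and not prescribed by this document, is Argo Rollouts, which replaces the Deployment with a Rollout resource that defines the canary steps:

```yaml
# Hypothetical Argo Rollouts resource: ~10% of pods serve the canary,
# then the rollout pauses for evaluation before continuing.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10            # shift ~10% of traffic to the new version
        - pause: {duration: 10m}   # observe metrics before proceeding
        - setWeight: 50
        - pause: {duration: 10m}   # full promotion follows the last step
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v2
```

If metrics degrade during a pause, aborting the rollout shifts all traffic back to the stable version, which is the "rapid rollback" property listed below.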

Pros and cons of canary deployment

  • Controlled Rollout - canary deployments offer a controlled and gradual rollout process, allowing for careful monitoring and evaluation of the new version before impacting the entire user base.

  • Rapid Rollback - if issues are identified during the canary deployment, rolling back to the stable version is relatively quick and straightforward.

  • Real-Time Performance Monitoring - canary deployments enable real-time monitoring of the new version's performance and stability. Metrics, logs, and user feedback can be analyzed to assess the impact on key performance indicators and user experience.

  • Increased Complexity - canary deployments introduce additional complexity to the deployment process. Setting up proper traffic splitting, monitoring mechanisms, and managing multiple versions of the application require careful configuration and coordination.

  • Increased Deployment Time - due to the gradual rollout process, canary deployments generally take more time to complete compared to other deployment strategies.

  • Implementation Challenges - implementing canary deployments may require additional tooling or infrastructure to handle traffic splitting and monitoring effectively. This can add complexity and may require specific expertise or additional resources.

Last modified: 17 February 2025