
Continuous integration with argo-events

Background

Microservices have taken the world by storm; most new organisations these days aim to solve their business problems with a microservice architecture. One might argue that microservices bring problems of their own, and that is partially true, but it is not the focus of this blog post. The focus, rather, is a practice called Continuous Integration, done with GitOps.

What is Continuous Integration?

Microservices revolve around the fundamental principle of deploying to production (or any other environment, for that matter) multiple times a day, compared to a monolith, which usually gets deployed only once in a sprint. This is achieved using Continuous Integration. In this practice, developers merge their code into a central repository at frequent intervals (depending on feature completion). Once the merge is done, a set of automated tasks is run. These tasks broadly fall under the following categories:

  1. Pre-Build
  2. Build
  3. Post-Build

Building in this context refers to the process of creating an executable from the source code in the central repository. This can be as simple as creating a jar file, compiling a golang executable, or building a docker image. The main point is that this entire process is automated. The automation itself can be as simple as a shell script that executes certain commands, or something far more complex.

Pre-Build

The pre-build stage, broadly speaking, bootstraps the build environment. For example, to build a golang executable we need an environment with golang installed; setting up that golang environment is part of the pre-build stage.
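If the CI runs as an Argo Workflow (as it does later in this post), the pre-build stage can be as simple as picking a builder image. A minimal sketch of such a workflow template (the image tag and commands are illustrative, not from the post):

```yaml
# Hypothetical pre-build template fragment for an Argo Workflow: the
# "environment with golang installed" is just an official golang image.
- name: pre-build
  container:
    image: golang:1.22          # assumed toolchain version
    command: [sh, -c]
    args: ["go version && go mod download"]
```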

Build

The build stage takes care of building the application executable. Before building, certain pre-conditions may have to be satisfied, like running unit tests, linting, or scanning for vulnerabilities. All of these steps fall under the build stage. An example build stage could consist of the following steps:

  1. Lint the codebase to check code style violations
  2. Run unit tests, integration tests
  3. Run static code analysis to check for code smells
  4. Run vulnerability analysis for both code/docker images which get built
  5. Build the application executable/docker image
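Expressed as Argo Workflow steps, the stage above might look like the following fragment (all step and template names are placeholders for containers running the respective tools):

```yaml
# Sketch of a build stage as sequential Argo Workflow steps; the
# referenced templates (lint, test, scanners, image build) are assumed
# to exist elsewhere in the Workflow spec.
- name: build-stage
  steps:
    - - name: lint
        template: golangci-lint     # code style violations
    - - name: test
        template: go-test           # unit and integration tests
    - - name: static-analysis
        template: sonar-scan        # code smells
    - - name: vulnerability-scan
        template: trivy-scan        # code and image vulnerabilities
    - - name: build-image
        template: kaniko-build      # build the docker image
```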

Post-Build

The post-build stage is where the built output gets pushed to a central artifactory or image registry. In certain organisations, people might do the deployment in this stage as well; however, it is highly recommended to avoid doing so, because your build tool shouldn't have any knowledge of how you deploy your application. For example, assume we are deploying to k8s and we use helm/kustomize/jsonnet for templating our application manifests. If we were to add the deployment step to the post-build stage, we would either have to run kubectl apply directly or create a service account with the required RBAC for helm/kustomize/jsonnet to deploy to the respective environment, thus coupling the build tool to the deployment environment. Usually the post-build stage consists of the following steps:

  1. Push the build jar/docker image to artifactory/docker registry
  2. Run database migrations if there are any
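As a sketch, the two post-build steps could be two more workflow steps; the template names and images here are hypothetical:

```yaml
# Hypothetical post-build steps: kaniko builds and pushes the image to a
# registry, then a migration container applies pending database migrations.
- - name: push-image
    template: kaniko-build-push   # e.g. gcr.io/kaniko-project/executor
- - name: run-migrations
    template: db-migrate          # e.g. a migrate CLI container
```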

GitOps

What is GitOps?

GitOps is a way to manage kubernetes clusters and application delivery on kubernetes. It works by using Git as the single source of truth for declarative applications and infrastructure. With GitOps, we can use software agents like argoCD to compare the current state in the cluster (number of replicas, service definitions, etc.) with the desired state in the git repository; if the states differ, the agent uses the kubernetes control plane API to update or roll back the cluster until the difference is eliminated.

In order to manage our cluster with GitOps workflows, there are certain pre-requisites:

  1. Declarative description of the desired system state - Usually when deploying to kubernetes, we deal with a lot of manifests (yaml files) that describe the desired system state. To start following GitOps principles, we push all of these manifests into a central repository, so that the repository itself serves as the single source of truth.
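Once the manifests live in a repository, an agent can be pointed at it. A minimal Argo CD Application doing exactly that (the repoURL, path, and names are placeholders):

```yaml
# Minimal Argo CD Application: watch a git path and keep the cluster in
# sync with it. selfHeal reconciles manual drift back to the git state.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deployment-manifests.git
    path: my-service/overlays/production
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: my-service
  syncPolicy:
    automated:
      prune: true       # delete resources removed from git
      selfHeal: true    # revert out-of-band changes in the cluster
```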

Advantages of using GitOps

  1. With the declaration of the system state stored entirely in a version control system, serving as a single source of truth, we have a single place from which all manifests are derived and driven. This trivializes rollbacks: we can use git commands like git revert to go back to a previous stable or desired state. We all know how excellent git is at tracking changes to code bases, and if we start version controlling our application manifests in git as well, we get the best of both worlds.
  2. Once the state is pushed to git, the next step is to have any change to the desired state automatically applied to the system as a whole. To achieve this, we can use external software agents like argoCD. The significant advantage of this approach is that we don't need to hand out cluster credentials to make changes to the system; instead, we create a service account with the necessary RBAC for the agent, allowing us to separate what we do from how we do it.
  3. Since the desired state lives in git, software agents like argoCD can inform or alert us when there is a mismatch of state. For example, assume we want 4 instances of an application running at all times, and for some reason one of the pods goes down; the agent can either bring up another instance or send an alert over an official communication medium (like slack) about the divergence.
  4. Better time to production - with CI and CD automated, the time to deployment is reduced, enabling teams to ship code to production more than once a day.

Argo-events - Using GitOps principles for CI

Argo-events is an event-driven workflow automation tool for kubernetes, which enables us to attach triggers to events from a variety of event sources.

Components within argo events

Argo-events as a whole consists of a set of components which have to be wired together to get something useful - in our case, CI:

  1. Event Sources - As the name suggests, these are external systems which emit events. An event in this context is some activity that has occurred - for example, a pull request getting merged, or a commit landing on the main branch. Event sources are custom resources under the argoproj.io API group.
  2. Event Bus - The event bus is the channel through which all events from the event sources (the producers) flow. On the receiving end sit components called sensors. The key point to note is that the event bus is the channel connecting the two.
  3. Sensors - Sensors listen on the event bus for events. Once an event is received, the sensor fires its configured trigger.
  4. Trigger - A trigger is the action taken for an event; for example, a trigger can run a serverless function or a custom workflow. Out of the box, argo events supports many triggers, the most notable of which are listed below. If your use case doesn't fit any of them, there is also an option to build a custom trigger.

    1. Argo Workflow Trigger
    2. AWS Lambda
    3. HTTP
    4. NATS
    5. Kafka
    6. Kubernetes Object
    7. Log
    8. OpenWhisk
    9. Slack
    10. Azure Event Hubs
    11. Pulsar
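For the git-driven CI described below, the relevant event source is github. A sketch of a GitHub EventSource (repository, URL, port, and secret names are all placeholders):

```yaml
# Hypothetical GitHub EventSource: argo-events registers a webhook on the
# repository and publishes push events onto the event bus.
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: github
spec:
  service:
    ports:
      - port: 12000
        targetPort: 12000
  github:
    example:                       # event name referenced by the sensor
      repositories:
        - owner: example-org
          names:
            - example-repo
      webhook:
        endpoint: /push
        port: "12000"
        method: POST
        url: https://ci.example.com   # externally reachable webhook URL
      events:
        - push
      apiToken:
        name: github-access          # secret holding a GitHub API token
        key: token
      contentType: json
      active: true
      insecure: false
```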

[Image: Argo Events - supported event sources and triggers]

As we can see from the above image, argo events supports a variety of event sources; the most important one for us in this context is git.

For the CI use case, we will usually have the following setup:

  1. An event source created against our version control system of choice (github/gitlab/bitbucket).
  2. An event bus named default in the namespace to which we are going to deploy.
  3. A sensor which is going to trigger a custom workflow.
  4. A workflow with a set of steps which take care of the CI.
[Image: CI setup]
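Items 2 and 3 of the setup can be sketched as the following two manifests; the event and workflow template names are illustrative and assume the github EventSource exposes an event called `example`:

```yaml
# A default NATS-backed event bus for the namespace.
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  nats:
    native: {}
---
# A sensor that submits a CI Workflow whenever a push event arrives.
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: github-ci
spec:
  dependencies:
    - name: push
      eventSourceName: github     # the EventSource resource
      eventName: example          # the event defined inside it
  triggers:
    - template:
        name: ci-workflow
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: ci-
              spec:
                workflowTemplateRef:
                  name: ci-pipeline   # assumed WorkflowTemplate with the CI steps
```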

The workflow in turn consists of the following jobs:

  1. Clone the source code repository
  2. Run Lint
  3. Run tests
  4. Run database migrations
  5. Build and push the docker image to registry
  6. Deploy the application using progressive delivery(canary/blue-green)
[Image: Workflow definition]
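The six jobs above can be sketched as one Argo Workflow; each template name is a placeholder for a container running that step:

```yaml
# Sketch of the CI workflow: each step is an assumed container template
# defined elsewhere in the spec (clone, lint, test, migrate, build, deploy).
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ci-
spec:
  entrypoint: ci
  templates:
    - name: ci
      steps:
        - - name: clone
            template: git-clone
        - - name: lint
            template: lint
        - - name: test
            template: test
        - - name: migrate
            template: db-migrate
        - - name: build-push
            template: kaniko
        - - name: deploy
            template: rollout     # e.g. a canary/blue-green rollout step
```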

Actual Deployment Screenshots

[Image: Pipeline]

[Image: Workflow]

Argo Workflow

Argo Workflows is an open source, container-native workflow engine for orchestrating parallel jobs on kubernetes, again implemented as kubernetes CRDs. What argo workflows is in depth is a subject for another blog post, but at a 1000 ft bird's-eye view, it enables us to define jobs within a workflow, where each job is nothing but a container with its own set of arguments. For example, one job can clone the source code repository into an ephemeral container, whose output then gets picked up by another container to execute tests, and so on.
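The clone-then-test hand-off described above can be sketched with a workflow-scoped volume claim, since each step runs in its own pod (repository URL and images are placeholders):

```yaml
# Illustration of the ephemeral-container hand-off: a PVC created for the
# lifetime of the workflow carries the cloned source from step to step.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: clone-and-test-
spec:
  entrypoint: main
  volumeClaimTemplates:
    - metadata:
        name: src
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  templates:
    - name: main
      steps:
        - - name: clone
            template: clone
        - - name: test
            template: test
    - name: clone
      container:
        image: alpine/git
        args: [clone, "https://github.com/example/repo.git", /src]
        volumeMounts:
          - name: src
            mountPath: /src
    - name: test
      container:
        image: golang:1.22
        command: [sh, -c]
        args: ["cd /src && go test ./..."]
        volumeMounts:
          - name: src
            mountPath: /src
```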

Happy Coding!

Kumar D

Software Developer. Tech Enthusiast. Loves coding 💻 and music 🎼.
