Tech Insights
Ravindu Sandeepa Rathugama
January 24, 2022

Automated Manifest File Validation Using Open Policy Agent and GitHub Actions

In today’s cloud-native world, manifest files are widely used. In most cases, these files need to be validated to ensure that they contain all the essential fields with accepted data types and formats. In this article, I’m going to show you how to validate manifest files using Open Policy Agent and automate this using GitHub Actions. The same method can be used to enforce standards in CI/CD as well.

This whole implementation contains two parts.

·        Writing policies using Open Policy Agent and its Rego language to validate manifest files.

·        Automating validation using GitHub Actions so that manifest files are validated automatically whenever they change.

So let’s dive into the actual implementation.

What we validate

We are going to validate the following test-manifest.yaml file.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.1
      ports:
        - containerPort: 80

The following policies are enforced on this manifest file.

1.     The kind field cannot be null.

2.     The kind field must be a string.

3.     Container images with the latest tag are not allowed.

If a manifest file violates any of the above policies, it is considered an invalid manifest.

Note that the 3rd policy is not just a file content validation. It controls which container images are allowed for deployment. This is an example of how infrastructure manifests can be validated to enforce standards in CI/CD.

Open Policy Agent and Rego language for policy validation

Open Policy Agent, or OPA, is an open source, general-purpose policy engine (a software component that allows users or other systems to query policies for decisions). The main advantage of this kind of policy engine is that it decouples policy decisions from business and application logic, so policies can be changed, tested, and reviewed without touching application code. Here I assume you are already familiar with Open Policy Agent and its Rego language, at least to some extent. If not, it is worth giving it a shot, because it is becoming the de facto standard for policy validation (including authorization in the cloud-native stack).

In Open Policy Agent, rules are written in a language called Rego. So let’s start writing policies as code using Rego. The following policy.rego file contains all the policies that are enforced on the above manifest file.

package manifest

# invalid manifest if 'kind' field is null
invalid[msg] {
  is_null(input.kind)
  msg := "error::kind cannot be null"
}

# invalid manifest if the value of 'kind' field is not a string
invalid[msg] {
  not is_null(input.kind)
  not is_string(input.kind)
  msg := "error::'kind' must be a string"
}

# cannot use containers with 'latest' tag
invalid[msg] {
  container := input.spec.containers[_]
  endswith(container.image, ":latest")
  msg := "error::containers with 'latest' tag are not allowed"
}

invalid is the Rego rule that checks the validity of a manifest file. The statements inside a rule body are combined with a logical AND: the rule produces a value only if every statement holds. Also, note that the same rule name, invalid, is used three times. This is how logical OR is expressed in Rego: you define multiple rules with the same name (such rules are called incremental rules).

It’s also worth knowing about the [msg] part in the rule header. In Rego, there are two types of rules: complete rules and partial rules. Documents (think of a document as a variable that stores the outcome of a rule) produced by complete rules can have only one value at a time. In contrast, partial rules generate a set of values and assign that set to the rule name. Here the square brackets “[ ]” make invalid a partial rule, so invalid evaluates to a set and every msg that a rule body produces becomes a member of that set. Putting it all together, I have used logical OR with partial rules, so each rule definition contributes its message to the set of values assigned to invalid.
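To make the AND/OR behaviour concrete, here is a minimal sketch that mirrors the three rules in plain Python. It is only an analogy, not how OPA evaluates Rego: each if block plays the role of one invalid rule body, and the returned set plays the role of the set built by the partial rule. The function name invalid and the dictionary-shaped manifest are assumptions made for illustration.

```python
_MISSING = object()  # stands in for an undefined (absent) field


def invalid(manifest):
    """Collect violation messages, loosely mirroring the three Rego rules."""
    messages = set()
    kind = manifest.get("kind", _MISSING)

    # Rule 1: fires only when 'kind' is present and null.
    if kind is None:
        messages.add("error::kind cannot be null")

    # Rule 2: both statements must hold (AND): not null AND not a string.
    if kind is not None and not isinstance(kind, str):
        messages.add("error::'kind' must be a string")

    # Rule 3: fires if at least one container image ends with ':latest'.
    spec = manifest.get("spec") or {}
    containers = spec.get("containers") or []
    if any(str(c.get("image", "")).endswith(":latest") for c in containers):
        messages.add("error::containers with 'latest' tag are not allowed")

    # Each block that fired contributed one message (OR across rules).
    return messages


if __name__ == "__main__":
    pod = {"kind": None, "spec": {"containers": [{"image": "nginx:latest"}]}}
    for message in sorted(invalid(pod)):
        print(message)
```

An empty set corresponds to a valid manifest; a manifest that breaks two policies yields two messages, one from each rule that fired.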

Now our test-manifest.yaml file can be validated against these policies. There are several ways to run OPA. You can download the OPA binary and run it directly, or you can run it using the OPA Docker image. Here I’m running the OPA binary and validating the file using the opa eval command.

opa eval -d policies/policy.rego -i manifests/test-manifest.yaml data.manifest.invalid --format=pretty

If the manifest passes validation, the command returns an empty array ([]). The following figures show some outputs of the above command when validation fails.

output of opa eval command when the kind field of test-manifest.yaml file is null

output of opa eval command when the container image specified in test-manifest.yaml file contains the latest tag

output of opa eval command when both of the above are present in the manifest file
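The result can also be read programmatically instead of being compared as text. The sketch below assumes the command is run with --format=json rather than --format=pretty, and that the output has the shape {"result": [{"expressions": [{"value": [...]}]}]}; treat that shape as an assumption and check it against your OPA version. The helper name violations_from_opa_json is hypothetical, and SAMPLE_OUTPUT is hand-written for illustration, not captured from a real run.

```python
import json


def violations_from_opa_json(output):
    """Extract violation messages from JSON-formatted `opa eval` output."""
    document = json.loads(output)
    messages = []
    for result in document.get("result", []):
        for expression in result.get("expressions", []):
            value = expression.get("value")
            # The queried set is assumed to be serialized as a JSON array.
            if isinstance(value, list):
                messages.extend(value)
    return messages


# Hand-written sample in the assumed output shape.
SAMPLE_OUTPUT = """
{"result": [{"expressions": [{"value": ["error::kind cannot be null"],
                              "text": "data.manifest.invalid"}]}]}
"""

if __name__ == "__main__":
    print(violations_from_opa_json(SAMPLE_OUTPUT))
```

An empty list means the manifest passed; any other result lists the policies that were violated.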

Perfect! Now we can validate manifest files using OPA. But wait, there is a problem. Each time a file changes, we have to validate it manually. Further, think about the actual situation in your organization. Do you have only one lone manifest in your project? No, right? There could be hundreds of manifest files in one project, so it won’t be a good idea to do this manually for each file. So what can we do to solve this problem?

Yeah, that’s right: automate the process.

GitHub Actions to automate policy validation

So let’s automate this policy validation process so that manifest files are validated automatically when they are pushed to GitHub. GitHub Actions can be used for that.

Here I’m going to create a custom GitHub Action. There are mainly two types of GitHub Actions: Docker actions and JavaScript actions. Here I’m going to create a Docker action. In brief, a Docker action creates a Docker container and executes, inside it, the code that performs the intended task of the action. To create a Docker action, there are several steps to follow.

·        Write the action code

·        Create an action metadata file

·        Create a Dockerfile

So let’s start with the action code. Let’s write the bash script, entrypoint.sh, that runs in the Docker container when the action executes.

#!/bin/bash

set -e
isFailed=false

# collect all YAML files under the manifests directory
MANIFEST_FILES=$(find "${MANIFESTS_PATH}" -type f \( -name "*.yml" -o -name "*.yaml" \))

# validate all manifest files
for FILE in $MANIFEST_FILES
do
  validation_result=$(opa eval -d "${POLICIES_PATH}" -i "$FILE" data.manifest.invalid --format=pretty)
  if [ "$validation_result" != "[]" ]; then
    echo "FAILED::$FILE::$validation_result"
    isFailed=true
  else
    echo "PASSED::$FILE"
  fi
done

# check failed -> non zero exit code
if [ "$isFailed" == "true" ]; then
  exit 1
fi

In this script, first all the YAML files (manifest files) are collected using the find command. Then each of these files is validated using the opa eval command. If any file is invalid, the process exits with a non-zero exit code.
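The control flow of the script above (collect files, validate each one, fail at the end if any failed) can be sketched in Python as follows. This is an illustration of the flow only, not part of the action: find_manifests and validate_all are hypothetical names, and validate is an assumed callable that returns a list of violation messages for a file, for example by wrapping a call to opa eval.

```python
import os
import tempfile


def find_manifests(root):
    """Return the sorted paths of all *.yml and *.yaml files under root."""
    found = []
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if name.endswith((".yml", ".yaml")):
                found.append(os.path.join(directory, name))
    return sorted(found)


def validate_all(root, validate):
    """Validate every manifest and return (exit_code, report_lines)."""
    lines = []
    failed = False
    for path in find_manifests(root):
        violations = validate(path)
        if violations:
            # Keep going so that every invalid file is reported.
            failed = True
            lines.append(f"FAILED::{path}::{violations}")
        else:
            lines.append(f"PASSED::{path}")
    return (1 if failed else 0), lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "pod.yaml"), "w").close()
        # Stand-in validator that accepts every file.
        exit_code, report = validate_all(root, lambda path: [])
        print("\n".join(report))
        print("exit code:", exit_code)
```

As in the bash script, the failure is only recorded inside the loop and turned into a non-zero exit code at the end, so one invalid manifest does not hide problems in the files that follow it.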

Next, let’s write the action.yml file. This is the file that contains the metadata of the GitHub Action. In other words, the data in this file defines the inputs, outputs, and main entry point for your action. In the following file, POLICIES_PATH is the path to the folder that contains the Rego policy files and MANIFESTS_PATH is the path to the folder where our manifest files reside. These values can be accessed as environment variables inside the running Docker container. Also, note that both of these are declared as mandatory (required: true), so they must be provided when the action is called.

description: 'github action to validate manifest files'
inputs:
  POLICIES_PATH:
    description: 'path to opa policies'
    required: true
  MANIFESTS_PATH:
    description: 'path to manifest files directory'
    required: true
runs:
  using: docker
  image: 'Dockerfile'
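How the two paths reach the container depends on how the caller passes them. Values set in a step’s env block arrive under their own names, while values passed as action inputs through with are exposed by the runner as INPUT_<NAME> environment variables (name upper-cased, spaces replaced by underscores). Below is a minimal sketch of a lookup that accepts either form, written in Python for illustration only (the action itself uses bash); resolve_path is a hypothetical helper.

```python
import os


def resolve_path(name, environ=None):
    """Look up `name` directly, then as the runner-style INPUT_<NAME>."""
    env = os.environ if environ is None else environ
    input_name = "INPUT_" + name.upper().replace(" ", "_")
    for key in (name, input_name):
        value = env.get(key)
        if value:
            return value
    raise KeyError(f"neither {name} nor {input_name} is set")


if __name__ == "__main__":
    # Simulated environments; a real run would read os.environ.
    print(resolve_path("POLICIES_PATH", {"POLICIES_PATH": "policies/"}))
    print(resolve_path("MANIFESTS_PATH", {"INPUT_MANIFESTS_PATH": "manifests/"}))
```

Failing early with a clear message when neither variable is set is a design choice of this sketch; it avoids running find against an empty path.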
 

Putting it all together, we have now defined a custom GitHub Action that runs the above entrypoint.sh bash script inside a Docker container. So now we need a Dockerfile that defines the required Docker image. The main requirement is that the Open Policy Agent binary must be installed in the Docker container. Also, it’s worth noting the Docker multi-stage build I have used here. It is a way to create small and secure Docker images.

# multistage build

# stage 1
# install binaries
FROM alpine AS binary-builder
RUN mkdir -p /binaries
RUN apk add --no-cache curl \
   unzip

RUN curl -L -o /binaries/opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64


# stage 2
# action image
FROM ubuntu:18.04
COPY --from=binary-builder /binaries/opa /usr/local/bin
RUN chmod 755 /usr/local/bin/opa
RUN opa version
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["opa version"]

The final step is to create the GitHub workflow that executes our custom GitHub Action when an event occurs. According to this workflow definition, the manifest validation job executes when file changes are pushed to the GitHub repository. It uses the custom GitHub Action we defined above, which is stored in the master branch of the automated-manifest-validation repository. Also, note that this workflow file must be in the .github/workflows folder in your GitHub repository.

name: Validate Manifest Files
on:
  push:

jobs:
  manifest-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Validate Files
        uses: ravindu-san/automated-manifests-validation@master
        env:
          POLICIES_PATH: policies/
          MANIFESTS_PATH: manifests/

Now we are done with the implementation. Let’s push these files to GitHub and see what happens. To see the workflow runs, go to the Actions section of the GitHub repository and click on the run whose details you want to see. The following figure shows sample output when validation fails.

(Find the complete implementation here)

I think you now have some idea of how manifest validation can be done using Open Policy Agent and how it can be automated using GitHub Actions. This is only one way of doing this task; there are many other ways, depending on your requirements and environment. As an example, to achieve true policy decoupling, you can run OPA as a central service and query it from different applications. Below you can find some additional resources that you may find helpful.

Additional Resources

·        Open Policy Agent documentation

·        Styra — What is Open Policy Agent

·        GitHub Action documentation

·        Enforcing Standards in CI/CD Using Open Policy Agent - Gaurav Gajkumar Chaware, InfraCloud