
How not to upset me, using Terraform

Working in tech, oftentimes knowing how to manually do things is not quite enough. I have some sympathy for not automating things prematurely, but when I am working on a team automating things, I can be particular about what is and is not acceptable. Hopefully, if you are starting your journey, this will help you, or at least form some kind of lightweight evaluation.

Changes should be tested in a pipeline

Until recently, I had considered this a given. But if you write terraform, and a colleague cannot find an artifact of a pipeline running that code, or be able to run that pipeline with very low effort and little experience, that really annoys me.

Why is this important

If colleagues cannot run and understand your work with minimal effort, then there was no point automating it.

If code is not in a pipeline, it might not be run multiple times per year, so it also might be broken when someone comes to use it.

It is incredibly easy for even the author to forget details, or be unable to run code much past the point of authoring, if that is not recorded.

When this does not matter

If you don't have enough money to justify automation, then you are playing, and can play in whatever way makes you happy.

If you run an open-source project, see the above point. This is not to dismiss the value of your project, but not automating things limits how many people can access that value.

If the terraform is not stable, and might have issues if run, then don't use terraform. Some people publish gutter-trash integrations, and while it makes me sad, just avoid automating those steps. Create a markdown file instead, and some form of credential tester.
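
As a sketch of what such a credential tester could look like, assuming AWS (the region default and output name are mine, not from any real project): a tiny terraform file that creates nothing works well, since running terraform plan against it fails fast on bad credentials.

variable "region" {
  type    = string
  default = "eu-west-1"
}

provider "aws" {
  region = var.region
}

# Creates nothing; simply proves the supplied credentials can authenticate.
data "aws_caller_identity" "current" {}

output "caller_arn" {
  value = data.aws_caller_identity.current.arn
}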

Branch or repo forks of code

We've all seen this: constants hard-coded into terraform files, or asymmetric deployments. There is never an excuse to have one system with more than one set of terraform.

Tips

  • Turning things off can be done as simply as passing in a terraform variable for replicas, set to 0.
  • Newer terraform (0.12 onwards) supports richer input types, including booleans.
  • You can also specify remote state to pull data from.
  • You can use environment variables and command-line arguments to pass terraform variables.
  • Using control-flow, and getting to know HCL, can help.

All of these are suggestions which can help ensure things weave together without issue; a few of them are sketched together below.
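
A minimal sketch, with variable names of my own choosing: a boolean input gates a replica count, which can also be forced to 0 from outside.

variable "enabled" {
  description = "Set to false to switch this component off entirely."
  type        = bool
  default     = true
}

variable "replicas" {
  description = "Desired replica count; pass 0 to scale the workload down."
  type        = number
  default     = 2
}

locals {
  # A conditional expression: one small piece of HCL control-flow.
  effective_replicas = var.enabled ? var.replicas : 0
}

Passing TF_VAR_replicas=0 in the environment, or -var="replicas=0" on the command line, then turns things off without touching the code.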

Why is this important

If you don't have one place to resolve all branches of code, then you are managing multiple projects.

Managing multiple projects around a single codebase, or worse still, multiple branches of codebases (so multiple combinations of multiples) is a recipe for pain.

This is amplified by the above two points: it is difficult to keep complex forked architectures in your mind.

Practical tip

By using terraform variables, you can pass data from non-terraform resources to terraform.

By using command-line arguments, you can even customise locks and state output files.

Take the following example from a real project.


setup-rds-aurora:
  stage: stage-02
  dependencies:
    - fetch-terraform
    - setup-eks
  before_script:
    - ./get_job_env_vars stage-02 ${TF_VAR_ENV_NAME}
  script:
    - ./terraform -chdir=stages/02-stage-two/aws-rds-aurora init -backend-config="bucket=${TF_VAR_STATE_BUCKET_NAME}" -backend-config="key=${TF_VAR_ENV_NAME}/rds.tfstate" -backend-config="dynamodb_table=iac-state-lock"
    - ./terraform -chdir=stages/02-stage-two/aws-rds-aurora apply -auto-approve
  tags:
    - docker
  only:
    - main

The implication is that lots of TF_VAR_ values are set by a helper script called get_job_env_vars.

With minimal intervention, via the backend configuration variables, you can change:

  • The S3 bucket that .tfstate files are stored within.
  • The path of the .tfstate object within that bucket.
  • The shared lock-table used to prevent the same job running twice, mutating the same state.

Other environment variables used are AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. This specific job also takes other arguments for the name of the RDS, and variables for the state bucket name (TF_VAR_STATE_BUCKET_NAME) and its region. Anything you might want to configure is a TF_VAR environment variable away. Then, just as you use the environment to configure your apps, you can configure your deployments in the same way.
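
To make the TF_VAR_ convention concrete, here is a minimal sketch (the variable names and the eks.tfstate key are hypothetical, not taken from the job above). Exporting TF_VAR_env_name=staging in the job environment sets var.env_name to "staging", and a terraform_remote_state data source can then pull another stage's outputs:

variable "env_name" {
  type = string
}

variable "state_bucket_name" {
  type = string
}

variable "state_bucket_region" {
  type = string
}

# Reads the outputs another stage wrote to its own state file.
data "terraform_remote_state" "eks" {
  backend = "s3"

  config = {
    bucket = var.state_bucket_name
    key    = "${var.env_name}/eks.tfstate"
    region = var.state_bucket_region
  }
}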

Lack of use of modules

Terraform has very few concepts. Modules are, as far as I've come across, the best way to abstract shared pieces of infra.

If you are not using modules, then you are likely either managing a simple setup that won't scale, and may not need terraform; or you have one big tangled mess.

How to move towards modules

My advice to those not using modules, or working in one large module, is to try to pull parts out. So if you have a CloudFront + S3 + DNS setup, put that into a module, as sketched below.

It can be quite hard to get modules to their nth iteration without some moving of parts, but this is all very useful experience and an exercise in understanding the needs of your business.
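
A hypothetical sketch of that extraction (the module path, inputs and domain are mine): once the CloudFront, S3 and DNS resources live in a module, the calling code shrinks to a single reusable block.

variable "zone_id" {
  description = "Route 53 hosted zone to attach DNS records to."
  type        = string
}

# One block now stands in for the previously inlined resources.
module "static_site" {
  source = "./modules/static-site"

  domain_name = "www.example.com"
  zone_id     = var.zone_id
}

Each environment, or each site, then becomes one more call to the same module with different inputs, rather than another fork of the code.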

Use of only third-party modules

This one I'll keep pretty short. I find it highly unlikely that using off-the-shelf modules won't cause you pain: either in using modules with inappropriate resources, or in needing to spend time learning the intimate relationships of the underlying providers.

Terraform has excellent documentation on some of the more well-trodden providers, such as AWS. Using such documentation can help you understand where additional complexity comes from, and will hopefully help you to refine public third-party modules into something you feel fits your lean requirements.
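
As a sketch of where that refinement can end up (the bucket and resource names are hypothetical): a general-purpose third-party storage module with dozens of inputs often boils down to a couple of directly managed resources.

# A direct resource keeping only what is actually needed, in place of a
# module wrapping many optional features.
resource "aws_s3_bucket" "artifacts" {
  bucket = "my-team-artifacts"
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id

  versioning_configuration {
    status = "Enabled"
  }
}

Reading the provider documentation for these two resources is usually quicker than learning a module's entire input surface.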

Unclear or large commits

When coding on my current team, we use conventional commits via an npm module. We rarely get a chance to touch the terraform, and I guess that is fine. When we do, I try to keep us committing small, intentional, well-named code.

  1. Initial commit
  2. feat: add base structure
  3. feat: add k8s + vpc module
  4. feat: add bootstrap setup with documentation
  5. fix: k8s + vpc to use bootstrap to store state
  6. etc...

By working in this way, it can be easier for a new onboard, or in the case of the terraform I took over, the team that manages our infra (don't ask), to understand what we are proposing. When I joined the team, the boilerplate left behind was all the code: no pipeline, unclear commits, and that left everyone a little uncertain about how things worked.
