Nowadays Terraform is one of the pioneering tools used to manage modern infrastructure. It provides a declarative way to provision infrastructure, i.e. Infrastructure as Code (IaC). IaC is just code in the end, so it's handled (almost) the same way as any other code.
With the rise of the DevOps mindset, shifting left became an essential pillar of the software life cycle.
To ensure the quality of the code, it will probably be part of a CI/CD pipeline, which has a couple of steps like validating, formatting, linting, and testing.
Finally, all changes will be applied automatically in a GitOps style.
Updates:
- 27.10.2021: Added a new section about securing Terraform with TFSec.
1. Intro
When you work with Terraform or any IaC in general, you are probably in one of two roles: a producer/upstream or a consumer (and sometimes both!).
You are a producer when you create a Terraform module and share it publicly on the Terraform Registry or in-house to be used by other teams. As a consumer, you simply use Terraform in your daily work, for example, to build your infrastructure. In both cases you probably need a couple of checks.
There are 5 main checks:
- Validate: To make sure that IaC is syntactically valid.
- Format: To make sure that IaC files have the same format.
- Lint: To make sure that IaC uses specific practices and conventions.
- Secure: To make sure that IaC follows security best practices.
- Test: To make sure IaC is functionally valid.
Those checks play a key role as your team grows. The first four checks apply whether you are a producer or a consumer. The last one (testing) is more likely to matter for a producer/upstream.
Terraform has built-in support for formatting and validating TF files. But for a long time, there was a lack of good tools for linting and testing. In the next sections, we will have a look at the available options for each one.
2. Format
Formatting is about style, like using tabs or spaces, how many of them, and so on.
As mentioned above, Terraform has a built-in command for formatting. All that you need is Terraform itself.
terraform fmt -recursive .
It will search for all Terraform files like ".tf" and ".tfvars" and rewrite files in-place to a canonical format.
So something like this (notice the indentation of the closing bracket and the spacing before the equal signs):
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
    instance_type   = "t2.micro"
  }
It will be:
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
The fmt command has some options that control how it works, for example -check to only report whether files are formatted (without rewriting them) or -diff to display the changes. That's all you need for formatting Terraform IaC.
3. Validate
Validation is about syntax like blocks inside TF files. For example, resource blocks always have 2 labels (type and name).
As with formatting, Terraform has a built-in command for validation. All that you need is Terraform itself.
terraform validate
Terraform validates the syntax of TF files, and it doesn't access any remote services, so it's safe to run anytime. It only needs an initialized working directory (terraform init -backend=false is enough for that).
If you create a resource block with only 1 label, the validate command will print an error that you need to fix. For example, this will not work (no name after the resource type "aws_instance"):
resource "aws_instance" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
That's all you need for validation.
4. Lint
Linting is about practices and conventions. For example, best practices of Terraform resource naming.
You can set how long a resource name should be, which characters are allowed in names (for example, underscores rather than dashes), or maybe you want to make sure that all S3 bucket resources use server_side_encryption_configuration, or that your EC2 instances are (or are not) of certain types.
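A naming convention like this usually boils down to a regular expression. Here is a minimal sketch in Go, assuming the same pattern that is used in the config-lint rule later in this post; the helper name ValidResourceName is hypothetical and only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern encodes the assumed convention: 2 to 64 characters, starting
// with a lowercase letter, using only lowercase letters, digits, and
// underscores (no dashes), and ending with a letter or digit.
var namePattern = regexp.MustCompile(`^[a-z][a-z0-9_]{0,62}[a-z0-9]$`)

// ValidResourceName reports whether name follows the assumed convention.
func ValidResourceName(name string) bool {
	return namePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"web", "ec2_machine", "ec2-machine", "Web", "web_"} {
		fmt.Printf("%-12s valid=%t\n", name, ValidResourceName(name))
	}
}
```

Only the first two names pass: the others contain a dash, an uppercase letter, or a trailing underscore. A linter does exactly this kind of check for you, across every resource in every file.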
For a long time, Terraform didn't have a proper linter; the available ones were either limited or complex. But nowadays there are 2 good options for that.
4.1 Config-lint
config-lint is a linting tool with support for Terraform. It has built-in Terraform rules and it's easy to use.
What I like about it:
- It uses YAML for rules, so it's pretty easy to write custom rules.
- It supports many operations.
- It supports Terraform 0.11 and 0.12 syntax.
What I don't like about it:
- Its development is active but a bit slow I think (probably it just needs more momentum).
4.2 TFLint
tflint is a generic Terraform linter that focuses on general/static problems rather than custom/dynamic ones.
What I like about it:
- It has a lot of built-in rules.
- Due to its big static ruleset, it can catch some logical issues like the wrong EC2 type (e.g. a typo like t2.microo).
What I don't like about it:
- Custom linting rules need to be written in Golang and compiled as a binary to be used! (that was a deal-breaker for me!)
- Most of its built-in rules are for AWS only.
- It supports Terraform 0.12 syntax only.
...
I believe that config-lint is the best option for Terraform linting at the moment! Let's have a look at an example that covers the cases above.
cat << EOF > tf-lint-example.yml
---
version: 1
description: Linting example for Terraform.
type: Terraform
files:
  - "*.tf"
rules:
  - id: TF_RESOURCE_NAMING_CONVENTION
    message: "Resource name should be: not more 64 chars, starts with a letter, doesn't have a dash, and ends with letter or number"
    severity: WARNING
    category: resource
    assertions:
      - key: __name__
        op: regex
        value: '^[a-z][a-z0-9_]{0,62}[a-z0-9]$'
    tags:
      - terraform
      - terraform.blocks

  # This rule should be split to 2 rules, but it's just for demonstration.
  - id: AWS_S3_BUCKET_AND_OBJECT_HAS_SERVER_SIDE_ENCRYPTION
    severity: FAILURE
    category: resource
    resources:
      - aws_s3_bucket
      - aws_s3_bucket_object
    assertions:
      - or:
          - key: server_side_encryption_configuration
            op: present
          - key: server_side_encryption
            op: present
    tags:
      - aws
      - s3

  - id: AWS_EC2_TYPE
    message: Instance type should be t2.micro or m3.medium
    severity: FAILURE
    resource: aws_instance
    assertions:
      - key: instance_type
        op: in
        value: t2.micro,m3.medium
    tags:
      - aws
      - ec2
EOF
Now run config-lint from the directory that holds the Terraform configuration, using that rules file:
docker run -v "$(pwd):/data" -w /data stelligent/config-lint \
  -tfparser tf12 \
  -rules tf-lint-example.yml \
  terraform/production
With this resource:
resource "aws_instance" "ec2-machine" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"

  tags = {
    Name = "HelloWorld"
  }
}
The output would be something like this:
[
  {
    "AssertionMessage": "__name__(ec2-machine) should match ^[a-z][a-z0-9_]{0,62}[a-z0-9]$",
    "Category": "resource",
    "CreatedAt": "2020-04-04T04:04:04Z",
    "Filename": "main.tf",
    "LineNumber": 1,
    "ResourceID": "ec2-machine",
    "ResourceType": "aws_instance",
    "RuleID": "TF_RESOURCE_NAMING_CONVENTION",
    "RuleMessage": "Resource name should be: not more 64 chars, starts with letter, doesn't have dash, and ends with letter or number",
    "Status": "WARNING"
  }
]
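Since the report is plain JSON, it's easy to post-process in a pipeline, for example to treat warnings and failures differently. Here is a rough sketch in Go that reads only a few of the fields shown in the report above; the gate logic (ShouldFail and its strict flag) is a hypothetical policy of my own, not something config-lint provides.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding mirrors a subset of the fields in the report shown above.
type Finding struct {
	RuleID       string `json:"RuleID"`
	ResourceType string `json:"ResourceType"`
	ResourceID   string `json:"ResourceID"`
	Filename     string `json:"Filename"`
	LineNumber   int    `json:"LineNumber"`
	Status       string `json:"Status"`
}

// CountByStatus parses a JSON report (an array of findings) and counts the
// findings per status, for example "WARNING" or "FAILURE".
func CountByStatus(report []byte) (map[string]int, error) {
	var findings []Finding
	if err := json.Unmarshal(report, &findings); err != nil {
		return nil, fmt.Errorf("parsing report: %w", err)
	}
	counts := make(map[string]int)
	for _, f := range findings {
		counts[f.Status]++
	}
	return counts, nil
}

// ShouldFail is a hypothetical gate: fail the pipeline on any FAILURE, and
// on warnings only when strict is true.
func ShouldFail(counts map[string]int, strict bool) bool {
	if counts["FAILURE"] > 0 {
		return true
	}
	return strict && counts["WARNING"] > 0
}

func main() {
	report := []byte(`[{"RuleID": "TF_RESOURCE_NAMING_CONVENTION", "ResourceType": "aws_instance",
		"ResourceID": "ec2-machine", "Filename": "main.tf", "LineNumber": 1, "Status": "WARNING"}]`)
	counts, err := CountByStatus(report)
	if err != nil {
		panic(err)
	}
	fmt.Println("warnings:", counts["WARNING"], "fail:", ShouldFail(counts, false))
}
```

With the single warning from the example, the non-strict gate lets the pipeline pass, while a strict gate would block it.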
config-lint is a strong and flexible tool: you can write all the rules you need to ensure the quality and consistency of your Terraform IaC all the time.
5. Secure
Securing is about security practices. Shift-left has gained a lot of popularity in the software industry over the last 10 years, and security wasn't an exception. In fact, the concept has always been there in the security field; however, like many other software practices, the old way didn't work; it was just blocking and delaying the software development process.
The more teams shift left, the more they adopt what's known as DevSecOps, which simply means including security as part of the software life cycle instead of treating it as an afterthought. Hence, more tools have appeared to cover this area and ensure that Terraform IaC follows security best practices.
The best tool so far for that purpose is TFSec, an open-source static security scanner for Terraform files. It has many great features: it's superfast, produces great to-the-point reports, covers all major cloud providers, integrates well with CI pipelines, and much more. It's simply built for humans, not aliens!
Now let's run TFSec against an example configuration that contains a Google Cloud SQL resource:
docker run --rm -it -v "$(pwd):/src" tfsec/tfsec /src
The output would be something like this:
[REDUCED]

Result 8

  [google-sql-no-public-access][HIGH] Resource 'google_sql_database_instance.postgres' authorizes access from the public internet
  /src/sql.tf:26

    23 |     }
    24 |
    25 |     authorized_networks {
    26 |       value = "0.0.0.0/0"    string: "0.0.0.0/0"
    27 |       name = "internet"
    28 |     }
    29 |   }

  Impact:     Public exposure of sensitive data
  Resolution: Remove public access from database instances

  More Info:
  - https://tfsec.dev/docs/google/sql/no-public-access#google/sql
  - https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/sql_database_instance
  - https://www.cloudconformity.com/knowledge-base/gcp/CloudSQL/publicly-accessible-cloud-sql-instances.html

  times
  ------------------------------------------
  disk i/o             214.659µs
  parsing HCL          5.766µs
  evaluating values    47.545µs
  running checks       1.92192ms

  counts
  ------------------------------------------
  files loaded         1
  blocks               1
  modules              0

  results
  ------------------------------------------
  critical             0
  high                 2
  medium               6
  low                  0
  ignored              0

  8 potential problems detected.
It's to the point: it shows exactly where the problem is and how it could affect you. TBH, TFSec is a great tool that should be integrated with your local IDE as well as every CI pipeline. It will enhance your security by default.
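In a CI pipeline you usually don't want every finding to block a merge, only those above a certain severity. As a rough illustration of that idea, here is a small Go sketch that takes the severity counts from the "results" summary above and counts how many are at or above a chosen level. The Blocking helper and the way the counts are entered by hand are assumptions for the example; in practice the numbers would come from the scanner's report.

```go
package main

import (
	"fmt"
	"strings"
)

// severityRank orders the severity labels that appear in the report summary.
var severityRank = map[string]int{"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

// Blocking returns how many findings are at or above the minimum severity.
// counts maps a severity label to its number of findings; labels are matched
// case-insensitively and unknown labels (such as "ignored") are skipped.
func Blocking(counts map[string]int, minimum string) (int, error) {
	threshold, ok := severityRank[strings.ToUpper(minimum)]
	if !ok {
		return 0, fmt.Errorf("unknown severity %q", minimum)
	}
	total := 0
	for label, n := range counts {
		if rank, known := severityRank[strings.ToUpper(label)]; known && rank >= threshold {
			total += n
		}
	}
	return total, nil
}

func main() {
	// Counts copied by hand from the summary of the example run above.
	counts := map[string]int{"critical": 0, "high": 2, "medium": 6, "low": 0, "ignored": 0}
	n, err := Blocking(counts, "HIGH")
	if err != nil {
		panic(err)
	}
	fmt.Println("blocking findings:", n)
}
```

With a threshold of HIGH, only the 2 high findings from the example would block the pipeline, while the 6 medium ones would be reported without failing the build.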
6. Test
Testing is about functionality. For example, when you write a Terraform module to create an S3 bucket with a certain policy, you want to be sure it actually has created everything as expected.
In programming languages, there are unit tests and integration tests. In unit tests, parts of the code are tested individually (most of the time at the function level). In integration tests, code units are combined and tested as a whole.
In infrastructure as code, pure unit tests don't make much sense because IaC is about interacting with external systems. So most of the time, testing IaC is a mix of unit and integration testing, where, for example, modules are tested alone and as part of a bigger system.
As I mentioned before, most of the time this kind of testing is for producer/upstream (e.g. when you develop a Terraform module for public use) not for consumers (when you use Terraform in your daily work).
As a SaltStack formula maintainer, I have had experience with that before. 2 years ago I wrote about KitchenCI and testing Infrastructure as Code. For Terraform, there are 2 main options for testing.
6.1 Terratest
Terratest is a Go library by Gruntwork (the company behind Terragrunt) that helps you write tests for Terraform IaC.
What I like about it:
- Very flexible.
- Besides Terraform, it supports other systems like Packer, Docker, and Kubernetes.
What I don't like about it:
- Not declarative.
- You need to write real code, and sometimes a lot of it!
6.2 Kitchen-Terraform
Kitchen-Terraform is a KitchenCI plugin for testing the Terraform IaC. It's simply a driver for KitchenCI to run and apply Terraform, then test the outcome using InSpec.
What I like about it:
- Easy to use as part of KitchenCI system.
- Semi declarative. Because it uses KitchenCI and InSpec.
What I don't like about it:
- Ruby code! I just don't like Ruby stuff u_u
- You still need to understand and deal with KitchenCI and InSpec (which isn't that bad after all).
...
If I develop a Terraform module, I will probably choose Terratest. Let's take a look at an example. I will just copy the example from its website.
Here is the "output.tf" file:
output "hello_world" {
  value = "Hello, World!"
}
And here is a test where it applies the module "terraform-hello-world-example" and then checks the output.
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestTerraformHelloWorldExample(t *testing.T) {
	terraformOptions := &terraform.Options{
		TerraformDir: "../examples/terraform-hello-world-example",
	}

	defer terraform.Destroy(t, terraformOptions)

	terraform.InitAndApply(t, terraformOptions)

	output := terraform.Output(t, terraformOptions, "hello_world")
	assert.Equal(t, "Hello, World!", output)
}
As shown, it's just pure code! However, when you make a module consumed by many users, it becomes more important to have something like this to ensure the quality of your IaC.
7. Apply
So at this point all checks should have passed, and the actual change needs to be applied. Here comes GitOps, which is a way to manage operational workflows using Git. It's the final part of the pipeline: continuous delivery.
Atlantis is a GitOps tool to automate Terraform. Simply put, it watches a Git repo and waits for changes made via pull requests in a GitOps style. Then it runs terraform plan, and if the change looks good, and after confirmation (or not!), it runs terraform apply.
Atlantis provides better visibility on the pull request, which helps with collaboration and standardization of the Terraform workflow.
I will not dive too deep into Atlantis because it's more about implementation, but it's the way to go for automating Terraform.
8. Conclusion
- Validating, formatting, linting, and securing Terraform IaC are a mandatory part of any CI pipeline nowadays. Especially linting when you have a bigger team (actually just > 2).
- On the other hand, testing is more about upstream, when you develop Terraform modules or when you have a strict working environment (TBH I don't know any, banks maybe?).
- Finally, changes are applied as part of the CD pipeline in a GitOps fashion.
Happy Terraforming :-)