There are many ways to build infrastructure in public using Terraform. In this article, I have attempted to gather the Best Practices that an engineer can adhere to while writing Terraform code. By following these best practices, the resultant codebase can be easily manageable, reusable, and modular.
Table of Contents:
- Defining the folder structure
- Isolate Environments
- Using Terraform Workspaces
- Proper Naming Convention
- Using Shared Modules
- Upgrading to the Latest Version of Terraform
- Backup System State
- Validate and Format Terraform Code
- Lock State File
- Generate documentation with terraform-docs
- Using the self Variable
- Minimize Blast Radius
- Using -var-file flag
Defining the folder structure
It is important to lay down a proper folder structure for our terraform code to get the most out of it. Here we will look at the folder structure we followed in the article Building highly available VMSS on Azure using Terraform Modules. The
modules folder contains the modules within each sub folder named after the name of the
module/function that it would perform. Each module would have a
variables.tf files. The environments folder stores the environment-specific configuration. In this example, we have a dev, stage and prod environment, each with a
<environment-name>.tfvars file to store the override values. Finally, in the root module, we have the
variables.tf files. The folder structure would look something like the one given below:
For this example, we will use the folder called
modules as a container for all the separate modules we will use throughout this article. The folder named
vnet will be the place we’ll use to store the
variables.tf files which hold the configuration for the
vnet that the rest of the resources will use. Likewise, a
subnet folder will be hosting similar files to our
vnet folder. The
subnet module will create the subnet(s) that the resources will use. Finally, we have the
vmss folder that holds the configuration for the Azure Virtual Machine Scale Set. Within it, we create the required resources like public IP, load balancers, an address pool, lb health check probe and rules to expose the ports on the load balancer. The environments folder has the folders corresponding to the environment names; we will use the
<environment-name>.tfvars files to create the resources based on the environment in question. The environment configuration in the
tfvars files in the
environment folders will be used in the commands like
apply to override the
terraform.tfvars stored in the root module. The
backend.tf defines our cloud provider for the project. Since we are using Microsoft Azure for this illustration, our provider name will be
azurerm. We will be using the
required_providers to define the proper versions.
Sometimes, developers like to create a security group and share it with all non-prod (dev/QA) environments. It is better to create resources with different names for each environment and each resource.
With that, we will quickly define the resource with a meaningful and unique name, and we can build more of the same application stack for different developers without changing a lot. For example, we update the environment to dev, staging, QA, prod, etc.
Some cloud providers have resource names with length limits, such as less than 24 characters, so it is better to use a short name when defining variables for our application and environment name.
Using Terraform Workspaces
Workspaces each have state files, and as such, they provide isolation between them. When working in one workspace, changes will not affect resources in another workspace. When managing separate environments and large deployments, this separation is critical for peace of mind.
What is a Terraform Workspace?
We can think of workspaces as a layer of isolation for Terraform state files. Every workspace has its state file. All modifications applied in a particular workspace will never affect resources managed by another. Workspaces are the key to managing multiple independent environments using a single Terraform project.
We can use the current workflow’s name in our configuration files by using the
terraform.workspace variable when using workspaces in Terraform.
A fairly common use case is to create scoped names for resources. For example, consider using Terraform to provision an Azure Resource Group. Azure enforces a unique name constraint on Resource Groups in the context of your Azure subscription. (We can have only one Resource Group with the name cool-product). By appending the name of the current workspace (e.g. dev, test, or prod), we enforce the unique name constraint.
Terraform Workspaces in Practice
Creating Workspaces for environments
Every Terraform project comes with a workspace out of the box. When we execute the
terraform init, Terraform implicitly creates a new workspace. The name of this workspace is always
default. However, we can quickly create a custom workspace using the terraform workspace the
Creating a workspace from an existing state file
You can bring in workspaces also for existing Terraform projects. For example, if we already have a state file, use the state option to copy the existing state into the new workspace.
Display the current workspace
To know which workspace we are currently interacting with, execute
terraform workspace show. Terraform will quickly print the name of the current workspace.
List all available workspaces
We can list the available workspaces using the
How to switch workspaces
Switching to a different namespace is super easy in Terraform. Just execute the
terraform workspace select command, followed by the name of the desired workspace.
Remove a workspace
Sometimes, we want to remove a specific workspace. The sub-command
delete is responsible for that. But, first, delete the staging environment and check the list of remaining workspaces. There is only one exception; we can not delete the
Proper Naming Convention
Naming conventions are used in Terraform to make things easily understandable.
For example, let’s say we want to make three different workspaces for different environments in a project. So, rather than naming them env1, en2, or env3, we should call them a dev, stage, or prod. From the name, it becomes clear that three different workspaces exist for each environment.
A proper naming convention should be followed for resources, variables, modules, etc. For example, the resource name in Terraform should start with a provider name followed by an underscore and other details.
For example, the resource name for creating a terraform object for a Public IP in Azure would be
So, if we follow the naming conventions right, it will be easier to understand even complex codes.
Using Shared Modules
Using the official Terraform modules available is strongly suggested. No need to reinvent a module that already exists. It saves a lot of time and pain. Terraform registry has plenty of modules readily available. Make changes to the existing modules as per the need.
Also, each module should concentrate on only one aspect of the infrastructure, such as creating an Azure Virtual Machine instance, setting MySQL database, etc.
For example, if you want to use Azure VNet in your terraform code, we can use:
Upgrading to the Latest Version of Terraform
The Terraform development community is active, and new functionalities are released frequently. Therefore, staying on the latest version of Terraform is recommended when a new major release happens. You can easily upgrade to the latest version.
If you skip multiple major releases, upgrading will become very complex.
terraform -v command to check for a new update.
Your version of Terraform is out of date! The latest version is 0.12.0. You can update by downloading Terraform
Backup System State
Always backup the state files of Terraform.
These files keep track of the metadata and resources of the infrastructure. By default, these files called
terraform.tfstate are stored locally inside the workspace directory.
Terraform cannot determine the resources deployed on the infrastructure without these files. So, it is essential to have a backup of the state file. So, the
terraform.tfstate.backup will be created to keep a backup of the state file.
If you want to store a backup state file in another location, use the
-backup flag in the terraform command and give the location path.
Most of the time, multiple developers will work on a project. So, to give them access to the state file, it should be stored at a remote location using a
terraform_remote_state data source.
The following example will take a backup to the Storage Account.
Validate and Format Terraform Code
Consistently execute the
terraform fmt command to format Terraform configuration files and make them neat.
We can use the code below in a Travis CI pipeline (you can re-use it in any pipelines) to validate and format check the code before we can merge it with the
Lock State File
There can be multiple scenarios where more than one developer tries to run the terraform configuration simultaneously. Such a scenario can lead to the corruption of the terraform state file or even data loss. The locking mechanism helps to prevent such scenarios. It makes sure that at a time, only one person is running the terraform configurations, and there is no conflict.
Here is an example of locking the state file at a remote location in Azure.
When multiple users try to access the state file, Azure Storage Account will be used for state locking and maintaining consistency.
Note: not all backend support locking.
Generate documentation with terraform-docs
We need not manually manage the usage of input variables and outputs. Instead, a tool named terraform-docs can do the job for us.
Currently, the original terraform-docs don’t support Terraform
0.12+. Follow this Github issue for updating.
Now we have a workaround.
For details on how to run terraform-docs, check this repository.
self variable is a special variable used when you don’t know the variable’s value before deploying an infrastructure.
Let’s say you want to use the IP address of an instance which will be deployed only after the
terraform apply command, so you don’t know the IP address until it is up and running.
In such cases, you use self variables, and the syntax to use it is
self.ATTRIBUTE. So, in this case, you will use
self.ipv4_address as a self variable to get the IP address of the instance.
These variables are only allowed on connection and provisioner blocks of terraform configuration.
Minimize Blast Radius
The blast radius is nothing but the measure of damage that can happen if things do not go as planned.
For example, what will be the amount of damage to the infrastructure if you are deploying some terraform configurations on the infrastructure and the configuration does not get applied correctly.
To minimize the blast radius, I suggest pushing a few configurations on the infrastructure at a time. So, if something goes wrong, the damage to the infrastructure will be minimal and can be corrected quickly. However, deploying plenty of configurations at once is very risky.
Next, let’s run a
plan of our infrastructure to see the final Terraform plan once we apply the code. There, however, is an extra variable we need to pass in called the
-var-file. This argument takes the tfvars file based on the environment we selected:
Finally, let’s apply the infrastructure using the apply sub-command: