Insights | NextLink Labs

Using Terraform Workspaces to Keep Infrastructure Consistent Across Environments

Written by Alex Podobnik | Apr 15, 2026 1:04:34 AM

Infrastructure drift is something we run into constantly when starting a new client engagement. Production looks nothing like staging. Staging looks nothing like dev. Nobody can tell you exactly when or why they diverged. Usually it's a mix of hotfixes applied directly in prod, a "quick change" made through the console, and a staging environment nobody touched for three months.

The fix isn't telling people to be more careful. It's removing the conditions that allow drift in the first place. Terraform workspaces are a big part of how we do that.

What Terraform Workspaces Actually Do

A workspace is just a named state file. That's it. Instead of one terraform.tfstate, you get one per workspace, all managed from the same configuration. (With the local backend, the state for each non-default workspace lives under terraform.tfstate.d/.)

You create and switch between them like this:

terraform workspace new staging
terraform workspace select staging
terraform workspace list


The value of this becomes obvious once you start referencing terraform.workspace inside your config:

locals {
  env = terraform.workspace

  instance_type = {
    dev     = "t3.small"
    staging = "t3.medium"
    prod    = "m5.large"
  }[local.env]
}


One block, three environments, zero separate codebases to keep in sync.
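Downstream resources then pick up the per-environment value without any conditionals. A minimal sketch of what that consumption looks like; the AMI ID, resource name, and tags here are placeholders, not part of the original setup:

```hcl
# Hypothetical example: a resource consuming the workspace-derived values.
# The AMI ID and naming convention are illustrative placeholders.
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder
  instance_type = local.instance_type     # t3.small / t3.medium / m5.large

  tags = {
    Name        = "app-${local.env}"
    Environment = local.env
  }
}
```

Run a plan in the prod workspace and this resource sizes itself to m5.large; switch to dev and the same code plans a t3.small.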

Why This Works Well for Client Deliverables

When we hand off infrastructure to a client, they need to trust that what's running in production is structurally the same as what they tested in staging. With workspaces, we can back that up. There's one module, one pipeline, and the only legitimate differences between environments are the ones that are explicitly written down.

Changes flow in sequence: dev first, then staging, then prod. If something breaks in dev, it doesn't reach prod. That sounds obvious, but it's remarkable how many teams skip this entirely because their environments live in separate repos that have quietly diverged.

It also makes handoffs cleaner. Instead of giving a client three Terraform projects with subtle differences nobody documented, they get one repository with a clear workspace convention they can actually maintain.

Where State Lives: Terraform Cloud and the Alternatives

Workspaces are only useful if the state backend is solid. We use Terraform Cloud for most engagements. Remote state, state locking, run history, and access controls are all there out of the box. For client work where auditability matters, having every plan and apply logged in one place is genuinely useful.
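Wiring a configuration to Terraform Cloud is a small block in the root module. A sketch, assuming Terraform 1.1 or later; the organization name and workspace tag are placeholders:

```hcl
# Sketch: Terraform Cloud as the state backend.
# "example-org" and the "client-infra" tag are placeholders.
terraform {
  cloud {
    organization = "example-org"

    workspaces {
      tags = ["client-infra"]
    }
  }
}
```

With this in place, the CLI workspace commands map onto Terraform Cloud workspaces carrying that tag, so the rest of the workflow is unchanged.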

A couple of alternatives worth knowing about:

Spacelift is worth a look if a client has serious policy requirements or needs drift detection as part of their workflow. Its Open Policy Agent (OPA) based policy layer is more flexible than Terraform Cloud's Sentinel, particularly if you're already using OPA elsewhere in your stack.


env0 makes sense when a client wants cost visibility per environment without building it themselves. The per-workspace cost tracking integrates well into conversations about cloud spend.


Which platform makes sense depends on your team's needs and what you're already working with. The pipeline pattern below works with any of these backends. You're only changing where state is stored, not how anything runs.

The GitLab Pipeline

Nobody on our team runs terraform apply from a local machine against a client environment. Everything goes through GitLab CI/CD. This keeps a full audit trail, enforces the validate/plan/apply sequence, and puts a manual gate in front of every apply.

Here's the pipeline:

image:
  name: hashicorp/terraform:light
  entrypoint:
    - '/usr/bin/env'
    - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

variables:
  TF_IN_AUTOMATION: "true"

before_script:
  - export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
  - export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
  - terraform init
  - terraform workspace select $WORKSPACE_NAME || terraform workspace new $WORKSPACE_NAME

stages:
  - validate
  - plan
  - apply

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -var-file="environments/${WORKSPACE_NAME}.tfvars" -out planfile
  artifacts:
    paths:
      - planfile
  needs:
    - validate

apply:
  stage: apply
  script:
    - terraform apply -input=false planfile
  needs:
    - plan
  when: manual


A few things worth explaining:

WORKSPACE_NAME is a GitLab CI/CD variable we set per branch or environment. The dev branch sets it to dev, main sets it to prod. The pipeline itself never changes.

The select || new line is there so the pipeline is safe on first run. If the workspace doesn't exist yet, it gets created. After that, it just selects. No manual setup is required before running the pipeline for the first time.

when: manual on the apply job means plans run automatically on every push, but no infrastructure actually changes until someone clicks the button. For client environments, that button usually belongs to a specific person or role.

TF_IN_AUTOMATION tells Terraform it's running inside a CI system, which suppresses some interactive output and makes logs easier to read.

The planfile artifact carries the exact plan from the plan job into the apply job. Without it, there's a window where state could change between planning and applying, and you'd apply against something different than what you reviewed. The artifact closes that gap.
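One way to wire WORKSPACE_NAME to branches without maintaining per-branch pipeline files is GitLab's workflow rules. A sketch, assuming branches named dev, staging, and main (the branch names are assumptions, not from the original setup):

```yaml
# Hypothetical branch-to-workspace mapping via workflow rules.
# Branch names are illustrative; adjust to your branching model.
workflow:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      variables:
        WORKSPACE_NAME: "prod"
    - if: '$CI_COMMIT_BRANCH == "staging"'
      variables:
        WORKSPACE_NAME: "staging"
    - if: '$CI_COMMIT_BRANCH == "dev"'
      variables:
        WORKSPACE_NAME: "dev"
```

Setting the variable directly in the GitLab UI works just as well; the rules approach simply keeps the mapping in version control.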

Keeping Environments Tidy with tfvars

The module structure we use for most client engagements looks like this:

infra/
├── main.tf
├── variables.tf
├── outputs.tf
└── environments/
    ├── dev.tfvars
    ├── staging.tfvars
    └── prod.tfvars


Instance sizes, replica counts, domain names, and anything else that legitimately differs between environments goes in the .tfvars files. The workspace handles state isolation. The .tfvars file handles sizing. They do different jobs and neither one bleeds into the other.
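The .tfvars files themselves stay small and easy to diff. A hypothetical pair, with illustrative variable names and values:

```hcl
# environments/dev.tfvars (illustrative values)
replica_count = 1
domain_name   = "dev.example.com"

# environments/prod.tfvars (illustrative values)
replica_count = 3
domain_name   = "example.com"
```

The plan job's -var-file flag is what ties each workspace to its file, so a diff between any two environments is a diff between two short files.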

What This Actually Fixes

After running this setup across a number of client engagements, a few benefits show up consistently.

Drift stops being a slow-motion problem. Since every environment comes from the same code, the only differences are the ones in .tfvars or explicit workspace conditionals. A manual change someone made through the console gets overwritten on the next apply.

Spinning up a new environment is fast. Adding a QA environment for a specific feature branch means creating a workspace and setting a variable, not duplicating a project and auditing it for differences.

The audit trail is actually useful. When a client asks what changed on a specific date, the answer is in the GitLab pipeline history with the plan output attached.

Handoffs are simpler. We hand over one repo, documented .tfvars files, and a pipeline the client's team can run without us explaining the whole setup each time.

A Note on Where Workspaces Fall Short

This setup works well for most engagements, but it's worth being honest about where it doesn't. Very large deployments with meaningfully different architectures per environment are better served by separate root modules. If prod has components that simply don't exist in dev, trying to manage that through workspace conditionals gets messy fast. In those cases, the shared module pattern with environment-specific root modules is cleaner.
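For those larger cases, the shared-module layout puts the common components in a module and gives each environment its own root module, which can then add pieces the others lack. A rough sketch; the module names and paths are illustrative:

```hcl
# environments/prod/main.tf (illustrative)
# Shared components, same module dev and staging use.
module "app" {
  source        = "../../modules/app"
  instance_type = "m5.large"
}

# A prod-only component with no dev equivalent lives only in this root module,
# instead of hiding behind a workspace conditional.
module "waf" {
  source = "../../modules/waf"
}
```

Each root module keeps its own state, so the environments diverge deliberately and visibly rather than through conditionals.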

For the majority of what we build, though, the single-module workspace approach is the right call.

NextLink Labs specializes in platform engineering and DevOps delivery. If you're dealing with environment drift or want a cleaner infrastructure foundation, get in touch at nextlinklabs.com.