Terragrunt at scale — live/env/region/stack hierarchy with remote-state refs to shared modules.

April 10, 2026 · Anton Grishko

Why we ship Terragrunt, not raw Terraform

Terraform without Terragrunt at scale is copy-paste with extra steps. Here's what Terragrunt adds and where it bites.

TL;DR — Raw Terraform at scale is copy-paste with extra steps. Terragrunt gives you DRY remote-state configs, an explicit dependency graph, run-all orchestration, and composable inputs from JSON. Every Kuberly customer gets the layout below — and they own it.

What Terragrunt actually solves

The pitch is "DRY Terraform" but that undersells it. The four real wins:

1. One backend config, applied everywhere. With raw Terraform you write terraform { backend "s3" { ... } } in every module. Twenty modules = twenty places to update when your bucket changes. With Terragrunt, one remote_state block in root.hcl generates the backend for every module:

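remote_state {
  backend = "s3"

  # Write a backend.tf into each module so Terraform sees a normal backend block.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }

  config = {
    bucket       = local.aws_env.state_bucket
    # One state key per module, derived from its path relative to root.hcl.
    key          = "${path_relative_to_include()}/terraform.tfstate"
    region       = local.aws_env.region
    use_lockfile = true
  }
}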

Every child terragrunt.hcl includes root.hcl and inherits the backend automatically. Twenty modules, one source of truth.
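For reference, the child side of that inheritance can be as small as the sketch below. It assumes the layout shown later in this post (root.hcl at the top of infrastructure/); the file path in the comment is illustrative.

```hcl
# clouds/aws/modules/vpc/terragrunt.hcl (illustrative path)

# Walk up the directory tree until root.hcl is found and merge it in,
# including its remote_state block.
include "root" {
  path = find_in_parent_folders("root.hcl")
}
```

With that include in place, path_relative_to_include() resolves to the child's path relative to root.hcl, so under this layout the state key would be clouds/aws/modules/vpc/terraform.tfstate.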

2. Real dependency graph. Terragrunt's dependency blocks let one module read another's outputs at plan time without a terraform_remote_state data source. The dependency is explicit, the graph is visible, and run-all plan walks it in topological order:

dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id          = dependency.vpc.outputs.vpc_id
  private_subnets = dependency.vpc.outputs.private_subnets
}

Add a new module that depends on the VPC and Terragrunt picks up the new edge from its dependency block. run-all apply then applies the VPC before anything that consumes its outputs.
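One practical wrinkle: on a stack that has never been applied, the VPC has no outputs yet, so planning a dependent module fails unless the dependency declares mocks. A minimal sketch, assuming an eks module next to vpc; the mock IDs are made-up placeholders:

```hcl
# eks/terragrunt.hcl (illustrative)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

dependency "vpc" {
  config_path = "../vpc"

  # Placeholder values, used only while the VPC has no real outputs.
  mock_outputs = {
    vpc_id          = "vpc-00000000"
    private_subnets = ["subnet-00000000", "subnet-11111111"]
  }

  # Limit mocks to read-only commands so apply never receives fake IDs.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id          = dependency.vpc.outputs.vpc_id
  private_subnets = dependency.vpc.outputs.private_subnets
}
```

Restricting the mocks to validate and plan is a design choice: a plan against mocks is only approximate, but it keeps first-time reviews possible without risking an apply against fake values.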

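3. Composable inputs from JSON. This is what makes the Kuberly pattern work. Each cluster has a components/<env>/ directory of JSON files. The components loader in root.hcl decodes every file with fileset and jsondecode into one map keyed by file name, and every Terragrunt module reads from that map: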

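locals {
  # local.components_dir is defined elsewhere in root.hcl.
  components = {
    for file in fileset(local.components_dir, "*.json") :
    # basename() takes a single argument, so strip the extension separately.
    trimsuffix(basename(file), ".json") => jsondecode(file("${local.components_dir}/${file}"))
  }
}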

So one directory of JSON files per environment, one file per concern, configures every module: ECR, secrets, buildprojects, EKS settings. No HCL for config, just data.
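On the consuming side, a module can reach the decoded map through an exposed include. This is a sketch, not the exact Kuberly wiring: it assumes the map lives in root.hcl locals under the name components, and the ecr.json file and its repositories key are hypothetical.

```hcl
# clouds/aws/modules/ecr/terragrunt.hcl (illustrative)

include "root" {
  path = find_in_parent_folders("root.hcl")
  # expose = true makes root.hcl's locals readable as include.root.locals.
  expose = true
}

inputs = {
  # Hypothetical: assumes components/<env>/ecr.json contains a "repositories" key.
  repositories = include.root.locals.components.ecr.repositories
}
```

Adding a repository then becomes a one-line change to a JSON file, and the HCL stays untouched.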

4. run-all for batch operations. terragrunt run-all plan against the whole stack walks the dependency graph, plans each module in order, and prints the results as one combined output. Useful for code review on a huge change. Less useful in CI (you usually want per-module CI) but the UX is nice. (For what changed in 1.0, see Terragrunt 1.0 — what changed.)

Where it bites

The dependency block is greedy. To read outputs, Terragrunt runs terraform init and terraform output against the dependency module every time you plan the dependent. On a slow network you feel it. Mitigation: use --terragrunt-fetch-dependency-output-from-state (--dependency-fetch-output-from-state in the redesigned CLI) to read outputs from the S3 state file directly.
The locals/inputs split. Terragrunt has both locals (HCL-only) and inputs (passed to Terraform). Where to put a derived value isn't always obvious. Our rule of thumb: locals for transformation, inputs for the result.
Error messages from before_hook/after_hook failures are cryptic. When a hook fails, the actual stderr is buried two levels deep. Always tail the logs.
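The locals/inputs rule of thumb is easier to see in a small example. The names and values below (env, cluster_name, tags) are hypothetical and only illustrate the split:

```hcl
locals {
  # Transformation happens here. Terraform never sees these values directly.
  env          = "prod"
  cluster_name = "kuberly-${local.env}"

  base_tags = {
    managed_by = "terragrunt"
  }
  tags = merge(local.base_tags, { environment = local.env })
}

inputs = {
  # Only the finished results cross the boundary into Terraform variables.
  cluster_name = local.cluster_name
  tags         = local.tags
}
```

Keeping inputs free of expressions means a reader can see what Terraform receives at a glance, with the derivation logic confined to locals.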

The pattern we ship

Every Kuberly customer gets this layout:

infrastructure/
├── root.hcl                  # backend, env vars, components loader
├── components/
│   ├── prod/                 # one JSON per concern
│   └── dev/
└── clouds/aws/modules/
    ├── vpc/
    │   └── terragrunt.hcl    # reads root, defines inputs from components
    ├── eks/
    ├── ecr/
    └── ...

The customer reads it. Branches it. Modifies it. We commit on PRs and the autopilot applies on merge. Standard tooling, no proprietary DSL, no bespoke abstractions.

That's what "You own the IaC" means in practice.
