March 15, 2026 · Anton Grishko
Knowledge graphs are the missing piece for AI in your infra
Vector search is great for unstructured docs. Infrastructure isn't unstructured. Here's why a graph beats embeddings for IaC.
TL;DR — Vector search and RAG are great for unstructured docs. Infrastructure isn't unstructured — it's a graph. Modules depend on modules, services on services. We model the IaC repo as nodes and edges so agents can do real transitive walks instead of similarity guesses.
The bet most infra-AI products are making is wrong
The dominant pattern: dump your repo, runbooks, and Confluence into a vector store. When the user asks a question, retrieve the top-k similar chunks and stuff them into the LLM's context. Call it RAG.
This works well for unstructured docs. It fails for infrastructure.
The reason: infrastructure is a graph, not a document. Modules depend on modules. Services depend on services. Network policies declare allowed traffic. IAM roles grant access to specific resources. Helm charts compose values from other charts. When you ask "what breaks if I change shared-vpc.json," the answer is a transitive dependency walk, not a similarity search.
What a graph buys you
Three operations vector search can't do:
1. Transitive dependency. "What modules depend on vpc?" — that's a graph traversal of one or more hops. Vector search returns "modules that mention vpc," which is a different question.
2. Path queries. "How is payments-api connected to cnpg-cluster?" — the answer is a sequence of edges (Service → ServiceMonitor → Prometheus → Alert → Runbook → Database). A graph returns the path. Vector search returns documents containing both names, which usually misses intermediate hops.
3. Symmetry/asymmetry checks. "Is prod-vpc-cidr configured the same way as dev-vpc-cidr?" A graph compares structured fields. Vector search compares text similarity, which conflates "same value" with "similar wording."
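To make the three operations concrete, here is a minimal Python sketch over a toy in-memory edge list. The module names, the edge list, and the helper functions (transitive_dependents, shortest_path, field_diff) are illustrative assumptions, not part of kuberly-graph; a real deployment would run equivalent queries in the graph database.

```python
from collections import deque

# Hypothetical edges as (dependent, dependency) pairs. Illustrative only.
DEPENDS_ON = [
    ("eks", "vpc"),
    ("rds", "vpc"),
    ("payments-api", "eks"),
    ("payments-api", "rds"),
]


def transitive_dependents(edges, target):
    """Return every node that depends on `target`, directly or indirectly."""
    reverse = {}
    for dependent, dependency in edges:
        reverse.setdefault(dependency, set()).add(dependent)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def shortest_path(edges, start, goal):
    """Return one shortest directed path from `start` to `goal`, or None."""
    forward = {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
    parents, queue = {start: None}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in forward.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def field_diff(a, b):
    """Compare two structured configs field by field; return differing keys."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

The first function answers "what depends on vpc" across any number of hops, the second returns the intermediate hops between two nodes, and the third reports exactly which fields differ instead of how similar two blobs of text look.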
The kuberly-graph schema
We model an IaC repo as nodes and edges:
Nodes:
module(name, source, version)
resource(type, name, module)
output(name, value_type, module)
variable(name, type, module)
config_file(path, env)
workload(name, namespace, kind)
Edges:
depends_on (module → module, from terragrunt dependency blocks)
produces (module → output)
consumes (module → output, from inputs.var = dependency.X.outputs.Y)
configured_by (workload → config_file)
declared_in (resource → module)
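One way to picture this schema in code is a pair of small Python dataclasses with a table of allowed edge endpoints. This is a sketch under assumptions: the class names, the `key` field, and the validation logic are hypothetical, and node properties such as source or version are left out for brevity.

```python
from dataclasses import dataclass

# Allowed (source kind, target kind) per edge type, copied from the schema above.
EDGE_RULES = {
    "depends_on": ("module", "module"),
    "produces": ("module", "output"),
    "consumes": ("module", "output"),
    "configured_by": ("workload", "config_file"),
    "declared_in": ("resource", "module"),
}


@dataclass(frozen=True)
class Node:
    kind: str  # module, resource, output, variable, config_file, workload
    key: str   # hypothetical unique id, e.g. "module:vpc"


@dataclass(frozen=True)
class Edge:
    kind: str
    src: Node
    dst: Node

    def __post_init__(self):
        # Reject edges whose endpoints do not match the schema.
        expected = EDGE_RULES.get(self.kind)
        if expected is None:
            raise ValueError(f"unknown edge kind: {self.kind}")
        if (self.src.kind, self.dst.kind) != expected:
            raise ValueError(
                f"{self.kind} must connect {expected[0]} -> {expected[1]}, "
                f"got {self.src.kind} -> {self.dst.kind}"
            )
```

Validating endpoints at construction time keeps malformed edges out of the graph, for example a configured_by edge between two modules.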
The graph builds from:
- Parsing terragrunt.hcl files for dependency blocks → depends_on
- Parsing the inputs block for dependency.X.outputs.Y references → consumes
- Parsing Terraform plan JSON for resource and output declarations
- Parsing rendered Helm output for workload and label-based associations
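The first two parsing steps can be sketched in Python with regular expressions. This is a deliberate simplification and not the actual kuberly-graph parser: a production build would use a proper HCL parser, the sketch scans the whole file instead of only the inputs block, and it treats the dependency block label as the module name (in Terragrunt the label is a local alias and config_path identifies the module). The function name extract_edges and the sample file are hypothetical.

```python
import re

# Matches: dependency "vpc" {
DEPENDENCY_BLOCK = re.compile(r'^\s*dependency\s+"([^"]+)"\s*\{', re.MULTILINE)
# Matches: dependency.vpc.outputs.vpc_id
OUTPUT_REF = re.compile(
    r"\bdependency\.([A-Za-z_][\w-]*)\.outputs\.([A-Za-z_][\w-]*)"
)

SAMPLE = '''
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}
'''


def extract_edges(module_name, hcl_text):
    """Return (depends_on, consumes) edge lists for one terragrunt.hcl file."""
    depends_on = [
        (module_name, dep) for dep in DEPENDENCY_BLOCK.findall(hcl_text)
    ]
    consumes = sorted(
        {(module_name, f"{dep}.{out}") for dep, out in OUTPUT_REF.findall(hcl_text)}
    )
    return depends_on, consumes


if __name__ == "__main__":
    deps, outs = extract_edges("eks", SAMPLE)
    print(deps)
    print(outs)
```

Each dependency block yields one depends_on edge, and each distinct dependency.X.outputs.Y reference yields one consumes edge to the output X.Y.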
The graph refreshes on every IaC commit, and lookups are cached for about one minute. For the storage and query shape (we use Memgraph and Cypher), see Teaching an Agent to Think in Graphs.
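The refresh-and-cache behavior could look roughly like the following time-to-live cache. This is an assumption about shape only: the TTLCache class, its loader callback, and the 60-second default are hypothetical stand-ins for whatever the real service does.

```python
import time


class TTLCache:
    """Cache lookups for a fixed number of seconds, then reload on demand."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # function that fetches a fresh value
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value

    def invalidate(self):
        """Drop everything, e.g. after a rebuild triggered by a new commit."""
        self._entries.clear()
```

A commit hook would call invalidate() after rebuilding the graph, so a stale answer can survive at most one TTL window even if the hook is missed.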
What agents do with it
> show blast_radius for shared-infra.json
shared-infra.json
↳ vpc (consumes)
  ↳ eks (consumes vpc.outputs)
    ↳ argo-rollouts (deployed to eks)
    ↳ external-secrets (deployed to eks)
    ↳ kuberly-web (deployed to eks)
  ↳ rds (consumes vpc.outputs)
    ↳ cnpg-pooler (uses rds endpoint)
      ↳ payments-api (configured_by cnpg-pooler)
Total downstream resources: 47
Direct downstream modules: 5
Risk: HIGH — VPC change requires VPC reschedule of all EKS nodes
That's a query over the graph. The agent didn't just describe the blast radius — it returned the actual list. The user can act on it. For the broader topology that now spans Terraform code, state, live Kubernetes, ArgoCD, CUE, and docs, see One graph, every source.
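A blast-radius listing like the one above can be produced by a depth-first walk over downstream edges. The sketch below is a toy Python version, assuming a hypothetical adjacency map that mirrors the example; the real query runs against the graph database, and the counts here describe only this toy graph, not the 47 resources reported above.

```python
# Hypothetical downstream adjacency: node -> [(child, edge label)].
DOWNSTREAM = {
    "shared-infra.json": [("vpc", "consumes")],
    "vpc": [("eks", "consumes vpc.outputs"), ("rds", "consumes vpc.outputs")],
    "eks": [
        ("argo-rollouts", "deployed to eks"),
        ("external-secrets", "deployed to eks"),
        ("kuberly-web", "deployed to eks"),
    ],
    "rds": [("cnpg-pooler", "uses rds endpoint")],
    "cnpg-pooler": [("payments-api", "configured_by cnpg-pooler")],
}


def blast_radius(children, root):
    """Walk downstream from `root`; return (tree_lines, direct, total)."""
    lines, seen = [root], {root}

    def walk(node, depth):
        for child, label in children.get(node, []):
            if child in seen:
                continue  # skip nodes already listed (shared deps, cycles)
            seen.add(child)
            lines.append(f"{'  ' * depth}↳ {child} ({label})")
            walk(child, depth + 1)

    walk(root, 0)
    direct = len(children.get(root, []))
    return lines, direct, len(seen) - 1


if __name__ == "__main__":
    tree, direct, total = blast_radius(DOWNSTREAM, "shared-infra.json")
    print("\n".join(tree))
    print(f"Direct downstream: {direct}, total downstream: {total}")
```

Because the walk returns the actual node list, the caller can hand it to a reviewer or a CI gate instead of a prose summary.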
When vector search is the right tool
- Unstructured docs: postmortems, runbooks, design docs
- Searching for "the time we had this same issue 6 months ago"
- Onboarding lookup: "where is the deploy guide"
For those, embeddings + RAG are great. We use them.
But for "what does this infra change touch" and "how is X reachable from Y," nothing beats a real graph. That's the difference between an AI that talks about your infra and an AI that reasons about your specific infra.
Further reading
- Memgraph documentation — in-memory graph database, Cypher.
- openCypher specification — the open query language.
- Knowledge graph (Wikipedia) — concept primer.
- Pinecone — vector database explained — when embeddings are the right tool.
- Original RAG paper — Lewis et al., 2020.
- One graph, every source — what kuberly-graph sees now — six sources, one graph.
- Teaching an Agent to Think in Graphs — the agent architecture.
Want a graph-grounded agent on your infra? Talk to us.