Most "DevSecOps pipelines" I audit break in the same spot: the checks are there, but they run warn-and-continue. The scanner finds a CRITICAL; the pipeline goes green. Six months later it isn't a pipeline anymore — it's a ritual.
For a security pipeline to actually block production, you need two things: five stages, each with a policy-driven exit-code 1, and a second gate inside the cluster — at admission. One CI gate without admission verification is a false sense of security: CI catches what you built; the cluster accepts what you deployed. Those two sets overlap, but they don't coincide.
Five stages, and why exactly five
Fewer than five leaves a hole; more than five stops being a pipeline and becomes a separate program.
| # | Stage | Tools | Gate |
|---|-------|-------|------|
| 1 | Secret scan | gitleaks / trufflehog / detect-secrets | fail on any match |
| 2 | SAST | Semgrep / SonarQube / CodeQL | fail on new HIGH/CRITICAL |
| 3 | Dependency scan | Trivy fs / Snyk / OWASP DC | fail on CVSS ≥ 7 without VEX |
| 4 | Container scan | Trivy image / Grype | fail on HIGH/CRITICAL CVE without fix |
| 5 | Manifest / IaC | Trivy config / Checkov / Conftest | fail on security misconfig |
The logic is shift-left: the earlier you catch the defect, the cheaper it is. Secrets caught before they enter git history. SAST before the code is built. Dependency scan before a vulnerable package lands in an image. Image scan before push to the registry. Manifest scan before kubectl apply reaches the cluster.
Each stage is its own bucket of ownership. Blurring them into "one big scanner" is the classic mistake: when everything is Trivy, no one owns SAST. When everything is SonarQube, no one owns the CVE database. Different teams, different bugs.
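The gate column of the stage table can be read as one predicate per stage. The sketch below is only a model of that policy, not any scanner's API: Finding, blocks and exit_code are names invented for the example, and it assumes each scanner's output has already been normalized into these fields.

```python
from dataclasses import dataclass

BLOCKING = {"HIGH", "CRITICAL"}


@dataclass(frozen=True)
class Finding:
    stage: str              # "secret", "sast", "dependency", "container", "manifest"
    severity: str           # "LOW", "MEDIUM", "HIGH" or "CRITICAL"
    cvss: float = 0.0       # used by the dependency stage
    has_fix: bool = True    # container stage: upstream has shipped a fix
    has_vex: bool = False   # dependency stage: a VEX statement covers the CVE
    is_new: bool = True     # SAST stage: introduced by this change


def blocks(f: Finding) -> bool:
    """Return True if this finding must fail the build."""
    if f.stage == "secret":
        return True                                   # fail on any match
    if f.stage == "sast":
        return f.is_new and f.severity in BLOCKING    # new HIGH/CRITICAL only
    if f.stage == "dependency":
        return f.cvss >= 7.0 and not f.has_vex        # CVSS >= 7 without VEX
    if f.stage == "container":
        return f.severity in BLOCKING and f.has_fix   # unfixed CVEs do not block
    if f.stage == "manifest":
        return True                                   # any security misconfig
    raise ValueError(f"unknown stage: {f.stage}")


def exit_code(findings: list[Finding]) -> int:
    """Policy-driven exit code: 1 blocks the pipeline, 0 lets it through."""
    return 1 if any(blocks(f) for f in findings) else 0
```

The point of keeping the predicates separate is the ownership argument: each branch can be changed by the team that owns that stage without touching the others.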
Stage 1: secrets, and not only in git
The git-level baseline is gitleaks in a pre-commit hook plus detect-secrets with a baseline in CI. The baseline matters: without it, every legitimate AKIA* in a test breaks the build.
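To make the baseline idea concrete, here is a minimal sketch of how a baseline turns "any match" into "any new match". It is not the detect-secrets file format or API: the single regex, the fingerprint scheme and the function names are assumptions made for the example.

```python
import hashlib
import re

# AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def fingerprint(path: str, secret: str) -> str:
    """Hash path + secret so the baseline never stores the secret itself."""
    return hashlib.sha256(f"{path}:{secret}".encode()).hexdigest()


def scan(files: dict[str, str]) -> set[str]:
    """Return fingerprints of every key-shaped string in {path: content}."""
    found = set()
    for path, text in files.items():
        for match in AWS_KEY_ID.finditer(text):
            found.add(fingerprint(path, match.group()))
    return found


def new_findings(files: dict[str, str], baseline: set[str]) -> set[str]:
    """Findings that are not in the reviewed baseline; non-empty means fail."""
    return scan(files) - baseline
```

A fixture key that was reviewed once stays in the baseline and never breaks the build again; the same key copied into a different file produces a different fingerprint and fails.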
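There is a second layer that almost everyone misses: secrets baked into images. COPY .npmrc . followed later by RUN rm .npmrc leaves the file in the immutable layer below; rm only hides it in the top view. Passing the token as a build ARG and doing everything in one RUN (echo $TOKEN > .npmrc && ... && rm .npmrc) keeps the file out of the layer, but the ARG value is recorded in the image history. docker history --no-trunc shows the latter, and docker save | strings | grep token digs the former out of any image pulled from the registry. The fix is BuildKit secrets: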
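# Mount the secret at npm's config path for this RUN only (assumes the build runs as root).
# Copying it out of /run/secrets would write it into the layer again.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci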
And in the pipeline — trivy image --scanners secret on top of git-level gitleaks. Two different scanners, two different attack surfaces.
Stage 3–4: dependency and container are not the same thing
trivy fs scans package-lock.json, go.mod, requirements.txt: what lives in the code. trivy image scans the built image, which includes everything the base layer dragged in (apt-get install) plus whatever pip install added on top. A CVE in the base image's libc won't show up under trivy fs. A CVE in a dev dependency that never ships into the final image won't show up under trivy image.
The CI canon:
trivy image --severity HIGH,CRITICAL --exit-code 1 --ignore-unfixed $IMAGE
trivy fs --severity HIGH,CRITICAL --exit-code 1 .
trivy config --severity HIGH,CRITICAL --exit-code 1 ./k8s/
--ignore-unfixed is pragmatic: while upstream hasn't shipped a fix, blocking the build buys nothing. Anything else you decide not to fix ends up in .trivyignore, and that file is technical debt: every suppressed CVE needs a why comment and a JIRA ticket. A year later, without that discipline, it's a landfill.
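That discipline can be enforced mechanically. The sketch below lints a .trivyignore under one assumed convention, which is mine and not Trivy's: each entry must be directly preceded by a comment of the form "# reason (TICKET-123)". The function name and the ticket pattern are hypothetical.

```python
import re

# A comment counts as a justification if it has some reason text
# followed by a ticket key in parentheses, e.g. "(SEC-412)".
JUSTIFIED = re.compile(r"^\S.*\s\([A-Z][A-Z0-9]*-\d+\)$")


def unjustified_entries(trivyignore_text: str) -> list[str]:
    """Return the IDs of entries that lack a reason-and-ticket comment above them."""
    bad = []
    last_comment = ""
    for raw in trivyignore_text.splitlines():
        line = raw.strip()
        if not line:
            last_comment = ""            # a blank line detaches the comment
        elif line.startswith("#"):
            last_comment = line.lstrip("#").strip()
        else:
            if not JUSTIFIED.match(last_comment):
                bad.append(line.split()[0])
            last_comment = ""            # one comment justifies one entry
    return bad
```

Run in the PR check, a non-empty result fails the build, so a suppression cannot be merged without a stated reason and a ticket to revisit it.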
Stage 5 + admission: the only pattern that holds
Stage 5 in CI is trivy config / Checkov / Conftest on manifests and on the JSON form of the Terraform plan (terraform show -json). It catches what's in the repository. It does not catch what was deployed around CI, or an image whose tag was overwritten: myapp:v1.2.3 in the registry could have been rebuilt today with no scans at all.
So the second gate runs at admission:
- Kyverno verifyImages + cosign: the image is admitted only if it was signed by CI after passing scans. A build with no signature → admission deny.
- OPA Gatekeeper / registry whitelist: pulls only from trusted registries (ECR, Harbor, private GHCR). An image from a public docker.io/random/x is rejected.
- ValidatingAdmissionPolicy (CEL): built into the API server, GA since K8s 1.30, no webhook needed. Good for simple policies: no :latest, no privileged, required labels. Complex things (image verification, multi-resource generation) stay on Kyverno.
Duplication? Deliberate. CI scan catches CVEs at build time; admission verify catches deploys of images that bypassed CI. Two layers, each covering the other's failure mode.
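The decision the admission layer makes can be modeled in a few lines. This is a model of the logic only, not a Kyverno, Gatekeeper or CEL policy: the registry prefixes are placeholders, admit is an invented name, and the set of signed digests stands in for what cosign verification would establish.

```python
# Hypothetical allowlist; a real one would name your ECR, Harbor or GHCR paths.
TRUSTED_REGISTRIES = ("registry.example.internal/", "ghcr.io/example-org/")


def admit(image: str, digest: str, signed_digests: set[str]) -> tuple[bool, str]:
    """Decide whether an image reference, resolved to `digest`, may run."""
    if not image.startswith(TRUSTED_REGISTRIES):
        return False, "registry not in allowlist"
    # The tag is whatever follows ":" in the last path segment, so a
    # registry port such as localhost:5000 is not mistaken for a tag.
    tag = image.rsplit("/", 1)[-1].partition(":")[2]
    if tag in ("", "latest"):
        return False, "missing or :latest tag"
    # Signatures bind to digests, not tags: a rebuilt image under the
    # same tag has a new digest and therefore no CI signature.
    if digest not in signed_digests:
        return False, "digest not signed by CI"
    return True, "admitted"
```

The third check is the one CI cannot provide: an overwritten myapp:v1.2.3 passes the registry and tag rules and is still denied, because its digest was never signed.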
IaC: a pre-apply gate as the last barrier
Terraform is a category of its own: a single terraform apply creates dozens of resources at once. The pre-apply gate:
terraform plan -out plan.binary
terraform show -json plan.binary > plan.json
conftest test plan.json --policy ./policies/ # OPA on the plan, not the code
Rego rules: deny cidr_blocks = ["0.0.0.0/0"] on 22/3389; deny Action: "*" in IAM; deny s3_bucket without server_side_encryption_configuration. Cheaper to fail the PR than to apply and revert.
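In the pipeline these rules would be Rego evaluated by conftest. To show what they actually inspect, here is the same logic as a Python sketch over plan.json. It relies on the resource_changes and change.after fields of Terraform's JSON plan output, assumes the AWS provider's aws_security_group and aws_iam_policy resource types, and covers only the first two rules; deny_reasons and ADMIN_PORTS are names invented for the example.

```python
import json

ADMIN_PORTS = (22, 3389)  # SSH and RDP


def deny_reasons(plan: dict) -> list[str]:
    """Return one message per policy violation found in a parsed plan.json."""
    reasons = []
    for rc in plan.get("resource_changes", []):
        address = rc.get("address", "?")
        # "after" is null for resources that are being destroyed.
        after = (rc.get("change") or {}).get("after") or {}

        if rc.get("type") == "aws_security_group":
            for rule in after.get("ingress") or []:
                world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
                lo, hi = rule.get("from_port"), rule.get("to_port")
                all_ports = str(rule.get("protocol")) == "-1"
                in_range = (
                    lo is not None
                    and hi is not None
                    and any(lo <= p <= hi for p in ADMIN_PORTS)
                )
                if world and (all_ports or in_range):
                    reasons.append(f"{address}: 0.0.0.0/0 can reach port 22 or 3389")

        if rc.get("type") == "aws_iam_policy":
            # The policy document is a JSON string; it is absent when its
            # value is not known until apply.
            doc = json.loads(after.get("policy") or "{}")
            statements = doc.get("Statement", [])
            if isinstance(statements, dict):
                statements = [statements]
            for st in statements:
                actions = st.get("Action", [])
                if isinstance(actions, str):
                    actions = [actions]
                if st.get("Effect") == "Allow" and "*" in actions:
                    reasons.append(f'{address}: IAM statement allows Action "*"')
    return reasons
```

A CI step would load plan.json, call deny_reasons, print the messages and exit 1 if the list is non-empty, which is the same contract conftest gives you.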
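A structural safety net underneath all of that: an account-level IAM Deny (an SCP, for example) on changes to resources tagged ManagedBy=Terraform by any principal other than the pipeline's role. An engineer in the console cannot change an IaC-managed resource. For the services that support tag-based conditions, drift moves from possible to impossible.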
What people forget to wire up
- exit-code 1 everywhere. "Soft-fail" in CI means "no CI". A Sonar quality gate in failure mode (not warning) is mandatory.
- .trivyignore reviewed in PR. Suppression without justification is the cheapest way to disarm the whole pipeline.
- Trivy's --skip-db-update in CI is an antipattern. The CVE DB is rebuilt several times a day; yesterday's is a different database.
- Audit log on admission deny. Without it, the admission gate works silently and nobody on the team sees that Kyverno blocked three deploys yesterday.
The real shift is to stop thinking of DevSecOps as "configure the scanner in CI". It is two layers of policy applied to the same risks: shift-left in CI plus admission-time enforcement in the cluster. One without the other is either a green-checkmark ritual or an open kubeconfig for any image that walked around the pipeline.