Engineering notes
Platform Engineering · notes
Original deep-dives on platforms, Kubernetes, cloud, SRE, security and FinOps.
-
Multi-cluster ArgoCD: hub-and-spoke vs per-cluster and where both need pull-mode
Hub-and-spoke is the default, but it breaks against blast radius, inbound networking and compliance. When to pick per-cluster local, when pull-mode, and what a hybrid without ideology looks like.
-
Policy as Code in the AI-code era: the speed-versus-control paradox
Production-code velocity now outruns human review. Without deterministic guardrails at CI and admission, an AI agent turns into an amplifier of mistakes — not throughput.
-
The golden path as a product: why the thinnest viable platform beats the five-year build
Build the platform thin — one path at a time — and standardize the process before you automate it. Otherwise you ship a monument no one uses.
-
Internal Developer Platform: Six Layers and Three Adoption Drivers
An IDP isn't "yet another portal" — it's a product for developers. Why platform engineering took off now, the six layers a working Kubernetes platform is built from, and the four mistakes that kill a portal in six months.
-
Kubernetes 1.36 (Haru): What Actually Changes in Production
Mutating webhooks have started dying, Ingress NGINX is retired, HPA scale-to-zero is still alpha. A pragmatic 1.36 triage for the platform team — without the blogosphere hype.
-
A Kubernetes Debugging Agent: Query Templates or Scripts?
Zinchenko hands the LLM MetricsQL templates; my VM skill feeds the agent a finished aggregate. I dissect the flexibility-versus-reproducibility axis and why read-only by denylist is weaker than an allowlist.
-
OIDC → AWS STS: CI/CD Without Long-Lived Keys
Federated identity replaces AWS_ACCESS_KEY_ID in CI/CD: one pattern for GitHub, GitLab and Atlantis — no rotation, real CloudTrail attribution.
-
Four golden signals: what they actually catch and why the stack is VictoriaMetrics + Loki
What each of the four signals really catches, and three traps where "we have monitoring" turns out to be green checkmarks above a broken service.
-
Error Budget as a Stop Button: SLOs Without Panic
Error budget turns reliability into a resource you can spend — and multi-burn-rate alerts turn it into a page that's actually worth waking up for.
-
DevSecOps in five stages: from secret-scan to admission policy
Five CI stages that fail with exit code 1, plus a cluster-side admission gate: the only pattern under which DevSecOps actually blocks production instead of running as a green-checkmark ritual.
-
Three levers against Kubernetes overspend: idle, scale-to-zero, right-sizing
The average cluster runs at 20–30% utilization. You pay for the rest. Three levers that recover the money without rewriting your architecture.
-
FinOps Is Rationality, Not Cost-Cutting
Cloud overspend isn’t negligence — it’s the sum of locally rational decisions. Change the information environment, not the people.
-
AI rewrites DevOps: from syntax to judgment
What AI already does better than an engineer, where the human’s value remains, and how the senior DevOps role shifts by 2026.
-
A Local LLM for Coding: Not "Which" but "Where"
A review of the practical method for picking a local coding model by hardware, and of where local models have a real niche while only the cloud holds the line on precision.
-
The 2026 Terminal Stack Through a Linux User's Eyes: What's Portable, What's MacBook Optics
The "ultimate 2026 stack" guide is written for macOS. I review it from a Linux/Wayland chair: Ghostty versus Konsole, a strong backend, and the gaps you must fill.
-
Agentic AI Security: Porting Enterprise Patterns Down to a Solo Harness
Biswas's architecture handles the identity layer well via OAuth and token exchange — but the real danger lives on the intent layer, which the article never reaches.
-
Saving Tokens in Agents: What the Survey Gets Right and What It Over-sells
A map of four token-cost techniques — caching, lazy tools, routing, compaction. It buries the cache's real consequence in a footnote, while two counter-intuitive facts are worth the whole map.
-
The Short Memory File Collapses at Scale: Tokens Versus Rule Precedence
Keeping the memory file tiny is right for a solo repository. But compliance rules from several organizations cannot be pinned to paths — they must be routed.
-
An Obsidian Second Brain Is a Layer, Not the Agent's Whole Memory
Monteiro's zoned wiki is the right mechanism, but agent memory is a taxonomy of layers; and any layer that ingests the outside world must be scanned for injection before synthesis.
-
CLAUDE.md: Contract or Layers? Where the Viral File Is Right and Where It Stops
The viral CLAUDE.md is right for the median: start flat. But compliance cannot be asked for — it must be mechanized. On what to write in the file and what to lift into the harness.
-
The Harness Is the Cheap Part: Notes on Rebuilding Claude Code
A commentary on Fareed Khan's from-scratch rebuild of Claude Code — and why its central conclusion should be flipped.
-
Claude Code Is a Skill: Notes on Leo Godin's Argument
Agreeing with the "LLMs are a skill" thesis and adding the experience the original left out.
-
Should you still learn to code in 2026 — notes on Marina Wyss
A walk-through of her Medium piece: where she is right, where she over-simplifies, and what I would add.