Kubernetes 1.35 introduces an alpha ‘Restart All Containers’ capability that makes a whole-Pod refresh a first-class operation. Here’s where it helps, where it can hurt, and how to roll it out safely.
Kubernetes keeps expanding its surface area—CRDs, admission policies, Gateway API, and now inference-focused extensions. SIG Architecture’s API Governance work is the quiet mechanism that keeps innovation moving without breaking users. Here’s what ‘API governance’ means in practice, and how platform teams can adopt the same discipline internally.
EKS Capabilities package Argo CD, AWS Controllers for Kubernetes (ACK), and Kube Resource Orchestrator (kro) as managed, Kubernetes-native building blocks. Here’s what changes when platform teams can compose AWS resources and Kubernetes resources behind custom APIs — without running the controllers themselves.
AWS is packaging common platform components (GitOps and infrastructure orchestration) as managed, Kubernetes-native ‘capabilities’ for Amazon EKS. Here’s what it changes for day-2 ops, how it compares to rolling your own controllers, and what to watch before you standardize on it.
Harbor is easy to install, hard to productionize. Here’s a practical checklist for HA, storage, signing/scanning, and day-2 ops when Harbor becomes your cluster’s artifact backbone.
Kubernetes v1.35 continues a trend: clusters are increasingly asked to run mixed AI workloads (training, batch, and latency-sensitive inference) alongside traditional services. Here’s what’s new that matters for platform teams—especially around scheduling, resizing, and safer config workflows.
AWS published a reference controller that connects Amazon Application Recovery Controller (ARC) zonal shifts to Karpenter node pools. Here’s what the integration changes operationally, how it works under the hood, and how to adopt it safely in production EKS.
AWS shows how to wire Amazon Application Recovery Controller’s zonal shift signals into Karpenter so clusters stop provisioning into a degraded AZ. Here’s why it matters, how it works, and what platform teams should standardize.
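The controller’s core decision can be sketched as pure logic: while a zonal shift is active, exclude the shifted-away-from AZ from the zones a node pool may provision into. The function and field names below are illustrative, not the controller’s actual API, and a real implementation would patch Karpenter NodePool requirements instead of returning a list:

```python
def allowed_zones(nodepool_zones, active_zonal_shifts):
    """Return the zones a node pool may provision into, excluding any AZ
    covered by an active ARC zonal shift. Falls back to the full zone list
    if filtering would leave nothing, so provisioning never deadlocks.
    (Field names like 'away_from' are illustrative, not the ARC API.)"""
    shifted = {s["away_from"] for s in active_zonal_shifts
               if s.get("status") == "ACTIVE"}
    remaining = [z for z in nodepool_zones if z not in shifted]
    return remaining or list(nodepool_zones)
```

For example, with a shift away from us-east-1b, a pool spanning three zones would be narrowed to us-east-1a and us-east-1c until the shift expires.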
Helm v4.1.1 is a patch release, but it’s a good excuse to revisit how chart supply chains, plugin sprawl, and CI-driven upgrades actually break production. Here’s a pragmatic operator playbook.
Kubernetes’ new Node Readiness Controller proposes a more realistic model for node health—one that reflects the dependencies modern clusters rely on. Here’s what it is, why it matters, and how to plan adoption without breaking workloads.
Kubernetes v1.35 is a reminder that runtimes are part of the platform contract: it’s the last Kubernetes release to support containerd v1.x. Here’s a pragmatic, low-drama way to plan the move to containerd 2.0+ without turning node upgrades into incident response.
Kubernetes shipped same-day patch releases across four supported branches plus a new v1.36.0 alpha. Here’s how to turn ‘release day’ into a repeatable upgrade workflow: risk triage, conformance gates, and rollback-ready rollouts.
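One cheap, automatable gate in that workflow is enforcing Kubernetes’ documented version-skew policy before an upgrade proceeds: kubelets may run up to three minor versions behind the API server (since v1.28), and never ahead of it. A minimal sketch, with an illustrative function name:

```python
def kubelet_skew_ok(apiserver_minor, kubelet_minor, max_skew=3):
    """Check Kubernetes' kubelet version-skew policy: a kubelet may be at
    most max_skew minor versions older than the API server (3 since v1.28)
    and must never be newer than it."""
    return 0 <= apiserver_minor - kubelet_minor <= max_skew
```

Running this across node groups before bumping the control plane turns a release-day surprise into a pre-flight failure.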
Kubernetes’ Node Ready condition is a blunt instrument. The new Node Readiness Controller adds declarative, taint-based readiness gates so nodes only enter the scheduling pool when platform-specific dependencies (CNI, storage, GPU drivers, local agents) are truly healthy.
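The gating idea can be sketched as a pure function: a node carries one taint per declared dependency and sheds each taint only as that dependency’s check reports healthy; the node joins the scheduling pool when no taints remain. The taint-key prefix and function name below are hypothetical, not the controller’s real API:

```python
# Hypothetical taint-key prefix; the real controller defines its own keys.
READINESS_TAINT_PREFIX = "readiness.node.example.com/"

def remaining_readiness_taints(declared_gates, healthy_checks):
    """Given the readiness gates declared for a node (e.g. cni, csi,
    gpu-driver) and the set of checks currently reporting healthy,
    return the taints that must stay on the node. The node becomes
    schedulable only when this list is empty."""
    return [READINESS_TAINT_PREFIX + gate
            for gate in declared_gates
            if gate not in healthy_checks]
```

So a GPU node whose CNI and CSI agents are up but whose driver install is still running stays out of the pool with exactly one remaining taint, rather than flipping a single all-or-nothing Ready bit.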
Kubernetes SIG Network is retiring the ubiquitous Ingress NGINX controller in March 2026. Here’s how to inventory impact, choose a replacement, and migrate safely—ideally to Gateway API—without breaking traffic.
Kubernetes’ new Node Readiness Controller proposes a more nuanced readiness model that reflects real dependency chains (network, storage, security agents). Here’s what it changes and how platform teams can operationalize it.
ingress-nginx is heading into retirement in 2026. Here’s a practical, low-drama playbook to inventory your current usage, choose a target (Ingress controller vs Gateway API), and migrate with controlled risk.
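The inventory step can start as a few lines over exported manifests: find every Ingress bound to ingress-nginx, whether via `spec.ingressClassName` or the legacy `kubernetes.io/ingress.class` annotation. A minimal sketch, assuming manifests are loaded as dicts:

```python
def nginx_ingresses(ingresses):
    """Filter a list of Ingress manifests (as dicts) down to the names of
    those bound to the retiring ingress-nginx controller, matching either
    spec.ingressClassName or the legacy kubernetes.io/ingress.class
    annotation."""
    hits = []
    for ing in ingresses:
        cls = ing.get("spec", {}).get("ingressClassName")
        ann = (ing.get("metadata", {})
                  .get("annotations", {})
                  .get("kubernetes.io/ingress.class"))
        if cls == "nginx" or ann == "nginx":
            hits.append(ing["metadata"]["name"])
    return hits
```

Feed it the output of `kubectl get ingress -A -o json` and you have the migration backlog; anything matched only by the legacy annotation is a sign the manifest predates IngressClass and deserves extra scrutiny.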
Kubernetes has long treated node readiness as a single binary signal, but modern nodes depend on a stack of agents (CNI, CSI, GPU, security) that fail independently. The new Node Readiness Controller introduces a more expressive model—here’s what it changes, how to adopt it, and what to watch for in your SLOs.
Multiple fresh ingress-nginx CVEs are forcing teams to re-check a long-assumed ‘safe default’: the ingress controller. Here’s what the advisory says, what’s exploitable in real deployments, and a pragmatic patch + mitigation plan you can execute today.
A new ingress-nginx advisory discloses multiple CVEs. Here’s how to triage impact, patch safely, and reduce blast radius with practical hardening steps.
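Triage usually reduces to one comparison per cluster: is the installed controller older than the first fixed release named in the advisory? A minimal sketch; the fixed-version numbers must come from the advisory itself, not from this example:

```python
def vulnerable(installed, first_fixed):
    """True if the installed dotted version is older than the first fixed
    release. Version strings here are placeholders; take the real fixed
    versions from the advisory."""
    parse = lambda v: tuple(int(p) for p in v.lstrip("v").split("."))
    return parse(installed) < parse(first_fixed)
```

Running this against every cluster’s reported controller image tag gives a patch worklist in minutes, before any deeper exploitability analysis.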
A new Node Readiness Controller proposal reframes node health as a set of dependency-aware readiness signals—making scheduling and remediation more precise than the classic Ready/NotReady binary.