Kubernetes Dashboard Retires, etcd Optimization at Scale, and the Platform’s Expanding Workload Frontier

The End of an Era: Kubernetes Dashboard Steps Aside for Headlamp

For many practitioners, Kubernetes Dashboard was their first visual window into the world of container orchestration. It offered a simple, approachable way to inspect pods, deployments, and services without memorizing kubectl commands. For years, it served as the de facto onramp for developers, students, and operators who wanted to understand what was running in their clusters.

Now the project has been archived. The Kubernetes community is not abandoning the idea of a visual interface — it is handing the baton to Headlamp, a modern UI that builds on Dashboard’s legacy while addressing the realities of multi-cluster, application-centric operations.

The transition is designed to feel familiar. Headlamp preserves the core workflows that Dashboard users know: browsing workloads, editing manifests, scaling deployments, and inspecting resources. All actions respect standard Kubernetes RBAC, so existing access controls carry over without change. Where Headlamp diverges is in its scope. It is built for teams managing multiple clusters, not just one. The multi-cluster view lets operators move between development, staging, and production environments without switching tools or losing context.

Headlamp also introduces Projects, an application-centered way to view Kubernetes. Instead of jumping between resource lists, teams can group related workloads, services, and configurations in one place. Projects are built on native Kubernetes concepts — namespaces, labels, and RBAC — so nothing fundamentally changes under the hood. The UI simply adds a visual layer that brings related resources together.
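To make the idea concrete, here is a minimal sketch of grouping resources into application-level views using only namespaces and labels. It is not Headlamp's implementation: the choice of the well-known `app.kubernetes.io/part-of` label as the grouping key is an assumption, and the sample resource names are hypothetical.

```python
from collections import defaultdict

# Assumption: resources are tagged with the recommended Kubernetes label
# "app.kubernetes.io/part-of" naming the application they belong to.
PROJECT_LABEL = "app.kubernetes.io/part-of"


def group_by_project(resources):
    """Group manifests (plain dicts) by (namespace, project label value)."""
    projects = defaultdict(list)
    for res in resources:
        meta = res.get("metadata", {})
        labels = meta.get("labels") or {}
        project = labels.get(PROJECT_LABEL)
        if project is None:
            continue  # unlabeled resources are left out of every project
        key = (meta.get("namespace", "default"), project)
        projects[key].append(f'{res["kind"]}/{meta["name"]}')
    return dict(projects)


# Hypothetical sample data for illustration only.
resources = [
    {"kind": "Deployment",
     "metadata": {"name": "api", "namespace": "shop",
                  "labels": {PROJECT_LABEL: "checkout"}}},
    {"kind": "Service",
     "metadata": {"name": "api", "namespace": "shop",
                  "labels": {PROJECT_LABEL: "checkout"}}},
    {"kind": "ConfigMap",
     "metadata": {"name": "misc", "namespace": "shop"}},
]

print(group_by_project(resources))
```

Because the grouping relies only on metadata that already exists on the objects, nothing in the cluster has to change for such a view to work.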

Extensibility is another major differentiator. Headlamp supports plugins that bring common workflows directly into the interface. The Flux plugin, for example, surfaces GitOps state alongside the Kubernetes resources that Flux manages. An AI Assistant adds a conversational layer for troubleshooting and understanding cluster state. Platform teams can also build custom plugins to integrate internal tooling without fragmenting the user experience.

Headlamp runs both in-cluster and as a desktop application, and the two options are not mutually exclusive. Many teams use the desktop app for day-to-day work while relying on an in-cluster deployment for shared production environments. The migration guide on the Kubernetes blog is straightforward, and a step-by-step walkthrough is coming soon.

Enterprise etcd: The Hidden Bottleneck at Massive Scale

While the community debates UIs, a quieter but more critical challenge is playing out in the engine room of large Kubernetes environments: etcd performance under extreme load.

At the OpenShift Commons Gathering in Amsterdam — held as a Day Zero event for KubeCon + CloudNativeCon Europe 2026 — Emirhan Bilge Bulut, an Expert System Engineer at Garanti BBVA, shared the scale of the challenge. The Turkish bank operates 60 Red Hat OpenShift clusters, including 33 production environments, supporting 30 million customers and processing up to 2 billion transactions per day during peak times. Across all clusters, the bank runs over 30,000 pods and 3,100 services.

At this scale, etcd became the chokepoint. The team observed uncontrolled database size growth, particularly in non-production environments where developer activity is high. A single cluster might house 40,000 pods and 10,000 microservices. Any etcd performance degradation led to high API latency, which cascaded into systemwide reconciliation backlogs and pod scheduling delays.

Standard optimizations like automated defragmentation and history compaction helped temporarily, but the root causes persisted. The team identified three major contributors:

  • Unrestricted revision history: Deployment objects lacked revision limits, causing historical data to accumulate. Configuring a limit of one previous revision stopped the pile-up.
  • Secret proliferation: Over 20,000 unnecessary secrets were discovered in etcd. Using capabilities introduced in OpenShift 4.11, the team cleaned up legacy service account token secrets that no longer needed to be stored.
  • Duplicated ConfigMaps: CI/CD pipelines were creating redundant configurations across namespaces. Consolidating these into common ConfigMaps significantly reduced the data footprint.
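The first fix in the list above can be sketched as a small manifest check. This is an illustration, not the bank's tooling: it assumes Deployment manifests are available as plain dictionaries, and the Deployment name is hypothetical. The field `spec.revisionHistoryLimit` and its default of 10 come from the Kubernetes Deployment API.

```python
import copy

# Kubernetes retains 10 old ReplicaSets per Deployment when
# spec.revisionHistoryLimit is not set.
DEFAULT_REVISION_HISTORY_LIMIT = 10


def effective_limit(manifest):
    """Return the revision history limit that applies to a manifest."""
    limit = manifest.get("spec", {}).get("revisionHistoryLimit")
    return DEFAULT_REVISION_HISTORY_LIMIT if limit is None else limit


def cap_revision_history(manifest, max_history=1):
    """Return a copy of a Deployment manifest with its history capped.

    Non-Deployment objects and Deployments already within the cap are
    returned unchanged.
    """
    if manifest.get("kind") != "Deployment":
        return manifest
    if effective_limit(manifest) <= max_history:
        return manifest
    patched = copy.deepcopy(manifest)
    patched.setdefault("spec", {})["revisionHistoryLimit"] = max_history
    return patched


# Hypothetical Deployment with no explicit limit.
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "payments-api"},
    "spec": {"replicas": 3},
}
patched = cap_revision_history(deployment)
print(effective_limit(deployment), "->", effective_limit(patched))
```

Returning a patched copy rather than mutating the input keeps the check safe to run as a dry audit before anything is applied to a cluster.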

When existing open source tools proved too resource-intensive or limited in scanning custom resource definitions, Garanti BBVA built its own lightweight cleanup tool. It interacts directly with the OpenShift REST API rather than standard Python libraries, cutting processing time from 30 minutes to 4 minutes. The tool also includes a unique estimation mechanism that decodes etcd data to calculate exact space reclamation before any deletion occurs.
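The "estimate before deleting" idea can be illustrated with a rough sketch that finds ConfigMaps with identical data and totals the bytes the redundant copies occupy. This is not the bank's tool: the source says the real mechanism decodes etcd data for an exact figure, whereas this sketch uses the JSON-serialized size as a stand-in, which only approximates what etcd stores. All names and sample objects are hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def content_digest(configmap):
    """Hash a ConfigMap's data so identical contents share one digest."""
    payload = json.dumps(configmap.get("data", {}), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def estimate_reclaimable(configmaps):
    """Return (duplicates, bytes) without deleting anything.

    The first ConfigMap seen with a given content is treated as the one
    to keep; later copies are reported as candidates for review.
    """
    groups = defaultdict(list)
    for cm in configmaps:
        groups[content_digest(cm)].append(cm)

    duplicates = []
    reclaimable = 0
    for group in groups.values():
        for extra in group[1:]:
            meta = extra["metadata"]
            duplicates.append((meta["namespace"], meta["name"]))
            # Approximation: serialized JSON size, not the etcd on-disk size.
            reclaimable += len(json.dumps(extra, sort_keys=True).encode("utf-8"))
    return duplicates, reclaimable


# Hypothetical sample data.
configmaps = [
    {"metadata": {"namespace": "team-a", "name": "common"},
     "data": {"LOG_LEVEL": "info"}},
    {"metadata": {"namespace": "team-b", "name": "common"},
     "data": {"LOG_LEVEL": "info"}},
    {"metadata": {"namespace": "team-c", "name": "common"},
     "data": {"LOG_LEVEL": "debug"}},
]

dupes, size = estimate_reclaimable(configmaps)
print(dupes, size)
```

Reporting candidates and an estimate first, and deleting only afterwards, lets operators judge whether a cleanup run is worth the risk.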

The results were substantial: 1.5 to 2 GB of free space reclaimed in non-production environments, with the process now fully automated and repeatable. As Bulut noted, “In large systems, problems are not always solved by adding more resources. They are solved by understanding the system deeply.”

The timing is notable. etcd v3.8.0-alpha.0 and v3.7.0-rc.0 were both tagged in early June, signaling continued investment in the datastore that underpins every Kubernetes cluster. For operators managing large fleets, the Garanti BBVA case study is a reminder that etcd tuning is not a one-time task — it is an ongoing discipline.

Data Workloads on Kubernetes: EKS Gets Serious About OLAP

Kubernetes was originally built for stateless microservices. That assumption is changing fast. In late May, the Amazon WW Stores FinTech team published a detailed architecture for running StarRocks — an open-source OLAP engine — on Amazon EKS, using KEDA and Karpenter for elastic scaling.

The motivation was clear: financial analytics at enterprise scale demands sub-second to single-digit-second query responses across terabytes of data, while supporting hundreds of concurrent users during peak business cycles. Standard systems could satisfy one or two of these requirements (low latency, terabyte-scale data, high concurrency), but not all three simultaneously.

The architecture they built is instructive. StarRocks runs on EKS with KEDA handling event-driven autoscaling based on actual query load, and Karpenter provisioning nodes dynamically to match resource demands. The team partnered with the Data on EKS initiative on reference blueprints that other organizations can adopt.
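For intuition about the scaling loop, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, which is what KEDA drives with the metrics it collects. The source does not say which metric the team scales on; "queued queries per pod" and all the numbers here are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=20, tolerance=0.1):
    """Compute the HPA-style desired replica count.

    current_metric is the observed per-pod average; target_metric is the
    per-pod value the autoscaler tries to hold.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical: 4 pods, each averaging 30 queued queries, target 10 per pod.
print(desired_replicas(4, 30, 10))
```

When the new replicas cannot be scheduled on existing capacity, they sit in Pending, and that is the signal a node provisioner such as Karpenter reacts to by adding nodes.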

This is not a niche use case. It represents a broader shift in how Kubernetes is being used. The platform is no longer just for serving APIs and running CI/CD pipelines. It is becoming a legitimate host for data-intensive workloads that previously required dedicated infrastructure.

Runtime Security: containerd Patches CVE-2026-46680

Security in the container runtime layer remains a critical concern. On June 2, the containerd project released version 2.1.8, which includes a patch for CVE-2026-46680. The release also fixes handling of out-of-range USER values in OCI specs to prevent unexpected username and group lookups, resolves sandbox service bugs affecting creation configuration and event publishing, and adds conditional AppArmor ABI support for versions older than 3.0.
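The class of bug behind the USER fix can be illustrated with a small parser. This is not containerd's code, and the source does not describe the patch itself. The sketch only shows the general technique of range-checking a numeric user or group field and rejecting it, rather than letting an out-of-range number fall through to a name lookup. The uint32 upper bound is an assumption.

```python
import re

# Assumption: IDs must fit in an unsigned 32-bit integer.
MAX_ID = 2**32 - 1
_NUMERIC = re.compile(r"[+-]?[0-9]+")


def classify(field):
    """Return ("id", int) for numeric fields and ("name", str) otherwise.

    Numeric fields outside 0..MAX_ID raise ValueError instead of being
    reinterpreted as a user or group name.
    """
    if _NUMERIC.fullmatch(field):
        value = int(field)
        if not 0 <= value <= MAX_ID:
            raise ValueError(f"ID out of range: {field}")
        return ("id", value)
    return ("name", field)


def parse_user(user):
    """Split a "user[:group]" string and classify each part."""
    user_part, _, group_part = user.partition(":")
    return {
        "user": classify(user_part) if user_part else None,
        "group": classify(group_part) if group_part else None,
    }


print(parse_user("1000:1000"))
print(parse_user("nginx"))
```

Failing closed on a malformed numeric value avoids consulting the image's passwd or group files for a string that was never meant to be a name.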

For operators, the message is simple: patch your runtimes. The containerd 2.1 release line is actively maintained, and security patches are being backported regularly. Runtime security is not glamorous, but it is the foundation everything else rests on.

Tooling Updates: Helm v4.2.0 and the Kubernetes 1.36 Client

Helm, the package manager for Kubernetes, shipped v4.2.0 in mid-May. The release bumps Kubernetes client libraries to v1.36, adds a new mustToToml template function, and switches release builds to goreleaser. Two flags, --hide-notes and --render-subchart-notes, have been deprecated. A notable fix: --dry-run=server now respects generateName, closing a long-standing inconsistency in how Helm handles generated names during server-side dry runs.

The release is available across 12 platform targets, including riscv64 and loong64, reflecting the expanding hardware landscape that Kubernetes tooling must support.

Google Cloud’s Answer: GKE Standby Buffers and AI-Optimized Storage

While AWS doubles down on data workloads, Google Cloud is attacking a different pain point: node startup latency. The company introduced GKE standby buffers, a feature designed to improve autoscaling speed without inflating cloud spend. The idea is to keep a pool of warmed nodes ready to accept workloads, reducing the cold-start delay that plagues bursty applications.

Google also expanded its Cloud Storage FUSE profiles for GKE, adding pre-configured settings optimized for AI and ML workloads. The new profiles remove the guesswork from configuring storage for training pipelines and inference jobs. Complementing this, the GKE Inference Gateway now supports running both real-time and asynchronous inference on the same infrastructure, simplifying operations for teams serving multiple model types.

Another notable addition is Dynamic Resource Allocation (DRA), a new Kubernetes device management approach that Google is championing. DRA moves beyond the static device plugin model, allowing more flexible allocation of GPUs, TPUs, and other accelerators based on workload requirements rather than pre-defined resource types.
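The difference from count-based device plugins can be sketched as attribute matching: a claim describes what it needs, and the allocator picks devices whose attributes satisfy it. In real DRA, selectors are written as CEL expressions in a ResourceClaim; this sketch substitutes a plain Python predicate, and the device names and attribute keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    attributes: dict
    allocated: bool = False


def allocate(devices, count, selector):
    """Allocate `count` free devices whose attributes satisfy `selector`.

    Returns the allocated device names, or None if the request cannot be
    met in full (no partial allocation).
    """
    candidates = [d for d in devices
                  if not d.allocated and selector(d.attributes)]
    if len(candidates) < count:
        return None
    chosen = candidates[:count]
    for device in chosen:
        device.allocated = True
    return [device.name for device in chosen]


# Hypothetical inventory of accelerators on a node.
devices = [
    Device("gpu-0", {"memory_gib": 40, "model": "a"}),
    Device("gpu-1", {"memory_gib": 80, "model": "b"}),
    Device("gpu-2", {"memory_gib": 80, "model": "b"}),
]

# Claim: one device with at least 80 GiB of memory.
first = allocate(devices, 1, lambda attrs: attrs["memory_gib"] >= 80)
print(first)
```

With a static device plugin, the workload could only ask for a number of an opaque resource type. Here the request states a requirement, and any device that meets it is eligible.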

What This Means for Operators

The Kubernetes ecosystem is maturing in three distinct directions simultaneously:

  • Usability: The Dashboard-to-Headlamp transition shows the community investing in interfaces that match how teams actually work — across clusters, across applications, with extensibility built in.
  • Scale: The Garanti BBVA etcd story is a masterclass in root-cause analysis. It reminds us that the most critical component in a Kubernetes cluster is often the one we pay the least attention to until it breaks.
  • Workload diversity: From StarRocks OLAP on EKS to AI-optimized storage on GKE, Kubernetes is no longer just for web services. The platform is being stretched into data engineering, analytics, and AI inference — and the tooling is adapting.

The common thread is that Kubernetes is becoming a platform for serious infrastructure, not just container orchestration. The teams that thrive will be the ones that treat etcd tuning, runtime patching, and workload-specific optimization as first-class concerns — not afterthoughts.

Sources