May 2026 has been a banner month for the Cloud Native Computing Foundation ecosystem. Three CNCF projects (Kyverno, Microcks, and Fluid) each reached significant milestones that together tell a larger story: cloud native infrastructure is maturing to meet the dual demands of AI-scale workloads and production-grade security. From policy engines patching critical vulnerabilities to data orchestration layers cutting LLM cold starts from 42 minutes to 30 seconds, the CNCF stack is proving it can handle the next generation of compute.
Kyverno 1.18: First Post-Graduation Release Doubles Down on Security
In March 2026, Kyverno reached graduated status within the CNCF, joining the ranks of Kubernetes, Prometheus, and Envoy as a mature, production-ready project. Its first release since that milestone, Kyverno 1.18, was announced on May 5 and makes clear the maintainers are not resting on their laurels. The release focuses heavily on security hardening, CLI expansion, and policy engine reliability.
The headline security improvements address two CVEs that could have allowed policy misuse. CVE-2026-4789 is an SSRF vulnerability in Kyverno’s HTTP CEL library. In 1.18, unsafe addresses such as loopback and metadata services are blocked by default, and HTTP calls from namespaced policies are disabled unless explicitly enabled. CVE-2026-41323 is a token-scoping issue in which HTTP calls included a token that could be used to impersonate Kyverno controllers. The fix introduces scoped tokens, which prevent receiving servers from misusing them.
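To make the SSRF mitigation concrete, here is a minimal Python sketch of a deny-by-default address check. It is not Kyverno's implementation (Kyverno is written in Go, and its actual blocklist is not described in the release notes summarized here). The function name is_unsafe_address and the exact set of blocked ranges are assumptions chosen for illustration.

```python
import ipaddress


def is_unsafe_address(host: str) -> bool:
    """Return True for targets commonly abused in SSRF attacks.

    Hypothetical helper for illustration only; the blocked ranges
    below are an assumption, not Kyverno's actual rule set.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A real check would resolve the name and
        # test every resolved address; here only "localhost" is caught.
        return host.lower() == "localhost"
    # Loopback covers 127.0.0.0/8 and ::1. Link-local covers
    # 169.254.0.0/16, which includes the common cloud metadata
    # address 169.254.169.254.
    return addr.is_loopback or addr.is_link_local or addr.is_unspecified


if __name__ == "__main__":
    for target in ("127.0.0.1", "169.254.169.254", "8.8.8.8"):
        print(target, is_unsafe_address(target))
```

The point of the sketch is the default: a request is refused unless the target falls outside the known-dangerous ranges, which mirrors the "blocked by default" behavior described above.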
Beyond security patches, Kyverno 1.18 brings meaningful operational improvements. Memory-based HPA autoscaling for the admission controller helps the engine scale under load. TLS support on the /metrics endpoint improves observability security. Fine-grained success event filtering via a new successEventActions ConfigMap parameter lets operators tune event volume in large clusters. Image verification also gets better, with namespace-scoped registry credentials and fixes for Notary resolver reliability.
The CLI gains support for cleanup policies, HTTP and Envoy authorization policies, and mutateExisting rules in MutatingPolicy — making it easier to test modern policy types locally and in CI pipelines. Importantly, there are no breaking changes, though ClusterPolicy deprecation remains on track as Kyverno continues its transition toward CEL-based policy types.
Microcks Becomes a CNCF Incubating Project
On May 7, the CNCF Technical Oversight Committee voted to accept Microcks as an incubating project. For a tool that started as a side project in 2015, this is a significant validation of both its technical approach and its growing community.
Microcks is an open source, cloud native platform for API mocking and testing. Its key differentiator is multi-protocol support: it can turn OpenAPI specs, AsyncAPI specs, gRPC/Protobuf definitions, GraphQL schemas, Postman collections, and SOAP/WSDL projects into live mock servers. The same assets then power automated contract conformance tests against real implementations. This unified approach spans both synchronous REST/RPC APIs and event-driven, asynchronous architectures.
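The idea of turning a spec into a mock can be sketched in a few lines of Python. This is not how Microcks is implemented, and it covers only a tiny slice of OpenAPI. It assumes that mock responses come from JSON examples embedded in the spec. The function build_mocks and the sample SPEC are hypothetical.

```python
from typing import Any

# Hypothetical, heavily trimmed OpenAPI-style document.
SPEC: dict[str, Any] = {
    "paths": {
        "/pets/{id}": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": {"id": 1, "name": "Rex"}
                            }
                        }
                    }
                }
            }
        }
    }
}


def build_mocks(spec: dict[str, Any]) -> dict[tuple[str, str], tuple[str, Any]]:
    """Map (METHOD, path) to (status, example body) for each operation."""
    mocks: dict[tuple[str, str], tuple[str, Any]] = {}
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if not isinstance(operation, dict):
                continue  # skip non-operation keys such as "parameters"
            for status, response in operation.get("responses", {}).items():
                media = response.get("content", {}).get("application/json", {})
                if "example" in media:
                    # Keep the first response that carries an example.
                    mocks[(method.upper(), path)] = (status, media["example"])
                    break
    return mocks


if __name__ == "__main__":
    print(build_mocks(SPEC))
```

The same lookup table could in principle drive a conformance check: call the real implementation and compare its response shape against the example the mock would have returned.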
The project’s growth metrics are impressive. Container image downloads exceeded 2.5 million in 2025, triple the 2024 total. More than 34 organizations have publicly adopted Microcks, including BNP Paribas, Société Générale, Deloitte, and Amway. The contributor base has reached 645 people across GitHub, with 167 active contributors in 2025 representing 35 organizations. Development health is strong: the project was active on 342 of the last 365 days, with an average issue resolution time of 11 days and a PR merge lead time of 6 days.
Microcks has deepened its CNCF ecosystem integrations since joining the sandbox in 2023. It now connects with Dapr, OpenTelemetry, Keycloak, AsyncAPI, Kubernetes, Helm, and Tekton. Testcontainers modules for Java, Node.js, Go, Python, and .NET let developers embed Microcks in local test loops.
Looking ahead, the maintainers have signaled interest in intelligent mocking for AI agents and MCP (Model Context Protocol) support — a clear bet that API testing will become even more critical as AI systems increasingly interact via APIs rather than UIs.
NetEase Games + Fluid: Cutting LLM Cold Starts from 42 Minutes to 30 Seconds
Perhaps the most concrete demonstration of cloud native infrastructure meeting AI reality comes from NetEase Games. In a detailed CNCF blog post published May 21, the company’s infrastructure team described how they cut LLM cold-start times by 84x using Fluid, a CNCF incubating project.
The problem NetEase faced is one many organizations running inference on Kubernetes will recognize. Their Tmax AI platform supports the full ML lifecycle on Kubernetes — from notebooks to training to inference. As LLM usage grew across game scenarios (intelligent NPCs, content generation, internal AI services), three problems collided:
- GPU resources were scarce and heterogeneous, making static provisioning inefficient.
- Inference traffic was bursty and non-uniform, with peaks varying by title and time of day.
- Cold starts were dominated by model loading — pulling hundreds of gigabytes of weights for 70B-class models could take tens of minutes, completely erasing the value of autoscaling.
The team tried traditional Alluxio caching, which brought load times down from 42 minutes to 14 minutes. But the breakthrough came with Fluid, which sits between Kubernetes and cache layers like Alluxio to provide Kubernetes-native dataset abstractions and lifecycle management. With Fluid’s prefetching workflows, the same workload dropped to 3 minutes — and with further optimization, NetEase achieved 30-second cold starts.
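The reported numbers can be checked with simple arithmetic. The short Python sketch below converts each stage to seconds and computes the speedup relative to the 42-minute baseline. The stage labels are shorthand for this article, not NetEase's terminology.

```python
# Cold-start durations reported in the case study, in seconds.
STAGES = {
    "baseline": 42 * 60,
    "Alluxio caching": 14 * 60,
    "Fluid prefetching": 3 * 60,
    "further optimization": 30,
}


def speedup(baseline_seconds: float, current_seconds: float) -> float:
    """Return how many times faster the current stage is than the baseline."""
    return baseline_seconds / current_seconds


SPEEDUPS = {
    name: speedup(STAGES["baseline"], seconds)
    for name, seconds in STAGES.items()
}

if __name__ == "__main__":
    for name, factor in SPEEDUPS.items():
        print(f"{name}: {factor:.0f}x")
```

The factors come out to 3x for Alluxio caching, 14x for Fluid prefetching, and 84x for the final 30-second figure, which matches the 84x headline number.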
Fluid’s value proposition is operational abstraction. Instead of managing Alluxio master and worker clusters separately, Fluid automates runtime deployment and lifecycle management. It supports cache elasticity through HPA and KEDA, enables data-aware scheduling to align compute placement with cached data, and provides dataset-level logical isolation with cross-namespace sharing. NetEase also noted that Fluid decouples the dataset abstraction from the runtime layer, allowing them to switch between Alluxio, JindoCache, or JuiceFS without rewriting operational logic.
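Data-aware scheduling is easier to picture with a toy example. The Python sketch below picks, among nodes with enough free GPUs, the one that already holds the largest share of the dataset in its local cache. It is a simplified illustration and not Fluid's scheduler. The Node fields and the pick_node function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_gpus: int
    cached_fraction: float  # share of the dataset cached locally, 0.0 to 1.0


def pick_node(nodes: list[Node], gpus_needed: int) -> Optional[Node]:
    """Prefer the feasible node with the most data already cached."""
    candidates = [n for n in nodes if n.free_gpus >= gpus_needed]
    if not candidates:
        return None  # no node has enough free GPUs
    return max(candidates, key=lambda n: n.cached_fraction)


if __name__ == "__main__":
    cluster = [
        Node("gpu-a", free_gpus=0, cached_fraction=1.0),
        Node("gpu-b", free_gpus=2, cached_fraction=0.2),
        Node("gpu-c", free_gpus=1, cached_fraction=0.9),
    ]
    chosen = pick_node(cluster, gpus_needed=1)
    print(chosen.name if chosen else "no feasible node")
```

In this example gpu-a has the warmest cache but no free GPU, so the pod lands on gpu-c, where most of the weights are already local. A real scheduler would weigh many more signals, but the trade-off is the same: compute placement follows the data.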
The lesson, as the NetEase team put it: “Elastic compute is only useful if data can move just as fast.”
The Bigger Picture: Cloud Native Infrastructure Meets AI Reality
Taken together, these three stories illustrate how the CNCF ecosystem is evolving in response to the AI infrastructure wave.
Kyverno shows that policy and security cannot be afterthoughts when running AI workloads. As models gain the ability to autonomously discover and exploit vulnerabilities — Anthropic’s Mythos model recently demonstrated this by finding zero-day flaws in major operating systems — the traditional “prevent or detect” security model is being questioned. Kyverno’s hardened HTTP execution and scoped token authorization are steps toward a more defensible posture.
Microcks represents the API-layer maturity that AI systems require. As AI agents increasingly interact with services via APIs rather than browser UIs, the need for reliable contract testing and mocking grows. Microcks’ bet on MCP support positions it well for this transition.
Fluid addresses the data-path bottleneck that many AI-on-Kubernetes deployments hit. The NetEase case study shows that Kubernetes-native data orchestration can make the difference between serverless inference being an architectural dream and an operational reality.
All three projects are also connected by a deeper trend: they abstract complexity away from operators. Kyverno handles policy enforcement so teams don’t write custom admission webhooks. Microcks generates mocks from specs so developers don’t hand-roll test servers. Fluid manages dataset lifecycles so platform teams don’t manually coordinate cache warmups. This is the cloud native promise at work — not just running containers, but making infrastructure programmable, observable, and manageable at scale.
Upcoming events will likely accelerate these trends. KubeCon + CloudNativeCon India lands in Mumbai on June 18-19, 2026, where attendees can expect to see more end-user stories like NetEase’s and deeper dives into AI infrastructure patterns. The CNCF TAG Developer Experience also recently published early findings from a survey on AI’s impact on open-source development, suggesting this conversation is just getting started.
Sources
- Announcing Kyverno release 1.18! | CNCF Blog
- Microcks becomes a CNCF incubating project | CNCF Blog
- How NetEase Games achieved 30-second LLM cold starts on Kubernetes | CNCF Blog
- AI sandboxing is having its Kubernetes moment | CNCF Blog
- Kubernetes for platform teams: Leveraging k0s and k0rdent | CNCF Blog
- How NetEase Games cut LLM cold starts from 42 minutes to 30 seconds | The New Stack
- Kyverno Releases | GitHub
- Our billing pipeline was suddenly slow. The culprit was a hidden bottleneck in ClickHouse | Cloudflare Blog
- Aamchi Mumbai: A KubeCon + CloudNativeCon field guide | CNCF Blog
- The state of AI in CNCF projects: A first look at the data | CNCF Blog
