Month: February 2026

vLLM 0.16.0 Raises the Bar for Open-Source Inference Serving

vLLM 0.16.0 lands with async scheduling and pipeline parallelism, a new WebSocket-based Realtime API, speculative decoding improvements, and major platform work, including an overhaul of XPU support. Here’s why those details matter to teams building reliable, cost-efficient inference stacks.
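For orientation, a minimal sketch of how the two serving features headline above are typically switched on from the command line; the model name is a placeholder, and flag spellings may differ between vLLM versions, so check `vllm serve --help` for your install:

```shell
# Sketch: launch vLLM's OpenAI-compatible server with pipeline parallelism
# across 2 GPUs and async scheduling enabled.
# Model name is a placeholder; verify flag names against your vLLM version.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --pipeline-parallel-size 2 \
  --async-scheduling
```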

OpenStack 2026.1 ‘Gazpacho’ Is in Development: How to Plan an Upgrade Path Without Surprises

OpenStack’s 2026.1 release series (‘Gazpacho’) is tracking toward an April 2026 initial release, with SLURP (Skip Level Upgrade Release Process) guarantees shaping how operators should plan rollouts. Here’s what the release series table really tells you, how to map it to your internal maintenance windows, and where the OpenInfra community’s ‘digital sovereignty’ messaging intersects with real operations.
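To make the upgrade-path planning concrete, here is a rough sketch of the SLURP rule as commonly described: ‘.1’ releases are SLURP releases, and a supported direct upgrade is either release-to-next-release or SLURP-to-next-SLURP. This is an illustrative model of the naming convention, not an official tool:

```python
# Sketch of OpenStack's SLURP upgrade rule, assuming releases are named
# "YYYY.1" / "YYYY.2" and that ".1" releases are the SLURP releases.

def is_slurp(release: str) -> bool:
    """A release like '2026.1' is a SLURP release; '2025.2' is not."""
    return release.endswith(".1")

def next_release(release: str) -> str:
    """'2026.1' -> '2026.2', '2026.2' -> '2027.1'."""
    year, minor = release.split(".")
    return f"{year}.2" if minor == "1" else f"{int(year) + 1}.1"

def direct_upgrade_supported(src: str, dst: str) -> bool:
    """True if src -> dst is a supported single upgrade step."""
    if dst == next_release(src):
        return True  # consecutive releases are always upgradable
    # SLURP-to-SLURP: skip exactly one intermediate non-SLURP release
    return (
        is_slurp(src)
        and is_slurp(dst)
        and dst == next_release(next_release(src))
    )
```

Mapping this onto maintenance windows: an operator on 2025.1 can jump straight to Gazpacho (2026.1) and skip 2025.2, but someone on 2025.2 must take 2026.1 before 2026.2.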

Multi-LoRA at Scale: How vLLM + AWS Aim to Stop Paying for Idle GPUs

AWS and the vLLM community describe multi-LoRA serving for Mixture-of-Experts models, with kernel and execution optimizations that let many fine-tuned variants share a single GPU. The pitch: higher utilization, better latency, and a clearer path to serving ‘dozens of models’ without dozens of endpoints.
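The ‘dozens of models, one endpoint’ idea boils down to addressing each LoRA adapter by name on a shared server. A minimal client-side sketch, assuming a vLLM OpenAI-compatible server started with LoRA serving enabled; the adapter name `sql-adapter` is a hypothetical placeholder:

```python
# Sketch: with multi-LoRA serving, many fine-tuned variants share one GPU
# behind one endpoint; a client selects the variant by putting the adapter's
# registered name in the "model" field of an otherwise ordinary request.
# "sql-adapter" below is a hypothetical adapter name, not a real deployment.
import json

def chat_request(adapter_name: str, prompt: str) -> str:
    """Build the JSON body for a chat completion routed to one adapter."""
    return json.dumps({
        "model": adapter_name,  # adapter name, not the base model
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    })

body = chat_request("sql-adapter", "Write a query for top customers.")
```

The design point is that routing is per-request rather than per-endpoint, which is what lets utilization stay high while idle variants cost nothing extra.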