Ollama v0.18.2 Adds Web Search for OpenClaw and Non-Interactive Mode
Ollama now ships with web search/fetch plugins for OpenClaw and introduces headless mode for CI/CD and automation workflows.
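For a sense of what headless use looks like in practice, here is a minimal sketch that drives a local Ollama server from a script rather than the interactive REPL. It uses the long-standing HTTP generate endpoint; the model name and prompt are placeholders, and the sketch does not depend on the new non-interactive mode specifically.

```python
# Minimal sketch: calling a local Ollama server non-interactively,
# e.g. from a CI job. Assumes Ollama is running on its default port
# (11434) and that "llama3.2" (a placeholder) is already pulled.
import json
import sys
import urllib.request

def generate(prompt: str, model: str = "llama3.2") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # single JSON response instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Exit non-zero on failure so a pipeline step can gate on it.
    try:
        print(generate(sys.argv[1] if len(sys.argv) > 1 else "Say hello."))
    except Exception as exc:
        print(f"ollama call failed: {exc}", file=sys.stderr)
        sys.exit(1)
```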
Ollama v0.18.1+ brings web search and fetch plugins to OpenClaw, letting local models access current information without JavaScript execution.
Ollama 0.18 brings official OpenClaw provider support, up to 2x faster Kimi-K2.5 performance, and the new Nemotron-3-Super model designed for high-performance agentic reasoning tasks.
Ollama 0.18.0 ships with a short release note, but the three visible changes are telling. Better model ordering, automatic cloud-model connection with the :cloud tag, and Claude Code compaction-window control all point to a local runtime becoming a policy layer between local and remote inference.
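The "policy layer" framing is easiest to see in code. The sketch below is purely illustrative: it routes a prompt either to a locally pulled model or to a cloud-tagged variant based on a simple size threshold. The model names, the ":cloud"-style tag, and the threshold are assumptions for illustration, not the release's actual mechanism.

```python
# Illustrative policy layer: route a prompt to a local model or a
# cloud-tagged variant depending on prompt size. Model names and the
# ":cloud"-style tag below are placeholders, not confirmed 0.18.0 syntax.
import ollama

LOCAL_MODEL = "llama3.2"            # assumed to be pulled locally
CLOUD_MODEL = "qwen3-coder:cloud"   # hypothetical cloud-tagged name

def route(prompt: str, max_local_chars: int = 8_000) -> str:
    # Small prompts stay on the local runtime; large ones go to the
    # cloud-backed model through the same client API.
    model = LOCAL_MODEL if len(prompt) <= max_local_chars else CLOUD_MODEL
    resp = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["message"]["content"]
```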
Ollama’s 0.17.8 release candidate is not a flashy model drop. It is a runtime-hardening release: better GLM tool-call parsing, more graceful stream-disconnect handling, MLX changes, ROCm 7.2 updates, and small fixes that make local inference feel more operational and less hobbyist.
Ollama 0.17.7 adds better handling for thinking levels (e.g., ‘medium’) and exposes more context-length metadata for compaction. It’s a small release that hints at a larger shift: local model runtimes are growing the same control surfaces as hosted LLM platforms.
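To make the "control surfaces" point concrete, here is a hedged sketch of driving those knobs through the HTTP API: a thinking level passed on the request, and context-length metadata read back from the show endpoint. Treat the model name as a placeholder and the exact field names ("think", "model_info") as assumptions to verify against your installed version.

```python
# Sketch: using Ollama's HTTP API to (1) pass a thinking level and
# (2) read context-length metadata. Field names ("think", "model_info")
# follow current API docs but should be checked against your version.
import json
import urllib.request

BASE = "http://localhost:11434"

def post(path: str, body: dict) -> dict:
    req = urllib.request.Request(
        f"{BASE}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())

# Ask for a medium thinking level (supported by some thinking models).
answer = post("/api/chat", {
    "model": "gpt-oss:20b",  # placeholder thinking-capable model
    "messages": [{"role": "user", "content": "Plan a 3-step refactor."}],
    "think": "medium",
    "stream": False,
})
print(answer["message"]["content"])

# Read context-length metadata, useful when deciding when to compact.
info = post("/api/show", {"model": "gpt-oss:20b"})
ctx = {k: v for k, v in info.get("model_info", {}).items()
       if k.endswith("context_length")}
print(ctx)
```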
Ollama’s latest releases add new model options (including Qwen-family variants) and tighten tool-call handling. The bigger story: local inference is standardizing around ‘agent-ready’ APIs.
Ollama 0.17.4 adds new model families and reminds operators that local AI stacks behave like software distribution, not just inference. Here’s how to manage versions, updates, and safety in a ‘bring-your-own-model’ world.
Model Context Protocol (MCP) aims to standardize tool connections. Meanwhile vLLM is pushing serving features like async scheduling and speculative decoding, and Ollama is smoothing the local developer experience. Put together, they hint at the next default stack for local agents.
A practical, ops-minded blueprint for running agentic workflows locally: LangGraph for durable state, MCP for standardized tool boundaries, and Ollama for local inference—plus the guardrails that keep it from becoming an unmaintainable demo.
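As a rough sense of that blueprint's shape, the sketch below wires a single LangGraph node to a local Ollama model. The MCP tool boundary and durable checkpointing are deliberately omitted, and the model name is a placeholder.

```python
# Minimal LangGraph + Ollama sketch: one graph node that calls a local model.
# MCP tool wiring and persistent checkpointing (the parts that make the
# blueprint production-worthy) are intentionally left out here.
from typing import TypedDict
from langgraph.graph import StateGraph, END
import ollama

class AgentState(TypedDict):
    question: str
    answer: str

def call_model(state: AgentState) -> dict:
    resp = ollama.chat(
        model="llama3.2",  # placeholder local model
        messages=[{"role": "user", "content": state["question"]}],
    )
    return {"answer": resp["message"]["content"]}

graph = StateGraph(AgentState)
graph.add_node("call_model", call_model)
graph.set_entry_point("call_model")
graph.add_edge("call_model", END)
app = graph.compile()

if __name__ == "__main__":
    result = app.invoke({"question": "Summarize why durable state matters.",
                         "answer": ""})
    print(result["answer"])
```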
The ‘LLM inference server’ is quickly becoming a standard platform component. vLLM and Ollama represent two distinct operating models—GPU-first throughput engineering vs developer-friendly packaging. Here’s how to pick based on tenancy, observability, and cost, not hype.