Dapr’s ‘Conversation’ building block: a practical path to portable LLM workflows in microservices

Teams want “agentic AI” in production, but the first thing they encounter isn’t model quality—it’s integration friction. Different LLM providers ship different APIs, different request/response contracts, and different operational behaviors around retries, rate limits, and tool calling. The result is predictable: provider-specific code spreads through microservices, and switching vendors becomes a multi-quarter refactor.

Dapr is taking an opinionated swing at this problem with its Conversation building block: declare an LLM provider as a component, call the Dapr runtime, and let the sidecar handle the provider-specific details.
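Concretely, "call the Dapr runtime" means an HTTP (or gRPC) request to the local sidecar rather than to the provider. As a minimal sketch, the helper below only builds such a request; the `/v1.0-alpha1/conversation/.../converse` path and the `inputs` body shape reflect the API's alpha status at the time of writing and should be treated as assumptions that may change.

```python
import json

def build_converse_request(component: str, prompt: str, dapr_port: int = 3500):
    # The sidecar listens locally; the app never talks to the provider directly.
    # Path and payload shape are based on the current alpha docs (assumption).
    url = f"http://localhost:{dapr_port}/v1.0-alpha1/conversation/{component}/converse"
    body = json.dumps({"inputs": [{"content": prompt, "role": "user"}]})
    return url, body
```

Swapping providers then means pointing `component` at a different declared component, with no change to the request shape.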

Why this is interesting for platform engineering (not just app developers)

Dapr’s “sidecar-as-integration-layer” model is familiar to platform teams because it mirrors how we already standardize:

  • Service-to-service calls (mTLS, retries, timeouts) via service meshes or gateways
  • Observability via standardized metrics/logging/tracing pipelines
  • Secrets via external stores and sidecar helpers

LLM usage is now following the same path: teams need a consistent, policy-enforceable interface that can be observed, throttled, and swapped without rewriting every service.

The key abstraction: providers become components

Instead of embedding provider SDKs in each service, you declare components like:

  • conversation.anthropic with model + key
  • conversation.openai with model + endpoint + key

That doesn’t magically eliminate differences between providers, but it moves those differences into a runtime boundary that can be managed and upgraded centrally.
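A component declaration follows Dapr's standard component YAML. The sketch below assumes the `conversation.openai` type with `key` and `model` metadata fields, per current docs; exact field names can differ per provider, so check the component reference before relying on them.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: openai            # the name services use when calling the sidecar
spec:
  type: conversation.openai
  version: v1
  metadata:
  - name: key
    secretKeyRef:         # pull the API key from a secret store, not inline
      name: openai-secret
      key: api-key
  - name: model
    value: gpt-4o-mini
```

Swapping vendors becomes a platform-side change to this file (e.g. `type: conversation.anthropic` plus its metadata), not an application redeploy with a new SDK.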

Tool calling and “agentic” behavior: where the runtime helps

The most operationally tricky AI pattern in microservices is tool calling: the model requests a function, a service calls an external system, the result is fed back, and the model continues. Dapr’s tutorial frames this as a conversation workflow with message types:

  • SystemMessage (role/instructions)
  • UserMessage (input)
  • AssistantMessage (response + tool calls)
  • ToolMessage (tool result)

That structure is useful because it encourages teams to treat LLM interactions as a first-class protocol rather than an ad hoc string concatenation.
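The loop implied by those message types can be sketched as below. The `Message` class, the `TOOLS` registry, and `call_model` are hypothetical stand-ins (not the Dapr SDK or any provider API); the point is the protocol shape: assistant turns may carry tool calls, tool results go back as tool messages, and the loop ends when the assistant replies with no tool calls.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    role: str                # "system" | "user" | "assistant" | "tool"
    content: str
    tool_calls: list = field(default_factory=list)  # assistant turns only

# Tools the service exposes to the model (illustrative).
TOOLS = {"get_weather": lambda city: f"Sunny in {city}"}

def call_model(history):
    # Stand-in for the provider call via the sidecar: first turn requests a
    # tool; once a tool result is in the history, it produces a final answer.
    if any(m.role == "tool" for m in history):
        return Message("assistant", "It is sunny in Paris.")
    return Message("assistant", "", tool_calls=[("get_weather", "Paris")])

def converse(user_input: str) -> str:
    history = [Message("system", "You are a helpful assistant."),
               Message("user", user_input)]
    while True:
        reply = call_model(history)
        history.append(reply)
        if not reply.tool_calls:          # no tool requested: conversation done
            return reply.content
        for name, arg in reply.tool_calls:
            result = TOOLS[name](arg)     # service calls the external system
            history.append(Message("tool", result))
```

Treating the exchange as typed messages like this, rather than concatenated strings, is what makes the flow observable and testable.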

What operators should ask before standardizing on it

1) Observability: can you see prompts, latencies, and failures responsibly?

Platform teams will need a policy on what is logged and traced. LLM payloads can include sensitive data. A runtime layer can help enforce redaction and consistent tagging, but only if it’s configured intentionally.
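As an example of the kind of policy such a layer could enforce, a redaction filter might run over payloads before they reach logs or traces. The patterns below are illustrative only, nowhere near a complete PII policy, and not a Dapr feature.

```python
import re

# Illustrative redaction patterns; a real policy needs far broader coverage.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<ssn>"),
]

def redact(text: str) -> str:
    """Replace sensitive substrings with placeholders before logging."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```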

2) Retries and backpressure: who owns reliability semantics?

Dapr’s pitch emphasizes retries and consistent handling. That’s good—until it accidentally amplifies traffic during provider incidents. Make sure you define:

  • Retry limits and jitter
  • Circuit breaking behavior
  • Rate limiting per tenant/service
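The first of those points has a standard shape: capped exponential backoff with full jitter, so that retries from many services don't synchronize into a thundering herd during a provider incident. The base and cap below are illustrative values, not recommendations; circuit breaking and per-tenant rate limits would sit on top of this.

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    # Full jitter: pick uniformly in [0, min(cap, base * 2^attempt)],
    # so concurrent retries spread out instead of retrying in lockstep.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```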

3) Portability: does the abstraction match your real requirements?

Providers differ in tool calling, streaming, and model-specific features. The abstraction should cover the 80% case without blocking power users. If the runtime forces a “lowest common denominator,” teams may bypass it.

A likely near-term outcome: platform-owned LLM gateways

Dapr Conversation looks like an early step toward a pattern we’re already seeing: platform teams providing an internal “LLM gateway” with standardized auth, auditing, quotas, and model routing. Whether you implement that gateway as Dapr components, a dedicated API service, or a service mesh extension, the direction is the same: move provider churn away from application code.
