Introduction
Most organizations are already using AI agents in their development workflows. The question is whether those agents are governed or fall under the category of ‘shadow AI’. Without a strategic architecture, teams end up with unmanaged tool sprawl, inconsistent configurations, unclear security boundaries, and unpredictable cloud-inference bills. Docker’s newer AI-focused building blocks can be composed into a repeatable, governed developer workflow that reduces developer friction while giving platform and security teams isolation, visibility, and policy enforcement at each layer.
TL;DR
The tables below summarize the architecture; from there, you can decide where to dive deeper.
Note: For a high-level summary of this blog post, see From Shadow AI to Enterprise Asset: A Seven-Layer Reference Architecture for Docker’s AI Stack.
Who This Is For
This article is for platform engineering, security, and developer experience (DevEx) leaders, as well as developers, who need a practical way to move AI agents and tools from “cool experiment” to governed, repeatable workflow. You’ll get a seven-layer architecture that maps Docker’s AI tools to real enterprise concerns (identity, supply chain, isolation, observability, and tool governance), plus clear “where it fits” guidance for each layer. This isn’t a full implementation guide or a product pitch; it’s a mental model and blueprint you can use to align teams, pick controls, and reduce shadow AI drift.
This post walks through the architecture in the order a developer typically encounters each capability. In practice, several of these layers operate concurrently.
The Seven Layers
| Layer | Docker Tool(s) | What It Does |
|---|---|---|
| Foundation | Docker Hardened Images + Registry Access Management + Image Access Management | Hardened/minimal base images; registry allowlisting (RAM) and Docker Hub image-type controls (IAM) to reduce exposure to unapproved sources |
| Definition | cagent | Declarative YAML agent configs with root/sub-agent orchestration |
| Inference | Docker Model Runner + Remocal/MVM | Local-first model execution using Minimum Viable Models (MVMs) via a local model runtime (Model Runner); optional remote workload execution with Docker Offload when you need to scale beyond local resources |
| Execution | Docker Sandboxes | MicroVM isolation: each sandbox gets its own VM with a private Docker daemon; the agent can build/run/test without access to the host Docker environment |
| External Access | MCP Gateway + MCP Catalog/Toolkit | Gateway-managed MCP tool servers run in isolated containers with restricted privileges/network/resources, plus built-in logging and call tracing for governance; the Catalog provides curated MCP servers packaged as Docker images |
| Observability | Docker Scout + MCP Gateway logging | Continuous Software Bill of Materials (SBOM) and Common Vulnerabilities and Exposures (CVE) visibility across images (Docker Scout) + tool-call logs and traces from MCP Gateway for visibility and governance |
| Identity | SSO + SCIM | Authentication and user provisioning to support identity-based access control at scale |
Two additional Docker Business capabilities complement these layers:
| Capability | Docker Tool(s) | What It Does |
|---|---|---|
| Build Acceleration | Docker Build Cloud | Offloads image builds to cloud infrastructure to reduce build times and improve CI/CD feedback loops |
| Standard Container Isolation | Enhanced Container Isolation (ECI) | Strengthens isolation for standard containers using Linux user namespaces (limiting impact of malicious containers by reducing effective privileges) |
Executive Snapshot: What each layer buys you (ROI + controls)
| Outcome | Docker Tool(s) | Why It Matters |
|---|---|---|
| Lower AI spend + faster iteration | Docker Model Runner + Remocal/MVM | Run more of the dev loop locally to reduce paid API calls and latency during iteration. |
| Safe autonomy for agents | Docker Sandboxes | MicroVM isolation + fast reset reduces host risk and cleanup time when agents misbehave. |
| Governed tool access | Docker’s MCP Catalog + Toolkit (including MCP Gateway) | Centralize tool servers, apply restrictions, and capture logs/traces for visibility. |
| Stronger supply-chain posture | Docker Hardened Images + Registry Access Management + Image Access Management | Standardize hardened bases and prevent pulling from unapproved sources. |
| Fewer vuln/audit fire drills | Docker Scout + MCP Gateway logging | Continuous SBOM and CVE visibility across images + tool-call logs/traces improves triage and audit readiness. |
| Faster CI + hardened non-agent containers | Docker Build Cloud + Enhanced Container Isolation (ECI) | Reduce build bottlenecks and strengthen isolation for everyday (non-agent) containers. |
The Architecture: Layer by Layer
1) Foundation — Approved images and hardened supply chain (Docker Hardened Images + Registry Access Management + Image Access Management)
Before any agent runs, the platform engineering team establishes a secure foundation that underpins the entire pipeline. Docker Hardened Images (DHI) follow a distroless philosophy, stripping unnecessary components like shells, package managers, and debugging tools to dramatically reduce attack surface and improve vulnerability posture through curated maintenance and transparent CVE reporting. (Docker, “Introducing Docker Hardened Images”; Docker, “FedRAMP Compliance with Hardened Images”) Every image ships with a complete Software Bill of Materials (SBOM), SLSA Build Level 3 provenance, and transparent public CVE data. (Docker, “Introducing Docker Hardened Images”)
DHI is now free and fully open source under the Apache 2.0 license, with a catalog of over 1,000 images and Helm charts built on Alpine and Debian. Organizations needing SLA-backed CVE remediation (under seven days for critical and high-severity vulnerabilities), FIPS-enabled images, or extended lifecycle support can upgrade to DHI Enterprise. (Docker, “Hardened Images for Everyone”)
This foundation layer applies broadly across the architecture. DHI base images underpin the application containers developers build and the MCP server containers that agents call through the Gateway. Docker is actively extending its hardening methodology to MCP server images, with hardened versions of popular servers like Grafana, MongoDB, and GitHub already available. (Docker Press Release, December 17, 2025)
To enforce this foundation at the organizational level, Docker’s Registry Access Management (RAM) provides DNS-level filtering that controls which registries developers can access through Docker Desktop. Admins maintain an allowlist of approved registries; any pull or push to a registry not on the list is blocked. Docker’s Image Access Management provides complementary controls over which types of images can be pulled from Docker Hub — Docker Official, Verified Publisher, Organization, or Community images — including repository allow lists. Docker recommends combining both for comprehensive coverage. (Docker Docs, “Registry Access Management”; Docker Docs, “Image Access Management”)
Cloud Native Now reports that future Docker Desktop updates will enable teams to publish and manage their own MCP servers using enterprise controls such as Registry Access Management (RAM) and Image Access Management (IAM), letting platform teams apply familiar governance mechanisms to agent tooling. (Cloud Native Now, “Docker Embraces MCP”)
Operational note: Because distroless images omit interactive debugging tools, Docker recommends approaches like attaching a debug sidecar (e.g., Docker Debug) or using multi-stage builds rather than modifying the hardened image itself. (Docker Docs, “Distroless Images”)
2) Definition — Agents as configuration (cagent)
Instead of ad-hoc Python scripts, developers define agents in a declarative YAML configuration. Docker’s open-source cagent framework uses a “root agent + sub-agents” model, where the root agent delegates work to explicitly defined sub-agents, each with its own model, parameters, and tool access. cagent is bundled in Docker Desktop 4.49 and later, so developers can start building agents without a separate installation step. (Docker, “cagent Comes to Docker Desktop”)
cagent supports external tools via MCP servers, and offers model flexibility across providers — OpenAI, Anthropic, Google Gemini, or local models via Docker Model Runner — without rewriting the overall agent workflow. Agents are packaged as OCI artifacts, meaning they can be pushed, pulled, and shared through Docker Hub or any OCI-compatible registry, just like container images. This makes agent configurations versionable, reviewable, and distributable through the same supply-chain controls the platform team already enforces for container images. (Docker Docs, “cagent”)
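To make the root/sub-agent model concrete, here is a minimal configuration sketch. The overall shape (agents, sub_agents, toolsets, models) follows cagent’s documented YAML schema, but the specific model names and the MCP server reference are illustrative, so check the cagent docs before reusing this:

```yaml
# agent.yaml — illustrative cagent configuration (names are examples)
agents:
  root:
    model: claude
    description: Coordinates the task and delegates to sub-agents
    instruction: |
      Break the request into research and coding steps,
      then delegate each step to the appropriate sub-agent.
    sub_agents:
      - researcher
  researcher:
    model: local
    description: Gathers background information
    toolsets:
      - type: mcp
        ref: docker:duckduckgo   # MCP server reference; illustrative

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5     # example hosted model
  local:
    provider: dmr                # Docker Model Runner (local inference)
    model: ai/qwen3              # example local model
```

Because the agent is plain YAML, it can be diffed, code-reviewed, and pushed to a registry like any other versioned artifact.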
3) Inference — Local-first economics (Remocal + Docker Model Runner)
Docker’s “Remocal” (Remote + Local) framing encourages pairing local-first development with “Minimum Viable Models” (MVMs) — the smallest, most efficient models that solve the core problem effectively. The idea is to iterate quickly, reduce dependency on external APIs, and keep cost and latency more predictable during development, reserving cloud-scale models for production workloads that genuinely require them. (Docker, “Remocal + Minimum Viable Models”)
Docker Model Runner is the local execution layer. It runs models locally and serves them through OpenAI-compatible and Ollama-compatible APIs, with support for three inference engines: llama.cpp (the default, works on all platforms), vLLM (for high-throughput inference on NVIDIA GPUs), and Diffusers (for image generation on Linux with NVIDIA GPUs). Models are stored as OCI artifacts and cached locally after the initial pull. (Docker Docs, “Model Runner”)
Because cagent’s YAML config specifies which model to use, swapping between a local MVM and a cloud-hosted frontier model is a one-line change — no workflow rewrite required. This is where the Definition layer and Inference layer connect directly.
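As a sketch of what that one-line change looks like in a cagent models section (provider and model names are illustrative, not prescriptive):

```yaml
models:
  coder:
    provider: dmr              # local Minimum Viable Model via Docker Model Runner
    model: ai/qwen3            # illustrative local model reference
    # To move this agent to a hosted frontier model,
    # repoint the same entry, e.g.:
    # provider: anthropic
    # model: claude-sonnet-4-5
```

The agent definitions, instructions, and toolsets stay untouched; only the model binding changes.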
When workloads exceed local resources, Docker Offload extends your local Docker workflow into cloud infrastructure. It uses the same Docker CLI commands (docker build, docker run) but executes them on cloud-hosted machines, with NVIDIA L4 GPU acceleration currently in beta and broader GPU support described as arriving in 2026. (Docker, “Model Runner Product Page”; Docker, “Docker Offload”)
4) Execution — Isolated autonomy (Docker Sandboxes)
When an agent needs broad freedom, such as editing files, installing dependencies, running tests, or building containers, Docker Sandboxes provide the execution environment. Each sandbox is a dedicated microVM with its own kernel, filesystem, and private Docker daemon. The agent can build images, start containers, and run tests without any access to the host Docker environment. Only the project workspace is mounted; the agent cannot see host containers, images, volumes, or the host Docker daemon. (Docker Docs, “Sandboxes Architecture”)
This is the layer that wraps and contains the work defined in layers 2 and 3. The developer defines the agent (cagent YAML), selects a model (Model Runner or cloud), and then the sandbox provides the isolated environment where all of that actually executes. Docker describes it as hypervisor-level isolation: unlike containers (which share the host kernel), the sandbox VMs have separate kernels and cannot access host resources outside their defined boundaries. Network isolation is configurable via allow/deny lists. When a sandbox is removed, the entire VM and its contents are deleted. (Docker, “Docker Sandboxes: Run Agents Safely”)
Docker Sandboxes currently support Claude Code, Codex CLI, Copilot CLI, Gemini CLI, Kiro, cagent, and custom shell workflows, with microVM-based sandboxes available on macOS and Windows. cagent can run inside a sandbox in YOLO mode for fully autonomous operation. (Docker Docs, “Sandbox Agents”; Docker Docs, “cagent Sandbox”)
For standard (non-agent) containers in everyday development, Enhanced Container Isolation (ECI) provides complementary protection. ECI runs all user containers with Linux user namespaces via the Sysbox runtime, mapping container root users to unprivileged users inside the Docker Desktop VM. Together, Sandboxes and ECI cover both agent and non-agent workloads. (Docker Docs, “Enhanced Container Isolation”)
5) External Access — Governed tool use (Docker’s MCP Catalog and Toolkit, including MCP Gateway)
Throughout its work, an agent may need to interact with external systems: GitHub, Jira, databases, search APIs, and more. The MCP Gateway governs all of these interactions. It runs MCP servers in isolated Docker containers with restricted privileges, network access, and resource usage. The Gateway manages each server’s full lifecycle: starting a server when an AI application requests a tool, injecting required credentials (managed via Docker Desktop’s secrets store and OAuth flows), applying security restrictions, and forwarding requests. It includes built-in logging and call-tracing capabilities for visibility and governance of tool activity. (Docker Docs, “MCP Gateway”)
The MCP Gateway works behind the scenes for both cagent-defined agents (which specify MCP toolsets in their YAML) and for interactive coding agents like Claude Code and Copilot in VS Code. It acts as a centralized proxy: instead of configuring each MCP server for every client individually, you configure the Gateway once and connect all clients to it. The Gateway is open source. (Docker, “MCP Gateway: Secure Infrastructure for Agentic AI”)
Rather than letting every developer find and run random MCP servers, Docker’s MCP Catalog provides a curated collection of 300+ (and growing) verified MCP servers packaged as Docker images with versioning, provenance, and security updates. Organizations can also create custom catalogs scoped to their approved servers — a natural extension of the “approved images” pattern into “approved tools.” Docker is applying additional trust measures to the MCP ecosystem, including automated review of incoming changes with structured reporting. (Docker Docs, “MCP Catalog”)
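A custom catalog of approved servers might look roughly like the following. This is a deliberately simplified sketch, and the schema and image names are hypothetical; consult the MCP Catalog documentation for the exact format before adopting it:

```yaml
# approved-tools.yaml — illustrative custom MCP catalog (schema simplified)
registry:
  github-official:
    description: GitHub MCP server approved for internal use
    image: mcp/github                                    # image name illustrative
  internal-search:
    description: In-house search tool server
    image: registry.example.com/mcp/internal-search:1.2.0  # hypothetical registry
```

The value of this pattern is that the same review and allowlisting workflow used for base images can govern which tools agents are allowed to call.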
6) Observability — Continuous monitoring and audit (Docker Scout + Gateway logging)
A governed architecture requires continuous visibility, not just point-in-time checks. Two capabilities provide this across the stack.
Docker Scout analyzes container images to produce and consume SBOMs, matching them against a continuously updated vulnerability database to identify known issues in image components. This applies to every layer that involves container images: DHI base images, application images, and MCP server images from the Catalog. Scout provides ongoing supply-chain visibility across the full architecture. (Docker Docs, “Docker Scout”)
MCP Gateway logging and call tracing provides the audit trail for agent activity. When logging is enabled (via the --log-calls flag), tool calls that pass through the Gateway are logged and traced — including which tool was invoked and the request/response details. The Gateway also supports interceptors like signature verification (ensuring MCP container images have valid provenance before use) and secret blocking (scanning payloads for credentials that shouldn’t be exposed). Together, these give platform and security teams the observability they need to answer “what did the agent do and why.” (Docker, “MCP Gateway: Secure Infrastructure for Agentic AI”; GitHub, docker/mcp-gateway)
7) Identity — Authentication and policy enforcement (SSO + SCIM)
Identity is the layer that binds all other layers together. Without knowing who is pulling images, defining agents, running sandboxes, or invoking tools, governance policies have no teeth and audit trails have no meaning.
Docker Business supports SSO for authenticating via an identity provider, and SCIM provisioning for ongoing user lifecycle synchronization between an IdP and Docker. This is the prerequisite for policy enforcement across the entire architecture: RAM policies only take effect when users sign in to Docker Desktop with organization credentials, Image Access Management controls are scoped to authenticated users, and MCP Gateway tool access can be governed by identity. (Docker Docs, “Single Sign-On”)
When a developer joins or leaves the organization, SCIM ensures their Docker access is provisioned or revoked automatically. This closes the loop on the governance model — ensuring that the Foundation layer’s image controls, the External Access layer’s tool governance, and the Observability layer’s audit trails all tie back to verified, managed identities.
Strategic Conclusion
Composed together, these seven layers form a governed architecture for AI agent workflows:
- Foundation standardizes trusted images and limits unapproved sources (Docker Hardened Images + RAM/IAM)
- Definition makes agent behavior reproducible (cagent)
- Inference enables local-first model execution economics (Docker Model Runner + Remocal/MVM)
- Execution isolates autonomous work in microVM sandboxes
- External Access centralizes and governs tool use through MCP Gateway (with curated servers via the Catalog)
- Observability combines supply-chain visibility (Docker Scout) with tool-call logs/traces (MCP Gateway)
- Identity (SSO/SCIM) supports consistent access and lifecycle control at scale
Two additional Docker Business capabilities complement these layers:
- Build Acceleration improves developer and CI throughput by offloading image builds to cloud infrastructure (Docker Build Cloud).
- Standard Container Isolation strengthens isolation for everyday (non-agent) containers using Linux user namespaces, reducing effective privileges and limiting blast radius (Enhanced Container Isolation / ECI).
The net effect is a path from unmanaged “Shadow AI” to a governed architecture that reduces developer friction, provides isolation and visibility at each layer, and gives platform teams the controls to enforce policy without slowing delivery.
How I Wrote This Article
I spent a week of my free time studying Docker’s recent AI-focused releases. It occurred to me that these offerings could be woven together into a comprehensive ecosystem that addresses many of the problems developers and enterprise teams face today.
I read documentation and used Gemini and Google’s NotebookLM (audio overviews, slides, videos, chat) to build my understanding of each component. This was the conceptual work of finding the patterns, figuring out how these separate products compose into a layered architecture, and identifying the connections between them.
Once I had found the common thread running through these features, I used several AI tools alongside my own writing and research skills to iteratively draft and fact-check this article, finishing with a final manual review.
