DSO Architecture Guide (SRE & Security Reference)
This document provides a production-grade, SRE and security-focused analysis of the Docker Secret Operator (DSO). It details the internal components, event-driven pipelines, security boundaries, and container lifecycles that govern DSO.
1. Internal DSO Agent Components
The dso-agent is a long-running, concurrent Go daemon designed to run alongside standard Docker engines. It is divided into decoupled, asynchronous event loops that coordinate via memory channels and transactional locks.
graph TD
%% Styling
classDef component fill:#e1f5fe,stroke:#039be5,stroke-width:2px,color:#000000;
classDef storage fill:#efebe9,stroke:#5d4037,stroke-width:2px,color:#000000;
classDef external fill:#f3e5f5,stroke:#8e24aa,stroke-width:2px,color:#000000;
subgraph Host ["Docker Host (dso-agent Daemon)"]
Socket[Unix Socket: /run/dso/dso.sock]:::component
Controller[IPC Controller & API Server]:::component
Watcher[Docker Event Watcher]:::component
Engine[Rotation Engine]:::component
Cache[Encrypted In-Memory Cache]:::component
State[Transactional State Manager]:::component
Lock[Distributed Lock Manager]:::component
PluginMgr[Plugin Process Manager]:::component
end
subgraph Storage ["Persistent State"]
Db[State Directory: /var/lib/dso/state/]:::storage
end
subgraph Plugins ["Isolated Provider Subprocesses"]
AWS[dso-provider-aws]:::external
Vault[dso-provider-vault]:::external
Azure[dso-provider-azure]:::external
end
%% Interactions
Socket -->|gRPC / HTTP IPC| Controller
Watcher -->|Docker Event Stream| Engine
Controller -->|Trigger Manual Run| Engine
Engine -->|Cache Lookup| Cache
Engine -->|State Query / Mutation| State
Engine -->|Acquire Resource Lock| Lock
State -->|Write-Ahead Log / Snapshots| Db
Engine -->|Fetch Secrets| PluginMgr
PluginMgr -->|JSON-RPC over Pipes| AWS
PluginMgr -->|JSON-RPC over Pipes| Vault
PluginMgr -->|JSON-RPC over Pipes| Azure
Component Breakdown
- IPC Controller & API Server: Exposes a Unix domain socket (`/run/dso/dso.sock`) with strict `0660` permissions, owned by `root:dso`. It handles commands from the `docker dso` CLI, such as status queries, manual rotations, and environment health checks.
- Docker Event Watcher: Listens directly to the Docker engine event stream, filtering for container start, stop, destroy, and edit events to automatically rebuild the internal target maps of managed Compose services.
- Rotation Engine: The core orchestrator. It manages the rotation queue, debounces high-frequency trigger events (5-second window), and executes transactional blue-green swaps.
- Encrypted In-Memory Cache: Temporarily holds plaintext secrets in locked memory pages (`mlock`), which prevents secret values from being written to swap space. Cache TTLs limit how often DSO calls cloud provider APIs, which lowers the risk of being rate-limited.
- Transactional State Manager: Manages write-ahead logging (WAL) and service mapping metadata in `/var/lib/dso/state/` to guarantee deterministic recovery after host crashes.
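The TTL behaviour of the cache can be illustrated with a minimal sketch. The type `SecretCache` and its methods are hypothetical names chosen for this example, not DSO's actual API, and the sketch leaves out `mlock` page locking and any encryption; it only shows expiry and zeroing of expired values.

```go
package main

import (
	"sync"
	"time"
)

// cacheEntry pairs a secret value with the time at which it expires.
type cacheEntry struct {
	value   []byte
	expires time.Time
}

// SecretCache is a hypothetical TTL cache, not DSO's actual type.
// Page locking (mlock) and encryption are omitted from this sketch.
type SecretCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

// NewSecretCache returns an empty cache whose entries live for ttl.
func NewSecretCache(ttl time.Duration) *SecretCache {
	return &SecretCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// Put stores a private copy of value under key until the TTL elapses.
func (c *SecretCache) Put(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = cacheEntry{
		value:   append([]byte(nil), value...),
		expires: time.Now().Add(c.ttl),
	}
}

// Get returns a copy of the cached value, or false on a miss or expiry.
// Expired values are overwritten with zeros before being dropped.
func (c *SecretCache) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		return nil, false
	}
	if !time.Now().Before(e.expires) {
		for i := range e.value {
			e.value[i] = 0
		}
		delete(c.entries, key)
		return nil, false
	}
	return append([]byte(nil), e.value...), true
}
```

A caller would fall back to the provider plugin on a miss and call `Put` with the fresh value, so the TTL bounds both staleness and the provider call rate.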
2. Security Boundaries & Isolation Model
DSO is designed around a zero-trust model regarding the host filesystem and the Docker metadata storage engine. It strictly isolates credentials across multiple execution layers.
graph TB
%% Styling
classDef redZone fill:#ffebee,stroke:#c62828,stroke-width:2px,color:#000000;
classDef amberZone fill:#fff8e1,stroke:#ff8f00,stroke-width:2px,color:#000000;
classDef greenZone fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000000;
subgraph Cloud ["Secure Cloud Boundary"]
KMS[Cloud KMS / Vault KMS]:::greenZone
SM[Secrets Manager / Azure Key Vault]:::greenZone
end
subgraph Host ["Docker Host System"]
subgraph DaemonSpace ["Root/dso Group Protected Space (mlock)"]
Agent[dso-agent]:::amberZone
Cache[Encrypted In-Memory Cache]:::amberZone
Plugin[Provider Plugins]:::amberZone
end
subgraph UserSpace ["Unprivileged User Space"]
CLI[docker dso CLI]:::redZone
end
subgraph DockerSpace ["Docker Engine Boundary"]
Engine[Docker Daemon]:::redZone
end
end
subgraph ContainerSpace ["Isolated Container Sandbox"]
App[Application Process]:::greenZone
Tmpfs[tmpfs mount: /run/secrets/]:::greenZone
end
%% Flow
SM -->|TLS 1.3 encrypted fetch| Plugin
Plugin -->|Plaintext in RAM| Agent
Agent -->|Mounted directly via tmpfs API| Tmpfs
Tmpfs -->|In-memory files| App
CLI -->|IPC only - No direct secret access| Agent
Engine -->|Orchestrates container launch| ContainerSpace
Key Security Controls
- No Disk Persistence for Secrets: Plaintext secret values are never written to physical host disks. They live exclusively in the `mlock`-guarded memory space of the agent and inside RAM-based `tmpfs` mounts.
- Docker Inspect Guard: Environment variables injected via Compose are visible in plaintext via `docker inspect <container>`. DSO injects secrets at the process boundary (for environment variables) or via memory-backed files (for file-based injection), so `docker inspect` only reveals the placeholder `dso://` or `dsofile://` reference URIs.
- IPC Unix Socket Isolation: The communication channel `/run/dso/dso.sock` restricts access to root and members of the `dso` system group.
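As an illustration of the socket isolation control, the following sketch creates a Unix domain socket and restricts it to mode `0660`. The function names `listenRestricted` and `socketMode` are hypothetical. Assigning the `root:dso` ownership is not shown, since it requires root privileges and the numeric ID of the `dso` group.

```go
package main

import (
	"errors"
	"io/fs"
	"net"
	"os"
)

// listenRestricted creates a Unix domain socket at path and limits access
// to owner and group read/write (0660).
func listenRestricted(path string) (net.Listener, error) {
	// Remove a stale socket left behind by a previous run.
	if err := os.Remove(path); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return nil, err
	}
	ln, err := net.Listen("unix", path)
	if err != nil {
		return nil, err
	}
	// The mode of a new socket depends on the process umask, so set it
	// explicitly rather than relying on the default.
	if err := os.Chmod(path, 0o660); err != nil {
		ln.Close()
		return nil, err
	}
	// A real daemon would also call os.Chown(path, 0, dsoGID) here;
	// dsoGID is an assumption and is left out of this sketch.
	return ln, nil
}

// socketMode reports the permission bits of the file at path.
func socketMode(path string) (fs.FileMode, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}
```

There is a short window between `net.Listen` and `os.Chmod` in which the socket carries its default mode. Placing the socket inside a directory that is itself restricted to `root:dso` is one way to close that window.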
3. Provider Plugin Architecture
DSO isolates providers (AWS, Azure, HashiCorp Vault, Huawei Cloud) into distinct subprocesses. This ensures that provider SDK dependency vulnerabilities do not compromise the core DSO daemon.
sequenceDiagram
autonumber
participant Agent as dso-agent (Core)
participant Mgr as Plugin Process Manager
participant Plugin as dso-provider-aws (Subprocess)
participant AWS as AWS Secrets Manager API
Agent->>Mgr: Request secret resolution (Provider: AWS, Key: db_pass)
Note over Mgr: Process manager checks if subprocess is active
alt Process Not Running
Mgr->>Plugin: Spawns binary (/usr/local/lib/dso/plugins/dso-provider-aws)
Note over Plugin: Subprocess starts, locks stdin/stdout
end
Mgr->>Plugin: Write JSON-RPC Request to Stdin Pipe
Note over Plugin: Read request & parse credentials
Plugin->>AWS: GetSecretValue(SecretId="prod/db") via TLS 1.3
AWS-->>Plugin: Return JSON Payload (Encrypted)
Note over Plugin: Parse payload, extract target field
Plugin-->>Mgr: Write JSON-RPC Response to Stdout Pipe
Note over Mgr: Read stdout, enforce 1MB buffer max limit
Mgr->>Agent: Deliver resolved plaintext secret bytes
Note over Plugin: Subprocess kept alive for connection pooling
JSON-RPC IPC Schema
The core daemon communicates with the provider binaries using a structured, line-delimited JSON-RPC protocol over Unix Pipes (stdin/stdout).
- Request Payload: `{"jsonrpc":"2.0","method":"ResolveSecret","params":{"path":"myapp/db_password","config":{"region":"us-east-1"}},"id":1}`
- Response Payload: `{"jsonrpc":"2.0","result":{"value":"s3cr3t_p4ssword"},"id":1}`
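The two payloads above map naturally onto Go structs. The sketch below is one possible encoding, assuming the field layout shown in the examples; the type and function names (`Request`, `Response`, `encodeRequest`, `decodeResponse`, `newLineScanner`) are hypothetical. The 1 MB cap mirrors the buffer limit noted in the sequence diagram.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
)

// maxMessageBytes mirrors the 1 MB limit enforced on plugin output.
const maxMessageBytes = 1 << 20

type Params struct {
	Path   string            `json:"path"`
	Config map[string]string `json:"config"`
}

type Request struct {
	JSONRPC string `json:"jsonrpc"`
	Method  string `json:"method"`
	Params  Params `json:"params"`
	ID      int    `json:"id"`
}

type Result struct {
	Value string `json:"value"`
}

type Response struct {
	JSONRPC string `json:"jsonrpc"`
	Result  Result `json:"result"`
	ID      int    `json:"id"`
}

// encodeRequest renders req as a single newline-terminated JSON line,
// ready to be written to the plugin's stdin pipe.
func encodeRequest(req Request) ([]byte, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// decodeResponse parses one line read from the plugin's stdout pipe and
// rejects lines larger than maxMessageBytes.
func decodeResponse(line []byte) (Response, error) {
	var resp Response
	if len(line) > maxMessageBytes {
		return resp, fmt.Errorf("response of %d bytes exceeds %d byte limit", len(line), maxMessageBytes)
	}
	err := json.Unmarshal(line, &resp)
	return resp, err
}

// newLineScanner returns a scanner that yields one message per line and
// fails with bufio.ErrTooLong on lines above maxMessageBytes. One scanner
// should be kept per plugin process so buffered data is not lost.
func newLineScanner(r io.Reader) *bufio.Scanner {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), maxMessageBytes)
	return sc
}
```

Error responses are not covered by the examples in this guide, so the sketch does not model an `error` member.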
4. Runtime Secret Injection & Compose Integration
DSO intercepts the normal docker compose execution flow. It acts as an Abstract Syntax Tree (AST) transformer for Compose YAML files, replacing dso:// and dsofile:// placeholders before they reach the Docker Engine.
graph TD
%% Styling
classDef process fill:#fff3e0,stroke:#ffb74d,stroke-width:2px,color:#000000;
classDef file fill:#eceff1,stroke:#b0bec5,stroke-width:2px,color:#000000;
YAML[docker-compose.yml]:::file
CLI[docker dso up]:::process
AST[AST Parser & Validator]:::process
Cache[In-Memory Secret Resolver]:::process
Agent[DSO Systemd Agent]:::process
Engine[Docker Engine API]:::process
Container[Target Container]:::process
YAML -->|Read Input| CLI
CLI -->|Load YAML Structure| AST
AST -->|Scan for dso:// and dsofile://| Cache
Cache -->|Query Cached Secrets| Agent
Agent -->|Return Plaintext Secret Bytes| Cache
Cache -->|Mutate Compose AST in Memory| AST
AST -->|Generate Real-Time Spec with tmpfs Mounts| Engine
Engine -->|Launch sandboxed process| Container
AST Modification Process
- Placeholder Parsing: DSO parses the YAML file structure and identifies values prefixed with `dso://` (for environment variables) or `dsofile://` (for file-based mounts).
- Mount Modification: For every `dsofile://myapp/cert` entry, DSO dynamically injects a temporary `tmpfs` volume mount into the service specification before submitting it to the Docker socket, allocating exactly the required memory size.
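The placeholder parsing step can be sketched as a small classifier over string values. This is an illustration only: `parseRef`, `collectRefs`, and the `RefKind` constants are hypothetical names, and the sketch works on an already-decoded environment map rather than a full Compose AST.

```go
package main

import "strings"

// RefKind distinguishes the two placeholder schemes.
type RefKind int

const (
	RefEnv  RefKind = iota + 1 // dso://      environment variable injection
	RefFile                    // dsofile://  file-based tmpfs injection
)

// SecretRef is a parsed placeholder.
type SecretRef struct {
	Kind RefKind
	Path string // provider path, for example "myapp/db_password"
}

// parseRef reports whether value is a DSO placeholder and, if so, returns
// its kind and path. Placeholders with an empty path are rejected.
func parseRef(value string) (SecretRef, bool) {
	schemes := []struct {
		prefix string
		kind   RefKind
	}{
		{"dsofile://", RefFile},
		{"dso://", RefEnv},
	}
	for _, s := range schemes {
		if strings.HasPrefix(value, s.prefix) {
			path := strings.TrimPrefix(value, s.prefix)
			if path == "" {
				return SecretRef{}, false
			}
			return SecretRef{Kind: s.kind, Path: path}, true
		}
	}
	return SecretRef{}, false
}

// collectRefs scans a service's environment map and returns the entries
// whose values are placeholders, keyed by variable name.
func collectRefs(env map[string]string) map[string]SecretRef {
	refs := make(map[string]SecretRef)
	for name, value := range env {
		if ref, ok := parseRef(value); ok {
			refs[name] = ref
		}
	}
	return refs
}
```

Values that do not start with either scheme, such as `postgres://db`, are left untouched.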
5. Tmpfs Secret Flow
File-based secrets are written to memory-backed tmpfs mounts at /run/secrets/, so that secret material is not written to non-volatile storage.
sequenceDiagram
autonumber
participant Agent as dso-agent
participant Engine as Docker Daemon
participant HostRAM as RAM (tmpfs)
participant Container as Container Process
Note over Agent: Secret resolved in-memory
Agent->>Engine: Create container spec with tmpfs mount at /run/secrets
Note over Engine: Docker creates unprivileged mount namespace
Engine->>HostRAM: Mount RAM-based tmpfs partition at host container path
Agent->>HostRAM: Write plaintext secret bytes directly to mounted RAM path
Agent->>Engine: Start container execution
Engine->>Container: Launch application entrypoint
Container->>HostRAM: Read secret file from /run/secrets/db_password
Note over Container: Secret resides only inside RAM
Note over Agent: On container stop/restart:
Agent->>Engine: Stop container
Engine->>HostRAM: Unmount tmpfs partition (data instantly lost from memory)
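Sizing the tmpfs mount to the secrets it will hold can be sketched as follows. This is an approximation under stated assumptions: a 4096-byte page size, one or more whole pages per non-empty file, and the standard Linux tmpfs options `size=` and `mode=`. The function names are hypothetical, and how DSO passes these options to the Docker Engine is not specified in this guide.

```go
package main

import (
	"fmt"
	"strings"
)

// pageSize is an assumed page size; tmpfs accounts for file data in
// whole pages.
const pageSize = 4096

// tmpfsSize returns the number of bytes needed to hold all secrets, with
// each file rounded up to a whole number of pages. An empty set still
// reserves one page so the mount is usable.
func tmpfsSize(secrets map[string][]byte) int {
	total := 0
	for _, data := range secrets {
		pages := (len(data) + pageSize - 1) / pageSize
		total += pages * pageSize
	}
	if total == 0 {
		total = pageSize
	}
	return total
}

// tmpfsOptions renders a comma-separated mount option string that makes
// the mount non-executable and readable only by its owner.
func tmpfsOptions(sizeBytes int) string {
	opts := []string{
		"rw",
		"noexec",
		"nosuid",
		"nodev",
		fmt.Sprintf("size=%d", sizeBytes),
		"mode=0700",
	}
	return strings.Join(opts, ",")
}
```

For a 6-byte password and a 5000-byte certificate, the sketch reserves one page plus two pages, which is 12288 bytes.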
6. Secret Rotation Lifecycle & Event Watcher Pipeline
DSO features a real-time event-driven rotation pipeline that monitors both Docker container lifecycles and Cloud Provider secret update triggers.
graph TD
%% Styling
classDef watch fill:#e8f5e9,stroke:#4caf50,stroke-width:2px,color:#000000;
classDef queue fill:#e0f7fa,stroke:#00bcd4,stroke-width:2px,color:#000000;
classDef exec fill:#fff3e0,stroke:#ff9800,stroke-width:2px,color:#000000;
Webhook[Cloud Provider Webhook / EventGrid]:::watch
Poller[Configured Provider Poller]:::watch
Watcher[DSO Engine Watcher]:::watch
Queue[FIFO Event Queue]:::queue
Debouncer[Debounce Logic: 5-second window]:::queue
Orchestrator[Rotation Engine Orchestrator]:::exec
Swap[Execute Blue-Green Swap]:::exec
%% Flows
Webhook -->|Webhook Event| Queue
Poller -->|Polling Change Detected| Queue
Watcher -->|Docker Engine Event| Queue
Queue --> Debouncer
Debouncer -->|Coalesce High-Frequency Events| Orchestrator
Orchestrator -->|Trigger Active Pipeline| Swap
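The debounce stage in the pipeline above can be sketched with a goroutine and a timer. The sketch assumes that events are identified by service name and that the window starts at the first event of a burst; the guide does not say whether the window restarts on later events, so that choice is an assumption. The function name `debounce` is hypothetical.

```go
package main

import "time"

// debounce reads service names from in and emits one batch of distinct
// names per burst. A burst begins with the first event after a quiet
// period and ends when window has elapsed. The output channel is closed
// after in is closed and any pending batch has been delivered.
func debounce(in <-chan string, window time.Duration) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		var (
			pending []string
			seen    = make(map[string]bool)
			timer   <-chan time.Time // nil (blocks forever) until a burst starts
		)
		flush := func() {
			if len(pending) > 0 {
				out <- pending
			}
			pending, seen, timer = nil, make(map[string]bool), nil
		}
		for {
			select {
			case svc, ok := <-in:
				if !ok {
					flush()
					return
				}
				if !seen[svc] {
					seen[svc] = true
					pending = append(pending, svc)
				}
				if timer == nil {
					timer = time.After(window)
				}
			case <-timer:
				flush()
			}
		}
	}()
	return out
}
```

With a 5-second window, three triggers for the same service within a burst produce a single rotation. Because `flush` blocks until the batch is received, the consumer should read `out` from its own goroutine so that producers writing to `in` are not stalled.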
7. Blue-Green Container Replacement
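When a secret rotation is triggered, DSO performs a blue-green container swap: it starts a replacement container with the updated secret, verifies its health, and only then retires the old container, so the service stays available throughout.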
sequenceDiagram
autonumber
participant Agent as dso-agent
participant Docker as Docker Daemon
participant Blue as Container [Blue] (Running V1)
participant Green as Container [Green] (Spawning V2)
participant Proxy as Load Balancer / Host Port
Note over Agent: Secret rotation triggered for target service
Agent->>Docker: Spawn Container [Green] with updated secrets
Docker->>Green: Launch in background
loop Health Checks
Agent->>Green: Execute HTTP / TCP health probes
end
alt Container [Green] Healthy
Note over Agent: Atomic Swap Phase
Agent->>Proxy: Redirect incoming port traffic to [Green]
Agent->>Docker: Rename [Blue] -> [Blue-Old]
Agent->>Docker: Rename [Green] -> [Blue] (Take over active name)
Agent->>Docker: Stop [Blue-Old]
Note over Agent: Rotation completed successfully
else Container [Green] Unhealthy
Note over Agent: Failure Detected! Trigger Rollback
Agent->>Docker: Stop and destroy [Green]
Agent->>Agent: Increment failure metrics & Alert
end
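The sequence above can be expressed as a short orchestration function over an abstract container runtime. Everything named here is hypothetical: the `Runtime` interface, the `-green` and `-old` name suffixes, and the `fakeRuntime` used to show the call order. The sketch also returns on the first failed swap step, where a production implementation would need compensating actions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrGreenUnhealthy is returned when the replacement fails its probes.
var ErrGreenUnhealthy = errors.New("green container failed health checks")

// Runtime abstracts the container operations used during a swap.
type Runtime interface {
	Spawn(service, name string) error
	Healthy(name string) bool
	Redirect(service, name string) error
	Rename(from, to string) error
	Stop(name string) error
	Remove(name string) error
}

// rotate performs one blue-green swap for service.
func rotate(rt Runtime, service string) error {
	blue, green, old := service, service+"-green", service+"-old"
	if err := rt.Spawn(service, green); err != nil {
		return fmt.Errorf("spawn %s: %w", green, err)
	}
	if !rt.Healthy(green) {
		// Rollback: blue was never touched, so only green is cleaned up.
		if err := rt.Stop(green); err != nil {
			return fmt.Errorf("rollback stop %s: %w", green, err)
		}
		if err := rt.Remove(green); err != nil {
			return fmt.Errorf("rollback remove %s: %w", green, err)
		}
		return ErrGreenUnhealthy
	}
	steps := []func() error{
		func() error { return rt.Redirect(service, green) },
		func() error { return rt.Rename(blue, old) },
		func() error { return rt.Rename(green, blue) },
		func() error { return rt.Stop(old) },
	}
	for i, step := range steps {
		if err := step(); err != nil {
			return fmt.Errorf("swap step %d: %w", i+1, err)
		}
	}
	return nil
}

// fakeRuntime records each call so the order of operations is visible.
type fakeRuntime struct {
	healthy bool
	log     string
}

func (f *fakeRuntime) record(s string) error {
	f.log += s + ";"
	return nil
}

func (f *fakeRuntime) Spawn(_, name string) error    { return f.record("spawn " + name) }
func (f *fakeRuntime) Redirect(_, name string) error { return f.record("redirect " + name) }
func (f *fakeRuntime) Rename(from, to string) error  { return f.record("rename " + from + "->" + to) }
func (f *fakeRuntime) Stop(name string) error        { return f.record("stop " + name) }
func (f *fakeRuntime) Remove(name string) error      { return f.record("remove " + name) }

func (f *fakeRuntime) Healthy(name string) bool {
	f.record("probe " + name)
	return f.healthy
}
```

Calling `rotate(&fakeRuntime{healthy: false}, "web")` returns `ErrGreenUnhealthy` and records only the spawn, probe, stop, and remove calls for `web-green`, which matches the rollback branch of the diagram.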
8. Rollback and Recovery Workflows
If a newly spawned container fails its operational health checks during rotation, the transactional state manager performs an automatic, deterministic rollback to keep production services running.
graph TD
%% Styling
classDef step fill:#fafafa,stroke:#616161,stroke-width:2px,color:#000000;
classDef err fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000000;
classDef success fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#000000;
Start[Start Secret Rotation]:::step
Spawn[Spawn Green Container]:::step
Probe{Execute Health Probes}:::step
Swap[Atomic Swap Names]:::success
StopBlue[Stop Blue Container]:::success
Rollback[Stop & Remove Green]:::err
Restore[Restore Blue Port Mappings]:::err
Alert[Trigger Systemd Failure Alert]:::err
%% Path
Start --> Spawn
Spawn --> Probe
Probe -->|Probes Pass| Swap
Swap --> StopBlue
Probe -->|Probes Fail / Timeout| Rollback
Rollback --> Restore
Restore --> Alert
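The probe decision in the flowchart ("Probes Pass" versus "Probes Fail / Timeout") can be sketched as a bounded retry loop. The names `waitHealthy`, `tcpProbe`, and `ErrNotReady` are hypothetical, and the attempt count and interval are parameters here because the guide does not state DSO's actual values.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// ErrNotReady is a sentinel a probe can return while a service starts up.
var ErrNotReady = errors.New("service not ready")

// tcpProbe returns a probe that succeeds when a TCP connection to addr
// can be opened within timeout.
func tcpProbe(addr string, timeout time.Duration) func() error {
	return func() error {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return err
		}
		return conn.Close()
	}
}

// waitHealthy runs probe up to attempts times, sleeping interval between
// tries. It returns nil on the first success; otherwise it returns an
// error wrapping the last probe failure, which is the rollback signal.
func waitHealthy(probe func() error, attempts int, interval time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if lastErr = probe(); lastErr == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("unhealthy after %d attempts: %w", attempts, lastErr)
}
```

A caller might combine the two as `waitHealthy(tcpProbe("127.0.0.1:8080", time.Second), 10, 2*time.Second)`, where the address and timings are placeholders, and trigger the rollback path when the result is non-nil.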
9. SRE Operational Metrics & Health Signals
To monitor DSO health in production environments, track the following operational parameters using Prometheus or docker dso status:
| Metric Name | Type | Description | Alerting Threshold |
|---|---|---|---|
| `dso_rotation_success_total` | Counter | Total successful secret rotations | N/A (Diagnostic) |
| `dso_rotation_failures_total` | Counter | Total failed rotations (triggered rollback) | > 0 (Warning) |
| `dso_provider_api_latency_seconds` | Histogram | Latency to cloud secret provider APIs | > 2.5s (Degraded Performance) |
| `dso_active_managed_containers` | Gauge | Total containers currently managed by DSO | N/A (Capacity Planning) |
| `dso_cache_hit_ratio` | Gauge | Ratio of cache hits vs total secret requests | < 0.85 (API rate limit risk) |
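As an illustration of how the cache hit ratio and its 0.85 threshold relate, the following sketch keeps two counters and derives the ratio from them. The `CacheStats` type and its methods are hypothetical and do not reflect how DSO exports `dso_cache_hit_ratio`; treating an idle cache as a ratio of 1 is also an assumption.

```go
package main

import "sync/atomic"

// hitRatioAlertThreshold is the alerting threshold from the table above.
const hitRatioAlertThreshold = 0.85

// CacheStats counts cache lookups; it is safe for concurrent use.
type CacheStats struct {
	hits   uint64
	misses uint64
}

func (s *CacheStats) RecordHit()  { atomic.AddUint64(&s.hits, 1) }
func (s *CacheStats) RecordMiss() { atomic.AddUint64(&s.misses, 1) }

// HitRatio returns hits divided by total lookups. With no lookups yet it
// returns 1, so an idle cache does not raise an alert.
func (s *CacheStats) HitRatio() float64 {
	h := atomic.LoadUint64(&s.hits)
	m := atomic.LoadUint64(&s.misses)
	if h+m == 0 {
		return 1
	}
	return float64(h) / float64(h+m)
}

// BelowThreshold reports whether the ratio has dropped under the
// alerting threshold, which signals a risk of provider rate limiting.
func (s *CacheStats) BelowThreshold() bool {
	return s.HitRatio() < hitRatioAlertThreshold
}
```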