Pillar 04 · Safe Space
Bound the
blast radius.
Blast-radius containment, so "going wrong" has bounded cost.
Pillar at a glance
Criteria 10
Realistic target 2.0
Current maturity
Recipes available 12
§ criteria
4.1
Environment isolation
staging/production separation with parity-checked isolation AND on-demand production-mirrored replica for load testing
current · target
→
4.2
IAM scoped read-only by default
DB, Kubernetes, AWS
current · target
→
4.3
Branch protection and source-control write scoping
protected branches are locked against direct push and direct merge by any actor, including agents. All changes to protected branches flow through a PR; agents have unrestricted write access to feature/task branches, but write access to protected branches is structurally impossible, not merely discouraged
current · target
→
4.4
PII masking at data-access and telemetry layers
e.g. `pg_columnmask` for DB; scrubbing / allowlists for logs, metrics, traces
current · target
→
4.5
Prompt injection defence at ingestion boundary
all external content entering *persistent* agent context passes through an ingestion sanitization layer before indexing. Scope is durable ingestion paths (memory writes, indexed knowledge, unsupervised scheduled ingestion); interactive turn context in user-supervised sessions is out of scope — blast radius there is contained by Pillar 4 substrate (`PL4-least-privilege`, `PL4-branch-protection`). The layer strips, escapes, or sandboxes instruction-shaped text. The same policy is applied consistently across every ingestion surface — `PL1-real-world-feedback` (real-world feedback loop), `PL5-signal-driven-tasks` (signal-driven task generation), `PL4-memory-safety` (memory write-path)
current · target
→
4.6
Egress capability scoping at emission boundary
all outbound communications from unsupervised agent paths (chat posts, webhook calls, email sends, HTTP requests, image-rendering URLs, link-preview fetches) pass through an egress gate before leaving the trust boundary. Scope is application-layer egress from automated / scheduled / unattended agent action; interactive responses in user-supervised sessions are out of scope — symmetric with `PL4-prompt-injection-defence`'s ingestion-scope narrowing. IAM-level resource writes are covered separately by `PL4-least-privilege`. Gate enforces destination allowlists per channel, rate limits per destination, elevation gates on novel destinations. Content-based output scanning is defence-in-depth, not primary
current · target
→
4.7
Canary / blue-green / partial release
percentage rollouts with metric-driven promotion, with the agent structurally bounded by platform constraints so it cannot bypass rollout stages or exceed policy-defined parameters
current · target
→
4.8
Rollback is trivial and agent-invokable
current · target
→
4.9
Operating cost is observable, capped, and attributed
agent inference, CI minutes, log retention, canary spin-up costs are tracked per project / per agent run
current · target
→
4.10
Memory safety
hygiene (staleness, contradiction, decay), access control (PII safety, tenant scoping), write-path validation (adversarial-write protection), and retention discipline over the memory substrate (`PL3-memory-substrate`)
current · target
→