Safe Agentic · A working canon · v0.27
Pillar 03 · Actions

Act in the real world
with intent.

The agent’s ability to act in the real world.

Pillar at a glance
Criteria 10
Realistic target 2.0
Current maturity
Recipes available 4
§ criteria
3.1
Structured state read access
agent has read access to the project's structured state stores: application database, infrastructure-as-code state (Terraform/Pulumi/Cloudflare config), and equivalent backends. This criterion scores *capability and usability* only; IAM scoping and PII masking are scored under `PL4-least-privilege` and `PL4-pii-masking` respectively
current · target
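As an illustration of 3.1, the sketch below gives an agent read-only access to a structured store. It assumes a SQLite file stands in for the application database; the `orders` table and the helper names are hypothetical and not part of the canon. The point it shows is that the connection can query state but cannot write to it.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path):
    """Create a tiny stand-in application database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO orders (status) VALUES ('paid'), ('failed')")
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the database read-only using SQLite's URI syntax (mode=ro)."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def count_by_status(conn):
    """A typical investigative query an agent might run."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM orders GROUP BY status ORDER BY status"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "state.db")
    make_demo_db(db_path)
    ro = open_read_only(db_path)
    print(count_by_status(ro))
    try:
        ro.execute("INSERT INTO orders (status) VALUES ('tampered')")
    except sqlite3.OperationalError as exc:
        # Writes fail because the connection was opened read-only.
        print("write rejected:", exc)
    ro.close()
```

A production setup would more likely rely on a read replica or a read-only database role; the SQLite flag is only the smallest self-contained way to show the same property.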
3.2
Emission quality
code produces structured, correlated, breadcrumb-style signal. Silent handlers are a bug, not a feature. Correlation identifiers reference production entities via pseudonymous tokens (user-ID, session-ID, request-ID), not PII-derived values (email, phone, name) — logs are an engineering surface, and `PL4-pii-masking` applies
current · target
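A minimal sketch of what 3.2 could look like in code, using only the Python standard library. The event names, the `step` field, and the `charge` function are hypothetical; the sketch assumes one JSON object per log line and opaque IDs as correlation fields. It also shows an exception handler that logs the failure instead of swallowing it.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so an agent can parse it."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            # Correlation fields are opaque IDs, never email, phone, or name.
            "request_id": getattr(record, "request_id", None),
            "user_id": getattr(record, "user_id", None),
            "step": getattr(record, "step", None),
        }
        if record.exc_info:
            payload["error"] = repr(record.exc_info[1])
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Attach the JSON formatter to a dedicated logger writing to `stream`."""
    logger = logging.getLogger("emission_sketch")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def charge(logger, ctx, amount_cents):
    """Hypothetical domain call that leaves a breadcrumb at each step."""
    logger.info("charge.start", extra={**ctx, "step": "validate"})
    try:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        logger.info("charge.ok", extra={**ctx, "step": "done"})
        return True
    except ValueError:
        # Not silent: the failure carries the same correlation IDs.
        logger.exception("charge.failed", extra={**ctx, "step": "validate"})
        return False


if __name__ == "__main__":
    buf = io.StringIO()
    log = build_logger(buf)
    charge(log, {"request_id": "req-1", "user_id": "u-42"}, 0)
    print(buf.getvalue())
```

Because every line shares `request_id`, an agent can reconstruct the path of a single request by filtering on that one field.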
3.3
Agent queryability
agent can investigate via MCP / API across logs, metrics, traces — not just humans via dashboards
current · target
3.4
Memory substrate exists
agent has a unified memory tool (markdown + vector DB + structured DB + event log, MCP-exposed) covering decisions, postmortems, customer-context references (pseudonymous; raw PII does not enter the memory substrate), and performance baselines
current · target
3.5
Source control interaction
agent works the git platform end-to-end: branches, opens PRs, resolves merge conflicts, tags releases, *and* reads PR history, review comments, commit metadata
current · target
3.6
Domain-specific action skills
skills that let the agent complete real, end-to-end domain flows (e.g. virtual EV chargers and disposable payment cards for charging workflows; test tenants for multi-tenant SaaS; sandbox accounts for integrations). Examples are per-project; what matters is that critical domain flows don't require a human in the loop
current · target
3.7
Deployment and CI/CD interaction
agent triggers CI/deploys end-to-end (Fastlane, TestFlight, staging) *and* reads results: test outcomes, build logs, historical run data, coverage and mutation reports
current · target
3.8
Browser / web interaction
agent can interact with web UIs: navigating dashboards, filling forms, verifying deployed staging visually. Browser actions are **deterministic** (reproducible across runs), **inspectable** (humans can read what the agent will do before it runs), and **version-controlled** (automation artefacts live with code)
current · target
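One way to read the three properties in 3.8 is to keep browser automation as a declarative plan rather than ad hoc clicks. The sketch below is an assumption-laden illustration: the `Step` schema, the action names, and the staging URL are hypothetical, and no real browser driver is invoked. It shows a plan that can be rendered for human review before it runs and fingerprinted so that two runs can be confirmed to use the same steps.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Step:
    action: str      # hypothetical verbs: "goto", "fill", "click", "expect_text"
    target: str      # URL or CSS selector
    value: str = ""  # text to type or to expect


# The plan is plain data, so it can live in the repository next to the code.
PLAN = [
    Step("goto", "https://staging.example.test/login"),
    Step("fill", "#username", "test-agent"),
    Step("click", "button[type=submit]"),
    Step("expect_text", "h1", "Dashboard"),
]


def describe(plan):
    """Inspectable: render the plan as numbered, human-readable lines."""
    lines = []
    for number, step in enumerate(plan, start=1):
        detail = f" = {step.value!r}" if step.value else ""
        lines.append(f"{number}. {step.action} {step.target}{detail}")
    return "\n".join(lines)


def fingerprint(plan):
    """Deterministic: identical plans always produce the same SHA-256 digest."""
    canonical = json.dumps([asdict(step) for step in plan], sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(describe(PLAN))
    print(fingerprint(PLAN))
```

An executor built on a real browser driver would consume the same `Step` list; keeping the plan separate from the executor is what lets a reviewer read it first.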
3.9
Communication actions
agent can notify and present results via external channels: Slack messages, Linear comments, email summaries, docs-portal publishing. Every outbound message passes through a structural safety layer
current · target
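The canon does not specify what the safety layer in 3.9 checks, so the following is only a sketch under stated assumptions: a channel allowlist plus a crude email-pattern check, with every send routed through a single gate. The channel names, the `Decision` type, and the `transport` callable are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical allowlist; a real project would load this from reviewed config.
ALLOWED_CHANNELS = {"slack:#agent-reports", "linear:comments"}

# Deliberately simple pattern for email-like strings; not a full PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_outbound(channel, body):
    """Decide whether a message may leave, without sending anything."""
    if channel not in ALLOWED_CHANNELS:
        return Decision(False, f"channel not allowlisted: {channel}")
    if EMAIL_RE.search(body):
        return Decision(False, "body contains an email-like string")
    return Decision(True, "ok")


def send(channel, body, transport):
    """The only send path: the transport is called only after the check passes."""
    decision = check_outbound(channel, body)
    if decision.allowed:
        transport(channel, body)
    return decision


if __name__ == "__main__":
    outbox = []
    deliver = lambda channel, body: outbox.append((channel, body))
    print(send("slack:#agent-reports", "Deploy 41 passed smoke tests", deliver))
    print(send("slack:#general", "Deploy 41 passed smoke tests", deliver))
    print(send("linear:comments", "Contact jane@example.com", deliver))
    print(outbox)
```

The layer is structural in the sense that the agent has no other function that reaches the transport, so the check cannot be skipped by prompt or by accident.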
3.10
Skill library health
beyond individual skills (3.1–3.9), the project's skill library *as a whole* is well-curated: inventoried, documented, tested, versioned, with coverage mapped against the domain surface. Distinct from 5.10, which scores portfolio-level reuse; this criterion scores *this project's* skill infrastructure
current · target
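To make "coverage mapped against the domain surface" in 3.10 concrete, here is a small sketch that treats the skill inventory as data and reports which domain flows have no healthy skill behind them. The `Skill` fields, the flow names, and the rule that only documented and tested skills count toward coverage are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    version: str
    documented: bool
    tested: bool
    covers: frozenset  # names of the domain flows this skill can complete


def coverage_report(skills, domain_surface):
    """Map healthy skills onto the domain surface and list the gaps."""
    healthy = [s for s in skills if s.documented and s.tested]
    covered = set().union(*(s.covers for s in healthy)) & set(domain_surface)
    gaps = set(domain_surface) - covered
    return {
        "covered": sorted(covered),
        "gaps": sorted(gaps),
        "unhealthy": sorted(s.name for s in skills if s not in healthy),
        "ratio": len(covered) / len(domain_surface) if domain_surface else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical inventory and domain flows.
    inventory = [
        Skill("virtual-charger", "1.2.0", True, True, frozenset({"start_charge"})),
        Skill("test-tenant", "0.4.1", True, False, frozenset({"onboard_tenant"})),
    ]
    surface = ["start_charge", "onboard_tenant", "refund"]
    print(coverage_report(inventory, surface))
```

Running a report like this in CI would turn library health into a number that can be tracked over time rather than a one-off audit.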