Agent runtime infrastructure for regulated production. Five pillars. Open by design.
cogward provides the governed runtime as a single deployable unit. The execution engine is part of the plane, and frameworks keep their native development model. The plane enforces the five pillars required for regulated production, and it is customer-owned, not managed by us.
See the architecture page for the technical design decisions: runtime/engine encapsulation model, adapter SDK contract, MCP gateway enforcement, and deployment topology. This page covers what the platform does, capability by capability — organised by the five pillars.
Staging
Availability model
cogward is staged intentionally.
Phase 1 — Estate Foundation (design partners) Registry, workload identity, governed tool access, sandboxed execution, lifecycle authority, resource controls, audit events, events journal, REST API, and Tier A deployment. Deployable today with design partners.
Phase 2 — Production Governance Approval workflows, policy hierarchy, production promotion gates, compliance reporting, and GRC/SIEM depth.
Phase 3 — Intelligence & Evidence Maturity Goal tracking, production evaluation depth, memory governance, lineage, policy sandbox, Tier B/C operational maturity, and estate intelligence at full maturity.
cogward governs access to tools, data, memory, knowledge, and skills. It does not replace those systems.
One authorisation point for every tool call.
The MCP gateway is the permitted path for registered tools. Per-agent and per-tenant permission scope is enforced on every invocation. The gateway is paired with sandbox egress control, because it is only meaningful when bypass paths are closed.
Phase 1 runtime control operates at the infrastructure layer: the MCP gateway enforces declared tool permissions, the sandbox enforces egress boundaries, and the four control planes hold simultaneously regardless of agent behaviour. An agent cannot call a tool it is not declared to use, cannot reach an undeclared network endpoint, and cannot escape the sandbox.
Phase 2 adds pre-execution policy evaluation: per-call decisions that permit, block, or escalate a tool call before it executes, evaluated against a configurable policy hierarchy. This enables nuanced runtime governance — blocking specific actions based on runtime context, escalating high-risk calls for human approval, and enforcing policy rules that go beyond declared permission scope.
Phase 1 — design partners
- MCP gateway enforcement
- per-agent tool scope
- per-tenant tool scope
- egress-control coupling
- permit / block decisions
- tool-call audit events
Next
- escalate decisions and human approval queues
- HITL workflows into ServiceNow, Jira, and PagerDuty
- policy versioning with approval history
- compliance policy templates
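The permit, block, and escalate outcomes described in this section can be illustrated with a minimal sketch. It is not cogward's API: `ToolScope`, `authorize_tool_call`, and the `high_risk` set are hypothetical names, and treating "high risk" as a static set is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PERMIT = "permit"
    BLOCK = "block"
    ESCALATE = "escalate"  # described in the text as a Phase 2 outcome


@dataclass(frozen=True)
class ToolScope:
    """Hypothetical declared scope; all field names are illustrative."""
    agent_tools: frozenset      # tools declared for the agent
    tenant_tools: frozenset     # tools enabled for the tenant
    high_risk: frozenset = frozenset()  # assumed input to Phase 2 policy


def authorize_tool_call(scope: ToolScope, tool: str) -> Decision:
    # Deny by default: the tool must be in both the agent and tenant scope.
    if tool not in scope.agent_tools or tool not in scope.tenant_tools:
        return Decision.BLOCK
    # Assumed Phase 2 rule: declared but high-risk tools go to a human.
    if tool in scope.high_risk:
        return Decision.ESCALATE
    return Decision.PERMIT
```

A check like this only matters if it sits on the sole path to the tool, which is why the text pairs the gateway with sandbox egress control.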
Process and network isolation per agent run.
Graduated isolation profiles from high-isolation (gVisor / Firecracker) to standard production. Profile-per-agent-class defined in the manifest. Bypass-resistant by design: network egress, agent identity, filesystem access, and tool API egress are all mediated simultaneously. All four must hold — an agent that can make a direct HTTP call or mount credentials from the environment has escaped the boundary.
Isolation violations are recorded as security events in the evidence chain. Memory and session access is governed at the retrieval API level: the agent cannot see or bypass the tenant filter through prompt construction.
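One way to read "governed at the retrieval API level" is that the tenant filter comes from the authenticated caller context and never from agent-supplied text. The sketch below assumes that reading; `CallerContext`, `retrieve_memory`, and the record layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallerContext:
    """Assumed to be derived from a verified token, never from the prompt."""
    tenant_id: str
    agent_id: str


def retrieve_memory(store, ctx: CallerContext, query: str):
    """Return matching texts from `store`, a list of dicts with
    'tenant_id' and 'text' keys (an illustrative in-memory layout)."""
    needle = query.lower()
    return [
        record["text"]
        for record in store
        # The tenant filter is applied unconditionally; nothing in `query`
        # can widen it, because `query` is only ever used as search text.
        if record["tenant_id"] == ctx.tenant_id
        and needle in record["text"].lower()
    ]
```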
Model calls are governed production dependencies.
The same enforcement model that governs tool calls governs model calls. Which model providers are permitted per tenant. Which models are permitted per deployment tier. Whether prompts can be sent to external model APIs — a data residency decision enforced at the boundary, not a product preference. Whether model versions are pinned across tenants or per-tenant. Model calls are governed production dependencies with the same audit trail as tool calls.
Phase 1 — design partners
- Provider allowlist per tenant and per deployment tier
- Data residency enforcement — prompts do not leave the permitted boundary
- Model version pinning — per-agent or per-tenant
- Model call audit trail — provider, model, token count, latency, cost, policy decision
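The model-call controls listed above (provider allowlist, residency boundary, version pinning) can be sketched as a single pre-call check. This is an illustration under assumptions: `ModelPolicy`, `check_model_call`, and the use of a region label to stand in for the residency boundary are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelPolicy:
    """Hypothetical per-tenant model policy; field names are assumptions."""
    allowed_providers: frozenset
    allowed_regions: frozenset                            # where prompts may go
    pinned_versions: dict = field(default_factory=dict)   # model -> version


def check_model_call(policy, provider, region, model, version):
    """Return (permitted, reason) for one model call, before it is sent."""
    if provider not in policy.allowed_providers:
        return False, "provider not on tenant allowlist"
    if region not in policy.allowed_regions:
        return False, "prompt would leave the permitted boundary"
    pinned = policy.pinned_versions.get(model)
    if pinned is not None and pinned != version:
        return False, f"model is pinned to version {pinned}"
    return True, "permitted"
```

The returned reason is the kind of value that could populate the "policy decision" field of the model-call audit trail.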
You cannot govern what you cannot enumerate.
Every agent in the estate is registered with a manifest: identity, owner, permissions, tool access, sandbox profile, resource budget, lifecycle state, and evidence requirements. One canonical, queryable record across every framework and team. No agent executes without being registered. No agent can claim permissions it was not registered with.
Registration is the gate — not a development workflow. Governed onboarding: package, validate, run pre-registration policy check, promote. Violations are blocked at registration, not discovered in production incidents.
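Registration as a gate can be sketched as follows. The required fields mirror the manifest contents listed above; the key names, `Registry`, and `RegistrationError` are hypothetical, and a real pre-registration policy check would do far more than test for missing fields.

```python
REQUIRED_FIELDS = (
    "identity", "owner", "permissions", "tool_access",
    "sandbox_profile", "resource_budget", "lifecycle_state",
    "evidence_requirements",
)


class RegistrationError(Exception):
    """Raised when a manifest is rejected at registration time."""


class Registry:
    def __init__(self):
        self._agents = {}

    def register(self, manifest: dict) -> str:
        missing = [f for f in REQUIRED_FIELDS if f not in manifest]
        if missing:
            raise RegistrationError(f"manifest missing fields: {missing}")
        agent_id = manifest["identity"]
        if agent_id in self._agents:
            raise RegistrationError(f"agent already registered: {agent_id}")
        self._agents[agent_id] = dict(manifest)  # store a copy
        return agent_id

    def can_execute(self, agent_id: str) -> bool:
        # No agent executes without being registered.
        return agent_id in self._agents

    def has_permission(self, agent_id: str, permission: str) -> bool:
        # Only permissions recorded at registration can be claimed.
        manifest = self._agents.get(agent_id)
        return manifest is not None and permission in manifest["permissions"]
```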
Every agent authenticates as itself. Every action traces back to a user and an agent.
cogward treats agents as first-class identity principals — not users, not service accounts, not borrowed credentials. Built on OAuth 2.1. Every action is attributable to a specific non-human identity.
This produces audit records that satisfy SOC 2 CC6.1 attribution and DORA incident-traceability requirements. There is no new identity silo: cogward issues standard OAuth 2.1 tokens that your existing API gateways and resource servers already validate.
Registration. Every agent gets a unique machine identity (client_id), a human sponsor, a declared scope ceiling, and an allowed-audience list.
Delegation, not impersonation. OAuth token exchange (RFC 8693). The resulting token contains both the user (sub) and the agent (act claim). No token ever represents the user alone when an agent is acting.
Audience-bound, short-lived tokens. Every token is scoped to a single resource server via RFC 8707. Tokens expire in minutes. Confused-deputy attacks are structurally prevented.
Scope attenuation on every hop. The agent's registered scope ceiling is the hard cap. Sensitive scopes (payments.*, pii.export) trigger out-of-band approval before token issuance.
Autonomous agents get their own authority class. An autonomy_class tag drives policy evaluation and audit routing. A user-delegated agent cannot fall back to autonomous authority.
Delegation chains (Phase 2). When one agent delegates a subtask to another, the delegation is a first-class identity event. Both the delegating agent and the sub-agent carry their own machine identities in the audit record. The sub-agent's permission boundary is constrained to the scope of the delegating agent, so it cannot acquire permissions the delegating agent does not hold. Delegation chains are preserved and attributable across any depth of multi-agent workflow.
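Scope attenuation and the delegation constraint above both reduce to set intersection. The sketch below is a minimal illustration: `attenuate_scopes` and `needs_oob_approval` are hypothetical names, and glob matching for sensitive scopes is an assumption based on the `payments.*` example in the text.

```python
from fnmatch import fnmatchcase

# Sensitive scope patterns taken from the examples in the text.
SENSITIVE_PATTERNS = ("payments.*", "pii.export")


def attenuate_scopes(requested, ceiling, delegator=None):
    """Grant only scopes that are requested, under the agent's registered
    ceiling, and (for a sub-agent) also held by the delegating agent."""
    granted = set(requested) & set(ceiling)
    if delegator is not None:
        granted &= set(delegator)
    return granted


def needs_oob_approval(scopes) -> bool:
    """True if any scope matches a sensitive pattern."""
    return any(
        fnmatchcase(scope, pattern)
        for scope in scopes
        for pattern in SENSITIVE_PATTERNS
    )
```

Because each hop can only intersect, a token further down a delegation chain never carries more scope than the one before it.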
sub — whose authority (the user, or the agent itself for autonomous flows)
act — which agent (the agent's registered identity)
aud + scope — what it was allowed to do (resource + permissions)
Cryptographic binding to the specific agent instance
Out-of-band approval ID, approver identity, and timestamp for sensitive actions
Your identity provider: authenticates users · manages directory · enforces MFA · issues initial user token.
cogward: registers agents as OAuth clients · enforces scope ceiling and audience allowlist · triggers OOB approval · writes identity into evidence chain.
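The delegation flow can be sketched with the parameters RFC 8693 defines for token exchange and the `resource` parameter from RFC 8707. The request fields and the nested `act` claim follow those RFCs; the function names, the five-minute lifetime, and the choice of access tokens as the token type are assumptions. Cryptographic instance binding and approval metadata are omitted.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_form(user_token, agent_token, resource, scope):
    """Build the form body of an RFC 8693 token-exchange request."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE,
        "subject_token": user_token,        # whose authority is delegated
        "subject_token_type": ACCESS_TOKEN,
        "actor_token": agent_token,         # which agent is acting
        "actor_token_type": ACCESS_TOKEN,
        "resource": resource,               # RFC 8707 resource indicator
        "scope": scope,
    })


def delegated_claims(user_id, agent_client_id, audience, scope, now,
                     ttl_seconds=300):
    """Illustrative claims of the resulting token (lifetime is assumed)."""
    return {
        "sub": user_id,                     # the user
        "act": {"sub": agent_client_id},    # the agent, per RFC 8693
        "aud": audience,                    # a single resource server
        "scope": scope,
        "iat": now,
        "exp": now + ttl_seconds,
    }
```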
Stopping a runaway agent should not mean restarting everything.
Full lifecycle state machine: registered → staging → active → suspended → draining → cancelled → decommissioned. Every transition is operator-initiated, reason-recorded, and evidence-linked. Suspend a run. Cancel an agent. Kill a version. Drain a tenant. At whatever scope the incident requires — without touching what runs alongside it.
Lifecycle authority lives exclusively with the runtime. No framework, no capability component, no adapter can take it back. This is what makes the kill switch real rather than advisory.
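A minimal sketch of such a state machine follows. The state names come from the text; the transition table is an assumption, because the text gives the order of states but not which jumps are legal. `AgentLifecycle` and its fields are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed transition table; only the state names are taken from the text.
ALLOWED = {
    "registered": {"staging"},
    "staging": {"active"},
    "active": {"suspended", "draining"},
    "suspended": {"active", "draining", "cancelled"},
    "draining": {"cancelled"},
    "cancelled": {"decommissioned"},
    "decommissioned": set(),
}


@dataclass
class AgentLifecycle:
    state: str = "registered"
    history: list = field(default_factory=list)

    def transition(self, target: str, operator: str, reason: str) -> None:
        # Every transition is operator-initiated and reason-recorded.
        if not operator or not reason:
            raise ValueError("operator and reason are required")
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.history.append((self.state, target, operator, reason))
        self.state = target
```

Keeping the table and the history in the runtime, rather than in the framework or adapter, is what lets a suspend or cancel take effect regardless of what the agent code does.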
Version rollback, graceful cancel, and drain are all durable execution primitives, not application-level state machines. Agent-to-agent governance (later): PII propagation boundaries, sub-agent permission inheritance (deny by default), and delegation chain attribution.
If you require a Tier B or Tier C deployment, talk to us before Phase 2 ships. We are actively seeking design partners whose on-premise and air-gapped security requirements will harden the Tier B and Tier C deployment models, and design partners have direct input into the deployment architecture. Book a design-partner briefing.
Version governance for the full agent package.
A prompt change can alter agent behaviour as much as a code change. A model version change can improve one task and regress another. Version governance applies to the full agent package — code, prompts, tool definitions, model choice, model parameters, and policy configuration — not just source code.
Phase 1 — design partners
- Agent versioning — version-controlled agent packages; every version has a declared composition: code, prompts, tools, model, policy
- Staged promotion — development → staging → production; every staging-to-production promotion is a documented decision with evaluation results recorded in the audit log
- Per-tenant enablement — specific agent versions enabled or disabled per tenant without affecting other tenants
- Rollback — revert to a previous safe version at agent or version scope; emergency rollback available during an incident without a new deployment cycle
Phase 2
- Evaluation gate — pre-production behaviour validation: tool call conformance, frequency bounds, scope violation detection; policy simulation against historical production traces
- Canary release — new versions rolled out to a subset of tenants before full rollout
- Policy versioning — exact policy in effect at any point in time is retrievable; auditors can reconstruct which policy version governed a specific run
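Versioning the full package implies that a change to any component, including a prompt, yields a new version. One way to sketch that is a content digest over the declared composition, plus a promotion step that refuses to reach production without evaluation results. The digest scheme, `package_digest`, and `promote` are assumptions, not cogward's mechanism.

```python
import hashlib
import json

STAGES = ("development", "staging", "production")


def package_digest(composition: dict) -> str:
    """Hash the declared composition (code, prompts, tools, model, policy).
    Canonical JSON makes the digest independent of key order."""
    canonical = json.dumps(composition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def promote(current: str, evaluation_results=None) -> dict:
    """Return a promotion record for the next stage."""
    index = STAGES.index(current)
    if index == len(STAGES) - 1:
        raise ValueError("already in production")
    target = STAGES[index + 1]
    # Staging-to-production promotion must carry evaluation results.
    if target == "production" and not evaluation_results:
        raise ValueError("evaluation results required for production")
    return {"from": current, "to": target,
            "evaluation_results": evaluation_results}
```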
Limits enforced before consumption — not discovered in the invoice.
Phase 1 — design partners
- per-agent, per-tenant, and per-run limits
- token and compute ceilings
- progressive throttling
- circuit breakers on tool-call error cascades
- cost attribution from registration metadata
Next
- cost-center dashboards
- chargeback export to SAP, Oracle, and internal finance tooling
- budget policy simulation before production promotion
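"Enforced before consumption" can be read as reserve-then-spend: a call is rejected if it would cross the ceiling, instead of being billed and flagged afterwards. The sketch below assumes that reading, along with a simple consecutive-error circuit breaker; `TokenBudget`, `CircuitBreaker`, and the default threshold are hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when a reservation would cross the ceiling."""


class TokenBudget:
    def __init__(self, ceiling: int):
        self.ceiling = ceiling
        self.used = 0

    def reserve(self, tokens: int) -> None:
        # Check before spending, so the limit is never crossed.
        if self.used + tokens > self.ceiling:
            raise BudgetExceeded(
                f"{tokens} tokens requested, "
                f"{self.ceiling - self.used} remaining")
        self.used += tokens


class CircuitBreaker:
    def __init__(self, max_consecutive_errors: int = 3):
        self.max_consecutive_errors = max_consecutive_errors
        self.errors = 0

    def record(self, ok: bool) -> None:
        # A success resets the count; failures accumulate.
        self.errors = 0 if ok else self.errors + 1

    @property
    def is_open(self) -> bool:
        # Open means further tool calls should be refused.
        return self.errors >= self.max_consecutive_errors
```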
Two components. Two audiences. Never combined.
The audit log is the compliance artifact. The events journal is the operational record. They are related, but never combined.
Phase 1 — design partners
- hash-chained audit events
- append-only evidence model
- privacy-preserving event structure
- run, tool, policy, resource, and lifecycle events
- OTel-compatible operational event stream
Next
- compliance reporting templates for SOC 2, DORA, and NIST AI RMF
- GRC export
- audit team query interface
Later
- snapshot-linked forensic replay
- data lineage from output back to knowledge and tool sources
- sector-specific evidence packs
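Hash chaining, the property behind the tamper-evident audit events listed above, can be sketched in a few lines: each entry's hash covers the previous entry's hash, so altering any earlier payload breaks every later link. SHA-256, canonical JSON, and the entry layout are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload,
             "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited payload or reordering fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Verification needs only the chain itself, which is consistent with the claim that evidence can be checked inside the customer environment without calling a vendor.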
Two artifacts. Two audiences. Same plane.
The events journal answers engineering questions. The audit ledger answers compliance questions. These are different artifacts for different audiences — produced by the same plane, serving different purposes.
The events journal is a real-time operational stream: every tool call, model call, policy decision, lifecycle event, token count, latency measurement, and failure reason. Structured. OTel-compatible. Exportable to any OTel-compatible backend.
Phase 1 — design partners
- OTel export — compatible with Datadog, Splunk, Grafana, and any OTel ingestion pipeline
- GenAI semantic conventions — agent execution data structured to emerging GenAI OTel conventions
- Run-level visibility — goal, model calls, tool calls, tokens, cost, latency, failure reason per execution
- Aggregate visibility — by tenant, agent, version, team, model, tool, cost, latency, status
- SIEM connectors — export to existing security information and event management tooling
The audit ledger is the compliance record — hash-chained, tamper-evident, independently verifiable. For regulators, auditors, and risk committees. The events journal is the operational stream — for engineers, SREs, and platform teams debugging behaviour, monitoring performance, and investigating failures. Neither is a substitute for the other.
Per-customer behavioural intelligence. No cross-tenant data sharing.
The plane is where identity, declared purpose, lifecycle context, policy record, and outcome data are recorded together. That combination produces a class of intelligence — drift detection, fleet patterns, goal achievement attribution — that no standalone monitoring tool, managed cloud runtime, or execution engine can replicate.
Phase 2
- Goal achievement tracking — declared goal, completion rate over time, per-agent and per-programme
- Cost & latency baselines
- Drift baselines for Phase 3 detection
Phase 3
- Behavioural drift detection — cost, latency, tool sequence, goal achievement, policy escalation rate
- Fleet pattern intelligence — underutilisation, overload, correlated degradation, escalation hot spots
- Air-gapped intelligence distribution — pre-trained behavioural model updates via the same channel as software releases
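The relationship between a Phase 2 baseline and Phase 3 drift detection can be illustrated with a simple z-score test on one metric, such as cost per run. This is a deliberately naive stand-in: the text does not say how drift is detected, and `is_drift` and its threshold are hypothetical.

```python
from statistics import mean, stdev


def is_drift(baseline, observed, threshold=3.0) -> bool:
    """Flag `observed` if it lies more than `threshold` sample standard
    deviations from the mean of `baseline` (needs at least two values)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A constant baseline: any different value counts as drift.
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

The baseline here is per-customer by construction, since it is computed only from that customer's own runs.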
The human interface to the governed estate.
Every capability above is queryable, controllable, and observable from the Control Center — and from the full REST + event stream API surface. Build your own dashboards without loss of fidelity.
Phase 1 — design partners
- Platform Engineer view: runtime health, adapter status, resource utilization, deployment configuration, infrastructure alerts
- Security view: agent identity console, policy status, audit event search, incident timeline
- REST API and event stream parity for runtime operations
Next
- Agent Owner view: per-agent status, run history, cost attribution, approval queues
- Compliance view: evidence exports, control mappings, GRC workflows
Same platform. Four different jobs.
Run agents across teams, frameworks, and customer-owned environments.
- Sandboxed runtime without rebuilding the execution layer
- Adapter model keeps existing framework investment intact
- Suspend, drain, kill, rollback at any scope
Machine identity at registration. Customer-auditable enforcement below the framework layer.
- Machine identity issued at registration — never borrowed from a user account
- Dual attribution on every record
- Pen-testable enforcement layer inside your environment
Tamper-evident evidence verifiable inside your environment — without calling a vendor.
- Hash-chained, privacy-preserving compliance record
- GDPR-compatible redaction with evidence continuity
- GRC export to ServiceNow, Archer, Vanta
Keep native frameworks. Get lifecycle, metering, and evidence from the runtime.
- No framework lock-in — adapters wrap LangGraph, AutoGen, custom
- Cost attribution and resource limits per run
- Control Center for operators from day one