Operations and Gates
This page defines the production operating baseline.
What this page is for
- Decide whether a deployment is safe enough to release.
- Define the minimum operating evidence for production.
- Use replay to debug incidents and verify fixes.
Daily checks
Health:
bash
curl -sS http://localhost:${PORT:-3001}/health | jq
Core production gate:
bash
npm run -s gate:core:prod -- --base-url "http://localhost:${PORT:-3001}" --scope default
Policy sanity:
bash
curl -sS http://localhost:${PORT:-3001}/v1/memory/rules/evaluate \
-H 'content-type: application/json' \
-d '{"tenant_id":"default","scope":"default","context":{"intent":"support_triage"}}' | jq '{matched}'
Go-live minimum
Before production traffic, confirm all of the following:
- Health checks pass from the real deployment environment.
- Core production gate completes without blocking failures.
- Auth, rate limiting, and tenant isolation are enabled.
- One end-to-end workflow produces a complete replay chain.
- Operators know where to find logs, metrics, and rollback procedure.
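When the health check runs from the real deployment environment, the service may still be starting, so a single probe can fail spuriously. The sketch below is one way to retry a check a bounded number of times. The `retry` helper and its arguments are illustrative and not part of the project tooling.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not executed here): probe the health endpoint up to 10 times,
# 2 seconds apart. curl -f makes HTTP error statuses count as failures.
#   retry 10 2 curl -fsS "http://localhost:${PORT:-3001}/health"
```

A bounded retry keeps a go-live script from hanging forever while still tolerating a slow start.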
Replay execution
Replay identifiers:
- request_id
- run_id
- decision_id
- commit_uri
Operational replay workflow:
- Extract the failing request_id from logs.
- Rebuild the run/decision chain with run_id and decision_id.
- Resolve affected objects via POST /v1/memory/resolve and commit_uri.
- Replay the workflow and compare with expected behavior.
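The first step of the workflow above, pulling failing request IDs out of logs, can be sketched as follows. This assumes logs are written as compact JSON lines containing `"level":"error"` and a `"request_id"` field. That log shape and the `failing_request_ids` helper are assumptions for illustration, so adjust the patterns to the actual log format.

```shell
# failing_request_ids LOGFILE
# Hypothetical helper: prints the unique request_id values found on
# error-level lines of a JSON-lines log file (assumed format, see above).
failing_request_ids() {
  grep '"level":"error"' "$1" \
    | sed -n 's/.*"request_id":"\([^"]*\)".*/\1/p' \
    | sort -u
}

# Example (not executed here):
#   failing_request_ids /var/log/app/server.log
```

The sketch uses only grep and sed, so it breaks if the JSON contains spaces after the colons. Where jq is available, `jq -r 'select(.level=="error") | .request_id'` followed by `sort -u` is more robust to spacing and key order.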
Release evidence checklist
- Core gate summary.
- Health and consistency outputs.
- Performance snapshot.
- Replay evidence chain for at least one workflow.
Regression checks
bash
npm run -s e2e:replay-learning-fault-smoke
npm run -s e2e:replay-learning-retention-smoke
RUN_REPLAY_LEARNING_SMOKES=true npm run -s regression:oneclick
What to archive after a release
- Gate outputs and health evidence.
- The deployment version or commit linked to the release.
- At least one replayable workflow trace from the new release.
- Any policy changes introduced with the release.
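One way to keep these items together is to copy the evidence files into a directory named after the release version and record a small manifest beside them. The sketch below is illustrative only: the `archive_evidence` helper, the `MANIFEST` file, and the example file names are assumptions, not part of the project tooling.

```shell
# archive_evidence VERSION DEST_DIR FILE...
# Hypothetical helper: copies evidence files into DEST_DIR/VERSION and writes
# a MANIFEST listing the version and stored file names.
# Prints the archive directory on success.
archive_evidence() {
  local version="$1" dest="$2/$1"
  shift 2
  mkdir -p "$dest" || return 1
  local f
  for f in "$@"; do
    cp "$f" "$dest/" || return 1
  done
  # Record which release the evidence belongs to and what was stored.
  {
    printf 'version: %s\n' "$version"
    printf 'files:\n'
    for f in "$@"; do
      printf '  - %s\n' "$(basename "$f")"
    done
  } > "$dest/MANIFEST"
  printf '%s\n' "$dest"
}

# Example (not executed here; file names are placeholders):
#   archive_evidence "$(git rev-parse --short HEAD)" ./release-evidence \
#     gate-summary.json health.json replay-trace.json
```

Keying the directory on the commit links the evidence to the deployed version, which the second item in the list calls for.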