S177 — Production-Governed Tools Readiness — Macro Package (next macro)
Next macro after S177-4000x. Scope-only — nothing here is executed.
Goal
Bring production live writes and schema/view live execution into the Gateway behind:
- approval registry artefacts (single-use, scoped, expiring);
- IAM hardening (least-privilege Lark app + GSM access);
- a real rollback story (point-in-time GPG backups + replay procedure);
- monitoring (audit-stream tail + freshness check + paging).
Cowork-facing: every existing tool stays unchanged; new capabilities arrive behind explicit approval IDs.
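The three approval properties named above (single-use, scoped, expiring) can be sketched as a small check. This is a minimal illustration only: the Approval class, its field names, and the consume function are hypothetical and are not taken from the Gateway code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Approval:
    """Hypothetical approval artefact; field names are illustrative only."""
    approval_id: str
    base_id: str          # scope: the base this approval covers
    operation: str        # scope: the operation this approval covers
    expires_at: datetime  # expiry: not usable at or after this instant
    consumed: bool = False  # single-use flag


class ApprovalError(Exception):
    """Raised when an approval cannot be used for the requested write."""


def consume(approval: Approval, base_id: str, operation: str, now: datetime) -> None:
    """Validate the approval and mark it used; raise ApprovalError on any failure."""
    if approval.consumed:
        raise ApprovalError("approval already used")
    if now >= approval.expires_at:
        raise ApprovalError("approval expired")
    if (approval.base_id, approval.operation) != (base_id, operation):
        raise ApprovalError("approval out of scope")
    approval.consumed = True


if __name__ == "__main__":
    now = datetime(2026, 1, 1, tzinfo=timezone.utc)
    ap = Approval("ap-001", "base-x", "records.update", now + timedelta(minutes=15))
    consume(ap, "base-x", "records.update", now)  # first use succeeds
    print(ap.consumed)
```

A second call to consume with the same artefact raises ApprovalError, which is the behaviour the single-use requirement asks for. A real registry would also persist the consumed flag and write the audit row; both are omitted here.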
Workstreams
- W1 Production read-only rollout — confirm every base, add per-base Cowork read tour, decide on records_search exposure.
- W2 Production dry-run rollout — already covered; document expectation and add regression test that walks all bases.
- W3 Approval artefacts — records.approve CLI/endpoint, audit row at creation, Cowork workflow dry-run → review → approve → live.
- W4 First governed production write — single record, single field, approval id, GPG backup, follow-up get, rollback drill.
- W5 Schema live on Base đệm temp resources — create/teardown temp table per run; extend BaseDemOnlyApiCaller (or new sibling) with op-specific guards.
- W6 record.batch_delete live — ceiling from lark-api-limits.yaml, per-target marker guard, consolidated GPG backup, rollback hook.
- W7 IAM hardening — audit Lark app perms, GSM least-privilege, rotate bearer token + document runbook.
- W8 Monitoring — synthetic /healthz + healthcheck + records_get probe via Kuma/Grafana; audit-stream burst alerts; long-soak >2h token-refresh probe.
- W9 Rollback story — document decrypt/replay per backup shape; drill quarterly using a recent Base đệm backup.
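For W6, the ceiling and per-target marker guard could look roughly like the sketch below. It assumes that "marker guard" means every target record must carry a known marker value in a designated field before it may be deleted; that reading, the function name, and the field and marker names are all assumptions. The ceiling is passed in as a plain integer rather than read from lark-api-limits.yaml, whose layout is not given here.

```python
def check_batch_delete(records: list[dict], ceiling: int,
                       marker_field: str, marker_value: str) -> list[str]:
    """Return the record ids cleared for deletion, or raise ValueError.

    Hypothetical guard: refuses the whole batch if it exceeds the ceiling
    or if any single target lacks the expected marker.
    """
    if len(records) > ceiling:
        raise ValueError(f"batch of {len(records)} exceeds ceiling {ceiling}")
    unmarked = [
        r["record_id"]
        for r in records
        if r.get("fields", {}).get(marker_field) != marker_value
    ]
    if unmarked:
        raise ValueError(f"records without marker: {unmarked}")
    return [r["record_id"] for r in records]


if __name__ == "__main__":
    batch = [
        {"record_id": "rec1", "fields": {"marker": "S177-TEMP"}},
        {"record_id": "rec2", "fields": {"marker": "S177-TEMP"}},
    ]
    print(check_batch_delete(batch, ceiling=5,
                             marker_field="marker", marker_value="S177-TEMP"))
```

Rejecting the whole batch on a single unmarked record, rather than silently filtering it out, keeps the deleted set identical to the set that was reviewed and backed up.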
Acceptance gates
PASS when all of the following hold:
- (1) every production base has a green read tour;
- (2) dry-run on every prod base is green;
- (3) one governed prod write landed with approval/backup/audit/rollback;
- (4) schema live works on a Base đệm temp table for at least one of each shape;
- (5) batch_delete live works on Base đệm with marker guard + rollback;
- (6) IAM narrowed + rotation runbook drilled;
- (7) monitoring pages on synthetic outage within 5 minutes.
PARTIAL with the exact failed gate otherwise.
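The PASS/PARTIAL rule can be expressed as a small function. This is a sketch only: the short gate labels, the GATES table, and the output string format are illustrative choices, not a reporting format defined by this package.

```python
# Short labels for the seven acceptance gates (paraphrased for illustration).
GATES = {
    1: "read tour green on every production base",
    2: "dry-run green on every production base",
    3: "governed production write",
    4: "schema live on temp table",
    5: "batch_delete live with marker guard and rollback",
    6: "IAM narrowed and rotation runbook drilled",
    7: "monitoring pages within 5 minutes",
}


def verdict(results: dict[int, bool]) -> str:
    """Return "PASS" if every gate is green, else "PARTIAL" naming failed gates.

    A gate missing from results is treated as failed.
    """
    failed = [g for g in sorted(GATES) if not results.get(g, False)]
    if not failed:
        return "PASS"
    return "PARTIAL: " + "; ".join(f"gate {g} ({GATES[g]})" for g in failed)


if __name__ == "__main__":
    results = {g: True for g in GATES}
    results[3] = False
    print(verdict(results))
```

Treating a missing result as a failure means an unreported gate can never produce a PASS by omission.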
Out of scope
New bases, new Lark apps/credentials, transport changes, safety-tier definition changes (only their coverage), Cowork UI work.
Inputs from S177-4000x
- s177-cowork-full-larkbase-tools-package.md — current surface.
- s177-cowork-full-larkbase-tools-4000x-evidence-2026-05-23.md — evidence this builds on.
- config/allowed_endpoints.yaml — 13 write endpoints already whitelisted; no new endpoints in this macro, only api_caller op-guard flips.
- lark_client/service.py::BaseDemOnlyApiCaller — most edits land here.
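One way to picture an "op-guard flip" is a per-operation table that defaults to dry-run and is switched to live one entry at a time. The sketch below is not the real BaseDemOnlyApiCaller: the Mode enum, the OP_GUARDS table, the resolve_mode function, and the operation and endpoint strings are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    DRY_RUN = "dry_run"
    LIVE = "live"


# Hypothetical guard table: any operation not listed stays dry-run.
OP_GUARDS = {
    "record.update": Mode.LIVE,
    "record.batch_delete": Mode.DRY_RUN,
}


def resolve_mode(op: str, endpoint: str, allowed_endpoints: set[str],
                 approval_id: Optional[str]) -> Mode:
    """Decide whether a call may run live.

    Live requires all three: a whitelisted endpoint, a guard flipped to
    LIVE for this operation, and an approval id supplied by the caller.
    """
    if endpoint not in allowed_endpoints:
        raise PermissionError(f"endpoint not whitelisted: {endpoint}")
    if OP_GUARDS.get(op, Mode.DRY_RUN) is Mode.LIVE and approval_id:
        return Mode.LIVE
    return Mode.DRY_RUN


if __name__ == "__main__":
    allowed = {"endpoint-a", "endpoint-b"}  # placeholder endpoint names
    print(resolve_mode("record.update", "endpoint-a", allowed, "ap-001"))
    print(resolve_mode("record.update", "endpoint-a", allowed, None))
```

Under this shape, enabling a new live capability is a one-line change to the guard table, while the endpoint whitelist stays untouched, which matches the "no new endpoints" constraint.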