S177 — Sprint 1 Command Review Package (2026-05-19)
Status: PASS (package ready for GPT/User review).
Date: 2026-05-19
Authored by: Claude Code (Opus 4.7)
Pairs with:
- s177-controlled-crud-gateway-requirements-v2.md (requirements brief v2.2)
- s177-controlled-crud-gateway/s177-architecture-design-2026-05-19-patch2.md (binding design)
- s177-controlled-crud-gateway/s177-design-patch2-summary-2026-05-19.md (PATCH2 summary)
- s177-controlled-crud-gateway/s177-r0-code-reconcile-report-2026-05-19.md (R0 source evidence)
- s177-controlled-crud-gateway/s177-oq-decision-record-2026-05-19.md (OQ-2/-3/-4/-5/-7 closed)
- s177-controlled-crud-gateway/s177-sprint1-implementation-checklist-2026-05-19.md (paired checklist)
- s177-controlled-crud-gateway/README.md (folder index)
Live source re-verified in this round (DISCOVER-FIRST): lark_client/core.py, lark_client/audit.py, lark_client/exceptions.py, cli/lark_tool.py, config/allowed_endpoints.yaml, tests/test_core.py. All read read-only via the VPS read_file MCP. No source mutation.
§1. Objective and Sprint 1 scope
Objective: Implement Track B core — controlled record CRUD via CLI — fully behind SafetyLayer, against the live source observed in R0 + this DISCOVER-FIRST pass.
In scope (Sprint 1):
- New module lark_client/service.py: Application Service Layer for record ops.
- New module lark_client/safety.py: SafetyLayer (8 layers).
- New module lark_client/approval.py: ApprovalProvider ABC + YamlApprovalProvider (atomic check-and-consume).
- New module lark_client/gpg_backup.py: GPG public-key backup helper.
- New module lark_client/pii.py: FieldPIIRegistry + PatternPIIDetector.
- New module lark_client/factory.py: composition root for DI wiring.
- New module lark_client/writer.py: typed façade over service (kept thin per PATCH1 §G.1).
- Modify lark_client/core.py: add public LarkCore.write(...) method.
- Modify lark_client/audit.py: add 4 net-new methods + strict writer variant + extend mask-skip list.
- Modify lark_client/exceptions.py: add 5 net-new exceptions subclassing LarkClientError.
- New CLI module cli/records.py: argparse records subparser + handlers.
- Modify cli/lark_tool.py: wire the new subparser (additive only).
- Modify config/allowed_endpoints.yaml: append 6 record-class write entries under existing structured-object schema (resolved this round, §16).
- New config config/write-approvals.yaml: approval registry (schema in §13).
- New config config/pii-fields.yaml: PII field whitelist (schema in §18).
- New config config/lark-api-limits.yaml: rate + batch + write_endpoint_options (schema in §17).
- New test infra tests/conftest.py: LARK_TEST_INTEGRATION env gate + Base đệm token hard assert.
- Modify tests/test_core.py: annotate exactly two live tests with @pytest.mark.integration.
- Modify pyproject.toml: register integration pytest marker (no entry-point change).
- New unit-test files under tests/ covering every Sprint 1 acceptance case (T1–T12 + b-variants).
- New gated integration test file tests/test_records_integration.py: Base đệm only.
Out of scope (NOT in Sprint 1):
- ❌ MCP adapter (Track A — Sprint 2).
- ❌ Field create/update/delete (Sprint 3).
- ❌ Table / view / base schema operations (Sprint 4).
- ❌ Production Lark write of any kind.
- ❌ Bot creation, credential rotation, MCP-write enablement.
- ❌ Deploy, service restart, container rebuild.
- ❌ Git push, merge, tag.
- ❌ --pii-strict mode (deferred per OQ-3).
- ❌ Orphan-backup sweep command (Sprint 4 per OQ-7).
- ❌ Directus approval backend (Sprint 3 per PATCH2 sprint plan).
- ❌ DirectusApprovalProvider swap test (Sprint 3).
- ❌ Production smoke / production probe of any kind.
§2. Lark connection context — reuse-only stance
This Sprint reuses the already-operational Lark connection. No new bot, no new credential, no new MCP connection, no duplicate adapter is created. Concretely (all confirmed by DISCOVER-FIRST this round):
- Bot: "For Gem" (cli_a785d634437a502f) remains the single Lark bot. Sprint 1 does not touch the bot definition.
- Credential path: LarkCore._load_credentials() already does env → /opt/incomex/docker/.env → GSM (github-chatgpt-ggcloud, secret names LARK_APP_ID / LARK_APP_SECRET). Sprint 1 reuses this path verbatim. New secret added by Sprint 1 design: LARK_BACKUP_GPG_PUBKEY (ASCII-armored public key, see §14). The secret is fetched via the same GSM helper (LarkCore._gsm_get), so no new credential plumbing is needed.
- MCP topology: unchanged in Sprint 1. The existing @larksuiteoapi/lark-mcp plugin (9 bitable tools) stays as-is. Sprint 1 introduces zero MCP write paths. Sprint 2 will decide plugin hide-vs-replace per OQ-4.
- Endpoint whitelist file: existing config/allowed_endpoints.yaml (verified read: populated, write: [] empty) is appended to in §16; same schema, no new file invented.
- Audit dir, rate-limit lock files, secret manager project: all reused, unchanged.
§3. DISCOVER-FIRST proposed-file inventory
Each file is tagged: existing?, reuse-or-new, path, purpose, owner, lifecycle. Existing files were read in this round.
A. Existing files — MODIFY (additive only)
| Path | Existing? | Reuse / extend | Why touched | Modification kind |
|---|---|---|---|---|
| lark_client/core.py | YES (read this round) | EXTEND | Add public write(...) wrapping _request per PATCH2 §P2-4. _request stays private; signature unchanged. | additive method only |
| lark_client/audit.py | YES (read this round) | EXTEND | Add 4 net-new methods + strict-writer variant _write_strict(entry, *, path); extend mask-skip list per PATCH2 §P2-3. Existing log_call / log_cli_invocation / _write unchanged in behaviour. | additive only |
| lark_client/exceptions.py | YES (read this round) | EXTEND | Add 5 net-new subclasses of LarkClientError. Existing 5 classes untouched. | additive only |
| cli/lark_tool.py | YES (read this round) | EXTEND | Import + register new records subparser; add elif args.command == "records": branch in main(). Existing registry/schema/audit branches untouched. Exit-code constants unchanged. | additive only |
| config/allowed_endpoints.yaml | YES (read this round) | EXTEND | Append 6 entries to the existing write: [] (currently empty list). Schema = {method, path, description}; resolved this round. | append-only |
| tests/test_core.py | YES (read this round) | EXTEND | Add @pytest.mark.integration to exactly two tests (test_token_obtainable, test_credential_source_logged). No body modification. | decorator-only |
| pyproject.toml | YES | EXTEND | Add [tool.pytest.ini_options] markers = ["integration: live Lark API tests …"]. Entry-point cli.lark_tool:main stays. | additive only |
B. Net-new files — CREATE
| Path | Existing? | Purpose | Owner/lifecycle | Justification |
|---|---|---|---|---|
| lark_client/service.py | NO | Application Service Layer (PATCH2 §B). Single write entrypoint for CLI + future MCP. | Sprint 1 owns; permanent. | PATCH2 §B requires; no parallel write logic. |
| lark_client/safety.py | NO | SafetyLayer (PATCH2 §C, 8 layers). | Sprint 1 owns; permanent. | PATCH2 §C requires. |
| lark_client/approval.py | NO | ApprovalProvider ABC + YamlApprovalProvider. | Sprint 1 owns; permanent. Sprint 3 adds Directus provider here. | PATCH2 §C.3 DI requires. |
| lark_client/gpg_backup.py | NO | GPG public-key encryption + sidecar meta + storage layout. | Sprint 1 owns; permanent. | PATCH2 §E requires; OQ-2 (no private key on VPS). |
| lark_client/pii.py | NO | FieldPIIRegistry (whitelist) + PatternPIIDetector (regex). | Sprint 1 owns; permanent. | PATCH2 §D requires. |
| lark_client/factory.py | NO | DI composition root: builds LarkWriteService from (core, registry, audit, approval, gpg, pii, safety, limits). | Sprint 1 owns; permanent. | PATCH2 §C.3 requires SafetyLayer to NOT import YamlApprovalProvider; needs composition root. |
| lark_client/writer.py | NO | Typed thin façade over LarkWriteService (PATCH1 §G.1). | Sprint 1 owns; permanent. | PATCH1 §G.1 / PATCH2 carry-forward. |
| cli/records.py | NO | argparse records subparser + per-subcommand handlers + dispatch helper handle(args, *, svc). | Sprint 1 owns; permanent. | PATCH2 §P2-1 requires. |
| config/write-approvals.yaml | NO | Approval registry (id, op, scope, used flag, expires_at, created_by). | Sprint 1 seeds with empty list. | PATCH1 §G.4 / PATCH2 §C.3. |
| config/pii-fields.yaml | NO | PII field whitelist seeded from S176 snapshots. | Sprint 1 seeds; growable by PR. | PATCH2 §D.1. |
| config/lark-api-limits.yaml | NO | rate + batch + write_endpoint_options. | Sprint 1 seeds; permanent. | PATCH2 §B.4 / §P2-10. |
| tests/conftest.py | NO (R0 confirmed) | LARK_TEST_INTEGRATION env gate + assert_buffer_base_token helper + pytest_collection_modifyitems to skip non-gated integration tests. | Sprint 1 owns; permanent. | PATCH2 §P2-5 requires. |
| tests/test_service.py | NO | Unit tests for service layer (T1, T2, T8, T9, mock LarkCore.write). | Sprint 1 owns; permanent. | PATCH2 §H.2. |
| tests/test_safety.py | NO | Unit tests for SafetyLayer order, dry-run gate, layer-failure modes (T3, T11, T12, lock, rate-limit interactions). | Sprint 1 owns; permanent. | PATCH2 §H.2. |
| tests/test_approval_yaml.py | NO | Unit tests for YamlApprovalProvider atomicity (T5, T5b concurrent consumers via threads/processes). | Sprint 1 owns; permanent. | PATCH2 §C.3. |
| tests/test_audit_extension.py | NO | Unit tests for 4 new audit methods, masking carve-out, fsync, emergency sink (T12), orphan log (T11). | Sprint 1 owns; permanent. | PATCH2 §P2-3. |
| tests/test_core_write.py | NO | Unit tests for LarkCore.write whitelist check, client_token insertion when supported, refusal when not whitelisted, no-requests-outside-core lint test. | Sprint 1 owns; permanent. | PATCH2 §P2-4. |
| tests/test_records_cli.py | NO | Unit tests for argparse records parser shape, exit-code routing (T8, T10, T10b), JSON output to stdout, error JSON to stderr. | Sprint 1 owns; permanent. | PATCH2 §P2-1 + §P2-2. |
| tests/test_pii.py | NO | Unit tests for FieldPIIRegistry, PatternPIIDetector (CCCD/CMND/passport/phone/bank/email), egress block (T10b). | Sprint 1 owns; permanent. | PATCH2 §D + §P2-5. |
| tests/test_batch.py | NO | Unit tests for batch chunking, ceiling rejection, partial-failure rollup (T6, T6b, T7). | Sprint 1 owns; permanent. | PATCH2 §B.4 / OQ-5. |
| tests/test_records_integration.py | NO | Gated integration tests against Base đệm only; runs T1–T12 + b-variants against real Lark. Skipped unless LARK_TEST_INTEGRATION=1. | Sprint 1 owns; permanent. | PATCH2 §H.3, gated commit-non-blocking. |
Total file count: existing modified = 7 (all additive); new = 21 (7 library + 1 CLI + 3 config + 1 conftest + 9 test files).
§4. Exact implementation sequence
Sprint 1 is implemented in eight numbered phases, each ending in a verification gate. No phase advances until its gate is green. Phases 1–3 are read-only / configuration / test-infra. Phases 4–7 are code. Phase 8 is integration.
Phase 1 — Pre-code micro-tasks (BLOCK opens)
1.1 Read config/allowed_endpoints.yaml to confirm the schema observed this round (already done in package authoring — version: 1; read: [{method, path, description}, …]; write: []). Record the inline confirmation in a new commit-bundled note.
1.2 Resolve OQ-8 — Lark Open API client_token per-endpoint support — by citation:
- Read official Lark Open API documentation for each of the six record-class endpoints (§16).
- Record an authoritative citation block (source: docs.lark.com/... or docs.larksuite.com/..., retrieval date, response field name) in a short KB note: knowledge/dev/lark/s177-controlled-crud-gateway/s177-sprint1-oq8-client-token-citation-2026-05-XX.md. (Path date is the sprint-start date.)
- Where docs are silent, treat the endpoint as client_token_supported: false (conservative default) and note the conservatism in the citation file.
- No Lark write probe is permitted at this phase. A non-mutating Base-đệm read probe (e.g. token endpoint round-trip) is permitted only if needed to verify rate-limit / auth still works; that is read, not write.
1.3 Quarantine commit: open the conftest + decorator-only changes (§4 Phase 2) on a Sprint 1 branch; commit, but do not merge.
Phase 1 gate: OQ-8 citation file uploaded to KB; OQ-9 confirmed inline (already resolved this round); no code yet.
Phase 2 — Test-harness quarantine
2.1 Add tests/conftest.py per §15.
2.2 Add pyproject.toml [tool.pytest.ini_options] integration marker.
2.3 Annotate exactly two existing tests:
- tests/test_core.py :: test_token_obtainable → add @pytest.mark.integration above its def.
- tests/test_core.py :: test_credential_source_logged → same.
Three other tests (test_whitelist_blocks_im, test_whitelist_blocks_delete, test_whitelist_allows_list_tables) stay default; they are local-only (whitelist check, no Lark call).
2.4 Re-run pytest cold (no env vars). Confirm: 3 tests pass, 2 tests skipped.
2.5 Re-run LARK_TEST_INTEGRATION=1 pytest -m integration. Confirm: 2 tests pass against the live token endpoint.
Phase 2 gate: cold pytest = 3 pass + 2 skipped; gated pytest = 2 pass; 0 regressions in non-Sprint suites.
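The Phase 2 quarantine can be pictured with a minimal sketch of the conftest gate. This is an illustration under stated assumptions, not the §15 specification: the helper name integration_enabled and the skip message are hypothetical, while LARK_TEST_INTEGRATION, assert_buffer_base_token, pytest_collection_modifyitems, and the Base đệm token literal come from this package.

```python
"""Illustrative sketch of tests/conftest.py: env gate plus Base đệm token guard."""
import os

# Base đệm (staging) token literal as quoted in this package.
BUFFER_BASE_TOKEN = "Nf2bb1ExXaYnlksgoyQl72GNgAc"


def integration_enabled(environ=None) -> bool:
    """Hypothetical helper: True only when LARK_TEST_INTEGRATION is exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get("LARK_TEST_INTEGRATION") == "1"


def assert_buffer_base_token(token: str) -> None:
    """Hard guard: refuse any token other than the Base đệm one."""
    if token != BUFFER_BASE_TOKEN:
        raise AssertionError("integration tests may only target Base đệm")


def pytest_collection_modifyitems(config, items):
    """Skip every test marked `integration` unless the env gate is on."""
    if integration_enabled():
        return
    # Imported lazily so the two helpers above stay importable without pytest.
    import pytest

    skip = pytest.mark.skip(reason="set LARK_TEST_INTEGRATION=1 to run live tests")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```

With a gate of this shape, a cold pytest run would report the two annotated tests as skipped, and LARK_TEST_INTEGRATION=1 pytest -m integration would select only them.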
Phase 3 — Config seeds + endpoint whitelist append
3.1 Append the 6 record-class entries to config/allowed_endpoints.yaml :: write: (§16 — exact YAML).
3.2 Create config/lark-api-limits.yaml (§17 — exact YAML).
3.3 Create config/write-approvals.yaml with approvals: [] and approval_exempt_bases: ["88-phai-cu-base-dem"] (§13).
3.4 Create config/pii-fields.yaml with empty field_registry: [] and seed entries from one snapshot row to prove schema validity (§18).
3.5 Re-run pytest cold. Confirm: existing whitelist tests still pass (they check that non-whitelisted endpoints raise EndpointNotAllowed — adding entries does not regress them).
3.6 Add new test tests/test_core_write.py :: test_whitelist_loads_write_section — confirms the appended write endpoints are now in core._allowed_endpoints and match via regex.
Phase 3 gate: all configs valid YAML; whitelist test additions green; no live-API calls.
Phase 4 — Exceptions + LarkCore.write
4.1 Add 5 new exception classes to lark_client/exceptions.py (§10).
4.2 Add public method LarkCore.write(...) to core.py (§15).
4.3 Add unit tests tests/test_core_write.py for: whitelist refusal, client_token insertion when supported, client_token NOT inserted when unsupported, retry passthrough.
4.4 Add lint test (in the same file) that greps the repo for _request( outside lark_client/core.py and import requests outside lark_client/core.py and fails on any match.
Phase 4 gate: all new core/exception tests green; lint test green; no other tests regress.
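The lint test in 4.4 could look roughly like the sketch below. It is a hedged illustration: the function name find_violations, the exact regular expressions, and the decision to exclude tests/ from the scan are assumptions, not part of the package.

```python
import re
from pathlib import Path

# Patterns that should appear only inside lark_client/core.py.
FORBIDDEN = (
    re.compile(r"\b_request\("),                                   # private transport call
    re.compile(r"^\s*(?:import|from)\s+requests\b", re.MULTILINE),  # direct HTTP import
)


def find_violations(repo_root: Path, allowed: str = "lark_client/core.py") -> list[str]:
    """Return repo-relative paths of Python files that bypass LarkCore."""
    hits = []
    for py_file in sorted(repo_root.rglob("*.py")):
        rel = py_file.relative_to(repo_root).as_posix()
        # Assumption: test files are exempt because they may mock the transport.
        if rel == allowed or rel.startswith("tests/"):
            continue
        text = py_file.read_text(encoding="utf-8")
        if any(pattern.search(text) for pattern in FORBIDDEN):
            hits.append(rel)
    return hits
```

A test in tests/test_core_write.py could then simply assert that find_violations(repo_root) returns an empty list.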
Phase 5 — AuditLogger extension
5.1 Add _write_strict(entry, *, path=None) to audit.py — opens its own fd, os.fsync, raises AuditWriteError(phase) on any I/O error. Does not catch and warn.
5.2 Add 4 new public methods: log_write_planned, log_write_result, log_write_emergency, log_orphan_backup. Each builds the entry shape defined in PATCH2 §P2-3, then calls _write_strict to the appropriate sink:
- log_write_planned / log_write_result → today's primary JSONL.
- log_write_emergency → /var/log/lark-ops/EMERGENCY/<YYYYMMDD>/<ts>-<idempotency_key>.json (one file per call, separate fd).
- log_orphan_backup → /var/log/lark-ops/orphan-backups.log.
5.3 Extend _write (and _write_strict) skip-list from {ts, agent, cmd} to the extended set per PATCH2 §P2-3 (idempotency_key, operation_id, audit_pre_id, backup_ref, backup_path, key_fingerprint, request_id, approval_id, base_key, table_id, phase, op, outcome_status, target_count, duration_ms, dry_run, confirmed, is_buffer_base).
5.4 Unit tests in tests/test_audit_extension.py: planned-fsync-fail aborts (the exception propagates; no second write); result-fsync-fail-after-success calls emergency path; orphan-log path is metadata-only; mask-skip honors carved-out keys; emergency uses separate file/fd.
Phase 5 gate: all audit-extension tests green; existing test_token_obtainable (when integration gate on) still finds log_call records as before (no regression).
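To make the strict-writer contract in 5.1 concrete, here is a minimal sketch of an append-and-fsync writer that raises instead of warning. It is written as a free function named write_strict with an explicit phase argument, which is an assumption for illustration; the real method is AuditLogger._write_strict, and the AuditWriteError below is a stand-in for the class that subclasses LarkClientError.

```python
import json
import os
from pathlib import Path


class AuditWriteError(Exception):
    """Stand-in for the real class (which subclasses LarkClientError)."""

    def __init__(self, phase: str):
        super().__init__(f"audit write failed in phase {phase!r}")
        self.phase = phase


def write_strict(entry: dict, *, path: Path, phase: str) -> None:
    """Append one JSON line on a dedicated fd and fsync before returning."""
    line = (json.dumps(entry, ensure_ascii=False) + "\n").encode("utf-8")
    try:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o640)
        try:
            view = memoryview(line)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
            os.fsync(fd)  # durability point: only now may the caller proceed
        finally:
            os.close(fd)
    except OSError as exc:
        # Never catch-and-warn: the caller decides whether to abort or escalate.
        raise AuditWriteError(phase) from exc
```

The design point is that a return from this function means the entry reached stable storage; any I/O problem surfaces as AuditWriteError carrying the phase.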
Phase 6 — Approval + GPG + PII modules
6.1 lark_client/approval.py:
- ApprovalProvider ABC with check_and_consume(ctx) → ApprovalDecision.
- YamlApprovalProvider(path="config/write-approvals.yaml"):
  - acquire fcntl.LOCK_EX on the file for the WHOLE check→consume→rewrite sequence;
  - re-read file inside the lock;
  - if used==false for the matched approval → set used=true, used_by=ctx.agent, used_at=ts, idempotency_key=ctx.idempotency_key, atomic write via tmp+fsync+rename;
  - if used==true → raise ApprovalError("already_consumed");
  - lock contention → bounded 5s wait → ApprovalError("approval_locked");
  - reusable-within-expiry: validate under lock, do NOT mutate.
6.2 lark_client/gpg_backup.py:
- GPGBackup(pubkey_pem: bytes, writes_dir: Path):
  - encrypt + sidecar meta;
  - path naming per PATCH1 §E.3 (<base>__<table>__<record>__<idempotency>__pre.json.gpg + .meta.json);
  - fsync both before returning;
  - returns (backup_ref: str, key_fingerprint: str).
- LARK_BACKUP_GPG_PUBKEY is fetched via LarkCore._gsm_get("LARK_BACKUP_GPG_PUBKEY") at composition-root level (factory.py), then passed as bytes to GPGBackup.__init__. No direct GSM access from gpg_backup.py.
6.3 lark_client/pii.py:
- FieldPIIRegistry(path="config/pii-fields.yaml"): loaded once at composition.
- PatternPIIDetector: regex set from PATCH1 §D.2.
- Both return (types: set[str], hit_field_ids: set[str]), never substrings.
- pii_scan(record_dict, base_key, table_id) → (redaction_types, redacted_fields_count, detector_used): pure function used by SafetyLayer layer 7.
6.4 Unit tests in tests/test_approval_yaml.py and tests/test_pii.py. Approval atomicity test T5b uses multiprocessing.Process (not threads, because the GIL can mask race conditions) to launch ≥2 concurrent consumers; exactly one returns success, the rest return already_consumed.
Phase 6 gate: all approval / GPG / PII unit tests green; T5b (concurrent consumers) green; egress-block test (T10b for stdout / --export path) green.
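The check-and-consume sequence from 6.1 can be sketched as follows. This is a simplified illustration under several assumptions: it stores approvals as JSON rather than YAML so it stays standard-library only, it takes the lock on a sidecar .lock file (so the lock survives the rename of the data file), it matches approvals by id only, and the function signature is hypothetical. ApprovalError here is a stand-in for the class defined in lark_client/exceptions.py.

```python
import fcntl
import json
import os
import time
from pathlib import Path


class ApprovalError(Exception):
    """Stand-in for the real class; .code mirrors the documented values."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def check_and_consume(path: Path, approval_id: str, agent: str,
                      timeout_s: float = 5.0) -> dict:
    """Consume a one-time approval exactly once, even with concurrent callers."""
    lock_path = path.with_name(path.name + ".lock")  # assumption: sidecar lock file
    deadline = time.monotonic() + timeout_s
    with open(lock_path, "w") as lock_fh:  # closing the file releases the lock
        while True:
            try:
                fcntl.flock(lock_fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise ApprovalError("approval_locked")
                time.sleep(0.05)
        # Re-read inside the lock so the decision uses the latest state.
        data = json.loads(path.read_text(encoding="utf-8"))
        match = next((a for a in data["approvals"] if a["id"] == approval_id), None)
        if match is None:
            raise ApprovalError("missing")
        if match.get("used"):
            raise ApprovalError("already_consumed")
        match.update(used=True, used_by=agent, used_at=time.time())
        # Atomic rewrite: tmp file + fsync + rename.
        tmp = path.with_name(path.name + ".tmp")
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)
        return match
```

A T5b-style test would launch several multiprocessing.Process workers that call this function with the same approval id and assert that exactly one returns while the rest raise ApprovalError with code already_consumed.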
Phase 7 — SafetyLayer + Service + Writer + Factory
7.1 lark_client/safety.py:
- single public method SafetyLayer.guard(ctx, payload, api_call).
- enforces 10-step order from PATCH1 §C.1, exactly the order in PATCH2 §A architecture diagram.
- raises SafetyViolation / ApprovalError / AuditWriteError per PATCH2 §P2-8 routing table.
- on layer-3 backup-then-layer-4-audit-pre-fail: calls audit.log_orphan_backup(...) BEFORE re-raising; never auto-deletes (OQ-7).
- on layer-9 audit-post-fail: calls audit.log_write_emergency(...), returns WriteOutcome(status="success", error="audit_post_degraded").
- locks: per-record (lark-write:{base_key}:{table_id}:{record_id}) fcntl exclusive; global rate lock is LarkCore._acquire_rate_limit invoked from inside LarkCore.write; SafetyLayer does NOT re-acquire it (avoids double-take).
7.2 lark_client/service.py:
- LarkWriteService(*, core, registry, safety): create_record / batch_create_records / get_record / update_record / batch_update_records / delete_record / batch_delete_records.
- Service resolves base_key → app_token via registry.get_by_key(...); wraps KeyError → UnknownBaseError.
- Each method builds the per-call closure lambda: core.write(method, endpoint_template_filled, json_data=..., idempotency_key=ctx.idempotency_key, client_token_supported=<per-endpoint flag from lark-api-limits.yaml>, _audit_cmd=ctx.operation) and passes it to safety.guard(ctx, payload, api_call).
- Batches are chunked per config/lark-api-limits.yaml :: batch.* ceilings; chunk failure → PartialFailureError with committed[], failed[], rollback_command.
7.3 lark_client/writer.py: minimal LarkWriter typed façade calling LarkWriteService. Kept stable for future CLI changes.
7.4 lark_client/factory.py:
- build_default_service(agent: str) → LarkWriteService performs DI wiring: core = LarkCore(agent=agent) → audit = core._audit (reuse) → registry = Registry.load() → pubkey = core._gsm_get("LARK_BACKUP_GPG_PUBKEY") → gpg = GPGBackup(pubkey, ...) → pii = pii.load_from_yaml() → approval = YamlApprovalProvider(...) → limits = load_yaml("config/lark-api-limits.yaml") → safety = SafetyLayer(audit=audit, gpg=gpg, pii=pii, approval=approval, limits=limits) → service = LarkWriteService(core=core, registry=registry, safety=safety).
- factory.py is the ONLY place that imports YamlApprovalProvider. safety.py imports the abstract ApprovalProvider only.
7.5 cli/records.py:
- register(subparsers) adds the records group with subcommands create / get / update / delete / batch-create / batch-update / batch-delete.
- handle(args) is the dispatch entry called from cli/lark_tool.py main()'s elif args.command == "records": branch.
- Each cmd_records_* returns the exit code (PATCH2 §P2-2 routing table). The try/except chain matches the order in PATCH2 §P2-1 illustrative code.
- cli/records.py always builds svc = factory.build_default_service(agent=_get_agent(args)) lazily inside handle(); no module-level Lark calls.
Phase 7 gate: every unit test green (T1, T2, T3, T4-mocked, T5, T5b, T6, T6b, T7, T8, T9, T10, T10b, T11, T12). LARK_TEST_INTEGRATION still NOT set; no live calls fired. Existing 5 tests still pass per Phase 2 + Phase 3 expectations.
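The batch behaviour named in 7.2 (ceiling check, then chunking) can be illustrated with a small sketch. The function name plan_chunks and the allow_chunk parameter are hypothetical; the sketch assumes the service passes allow_chunk=True for creates, which chunk automatically in T6, and the CLI flag value for deletes, which T6b rejects unless --allow-chunk is set. SafetyViolation is a stand-in for the real exception.

```python
from typing import Sequence


class SafetyViolation(Exception):
    """Stand-in for the real class; .reason mirrors the documented values."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def plan_chunks(records: Sequence, ceiling: int, *, allow_chunk: bool) -> list[list]:
    """Split records into chunks of at most `ceiling` items.

    Raises SafetyViolation("over_ceiling") when the batch exceeds the
    ceiling and the caller has not opted in to chunking.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    if len(records) > ceiling and not allow_chunk:
        raise SafetyViolation("over_ceiling")
    return [list(records[i:i + ceiling]) for i in range(0, len(records), ceiling)]
```

Each returned chunk would then go through its own guarded call, which is why T6 expects two audit pairs for a 600-record create.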
Phase 8 — Gated integration on Base đệm
8.1 With LARK_TEST_INTEGRATION=1 and a known-good write-approvals.yaml populated with Base-đệm test approvals (created by a human, not Sprint 1 code), run tests/test_records_integration.py:
- T1 dry-run on Base đệm → no API call, dry_run outcome.
- T2 create real → GPG backup exists in writes/<date>/; audit planned + audit result both present; request_id correlates the transport-level log_call.
- T3 update on Base đệm without --confirm → aborts with SafetyViolation("confirm_required").
- T4 update on Base đệm with --confirm → backup exists; record updated; rollback command printed.
- T5 reuse of consumed one-time approval → ApprovalError("already_consumed").
- T5b two concurrent CLI invocations with the same one-time approval → exactly one succeeds, the other returns ApprovalError("already_consumed") (Lark API call count: exactly 1).
- T6 batch_create 600 with batch.record_create_max=500 → chunks 500+100, both audited.
- T6b batch_delete 150 with ceiling 100 (no --allow-chunk flag) → SafetyViolation("over_ceiling"); with --allow-chunk → chunks 100+50.
- T7 batch where one chunk fails → PartialFailureError; no auto-rollback; rollback command printed.
- T8 scope-mismatched approval → ApprovalError("scope_mismatch").
- T9 wildcard delete approval → ApprovalError("wildcard_forbidden").
- T10 PII payload on Base đệm guarded write → proceeds; audit shows pii_redacted=true; raw body only in .json.gpg.
- T10b PII payload to --export / stdout → SafetyViolation("pii_egress_blocked").
- T11 audit-pre intentionally unwritable (chmod 000 the dated JSONL) → AuditWriteError("pre"); backup landed; orphan log entry appended.
- T12 audit-post intentionally unwritable (chmod 000 after API call) → API succeeds; emergency JSON written under EMERGENCY/; CLI exits 0 with WriteOutcome.error="audit_post_degraded".
8.2 Existing 5 tests in test_core.py still pass cold (3) and with gate on (2 + 3 = 5).
8.3 Final sweep: git status shows only intended files (no stray artifacts); git diff --stat exactly matches the planned file list (§3).
Phase 8 gate: all T1–T12 green on Base đệm; no production base touched; integration log captured to KB as s177-sprint1-integration-evidence-2026-05-XX.md.
Sprint 1 exit: all 8 phase gates green → Sprint 1 sign-off readback to GPT/User; design commit + write-endpoint diff staged for OQ-6 actor.
§5. Safety gates BEFORE implementation begins
These gates must all be GREEN before Phase 1 starts. They are pre-implementation, KB-side:
- PATCH2 accepted by GPT/User. (This package presupposes PATCH2 acceptance.)
- OQ-6 (commit / repo path mechanics) status known: at minimum, the actor performing Sprint 1 has shell + repo write to /opt/incomex/lark-client/. The actor identity does NOT have to be the same as the package author.
- OQ-8 citation step (Phase 1.2) authorized to proceed; this is a read-only doc-citation task with no Lark write involved.
- OQ-9 already resolved (this package, §16).
- Sprint 1 command-review package + checklist + README update accepted by GPT/User (this round).
- The Sprint 1 actor confirms: no production Lark write will be issued; integration tests target Base đệm only; the actor has read PATCH2 §P2-2 (exit-code routing) and §P2-3 (audit ordering invariants).
- LARK_BACKUP_GPG_PUBKEY GSM secret exists and is ASCII-armored. If absent: STOP_AND_ESCALATE to operator; Sprint 1 cannot proceed without a usable public key. (Secret name only is referenced here. Sprint 1 must NOT log the key contents.)
§6. Unit test plan (commit gate — must be green at every commit)
Every test file listed in §3.B (test_*) is unit/mock except test_records_integration.py.
Mocking strategy:
- LarkCore.write is mocked in service/safety unit tests via unittest.mock.patch against the bound method. Return {"code": 0, "msg": "ok", "data": {"record": {"record_id": "rec_MOCK"}}} for happy path; raise LarkAPIError for negative paths.
- LarkCore._request is mocked in test_core_write.py to verify write() correctly prepares json_data (with/without client_token) and the whitelist check fires.
- Filesystem fixtures use tmp_path for /var/log/lark-ops/... substitutes; both AuditLogger and GPGBackup accept log_dir= / writes_dir= overrides for testability.
- GPG: tests use a throwaway test public key (committed under tests/fixtures/test-gpg-pub.asc), a real test-only key, never the production LARK_BACKUP_GPG_PUBKEY. Decryption isn't tested locally (no private key on VPS, per OQ-2); integration evidence is that the file is GPG-encrypted (header check) and present in the backup dir.
- Approval atomicity (T5b): multiprocessing.Process fork; see Phase 6.4.
- PII patterns: corpus of synthetic CCCD / phone / passport / bank strings in tests/fixtures/pii_corpus.py.
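As an illustration of the detector contract that the PII tests exercise (return PII types and field ids, never the matched substrings), here is a minimal sketch. The three regular expressions are simplified assumptions for demonstration only; the authoritative regex set is the one in PATCH1 §D.2, and the function name detect is hypothetical.

```python
import re

# Simplified, assumed patterns; the real set lives in PATCH1 §D.2.
PATTERNS = {
    "cccd": re.compile(r"(?<!\d)\d{12}(?!\d)"),    # 12-digit citizen ID
    "phone": re.compile(r"(?<!\d)0\d{9}(?!\d)"),   # 10-digit number starting with 0
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def detect(record: dict) -> tuple[set[str], set[str]]:
    """Return (pii_types, hit_field_ids) without ever returning matched text."""
    types: set[str] = set()
    hit_field_ids: set[str] = set()
    for field_id, value in record.items():
        text = value if isinstance(value, str) else str(value)
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                types.add(name)
                hit_field_ids.add(field_id)
    return types, hit_field_ids
```

Because only type names and field ids leave the function, an audit entry built from its result cannot leak the raw value, which is the property T10 checks.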
Pass criteria per test (sample, not exhaustive):
| Test ID | File | What it asserts | Pass if |
|---|---|---|---|
| T1 | test_safety.py | dry-run default ON; layer 1 returns status=dry_run | no API call; no backup; no audit-pre |
| T2 | test_records_integration.py (gated) | create real on Base đệm | record_id returned; backup file present; planned + result entries in audit; request_id shared |
| T3 | test_safety.py | update without confirm | raises SafetyViolation("confirm_required"); no API call |
| T4 | test_records_integration.py | update with confirm + backup | backup .json.gpg exists; rollback command points to it |
| T5 | test_approval_yaml.py | consumed one-time approval | second check_and_consume raises ApprovalError("already_consumed") |
| T5b | test_approval_yaml.py | 4 concurrent processes consuming same one-time approval | exactly one returns success; three return already_consumed; YAML used flag flips exactly once |
| T6 | test_batch.py | batch_create 600 | service splits 500+100; 2 guarded calls; 2 audit pairs |
| T6b | test_batch.py | batch_delete 150 with ceiling 100 | rejected with SafetyViolation("over_ceiling") unless --allow-chunk set |
| T7 | test_batch.py | partial failure | PartialFailureError with committed/failed lists; no auto-rollback |
| T8 | test_records_cli.py | scope mismatch | ApprovalError("scope_mismatch") → exit 4 |
| T9 | test_records_cli.py | wildcard delete | ApprovalError("wildcard_forbidden") → exit 4 |
| T10 | test_safety.py + test_pii.py | PII payload guarded write | proceeds; audit has pii_redacted=true; raw value never in audit entry |
| T10b | test_pii.py + test_records_cli.py | PII to --export / stdout | SafetyViolation("pii_egress_blocked") → exit 1 |
| T11 | test_audit_extension.py + test_safety.py | audit-pre fail | AuditWriteError("pre") → exit 3; backup retained; orphan log entry appended |
| T12 | test_audit_extension.py + test_safety.py | audit-post fail after API success | emergency sink written; outcome success + error="audit_post_degraded" → exit 0 |
Coverage targets: ≥ 95% line for new modules (service.py, safety.py, approval.py, gpg_backup.py, pii.py, factory.py, cli/records.py); ≥ 90% branch for safety.py.
Commit gate: all unit/mock tests green. Integration green is not a commit gate but is a Sprint 1 sign-off gate.
§7. Integration test plan (gated by LARK_TEST_INTEGRATION=1)
tests/test_records_integration.py — every test:
- Top of file: pytestmark = pytest.mark.integration. Conftest skips when env-gate off.
- Every test calls assert_buffer_base_token(token) from conftest BEFORE the first guarded write. Production token is rejected with AssertionError.
- Uses fixture buffer_base() that constructs a WriteContext(base_key="88-phai-cu-base-dem", agent="claude-code", …) and returns it together with a real LarkWriteService instance.
- Approvals for the test run are written to a test-scoped config/write-approvals.yaml via fixture (or to an override path passed into factory); never edited by the test itself at runtime in a way that hides race conditions.
- After T2 / T4 / T6 / T6b run, the test asserts:
  - request_id shared between the audit-planned entry, the inner log_call entry from _request, and the audit-result entry;
  - idempotency_key is present in planned/result/backup/rollback;
  - the rollback command references the encrypted backup path, not raw record body;
  - the Base đệm token literal matches the registered token (Nf2bb1ExXaYnlksgoyQl72GNgAc).
Production smoke is forbidden. The integration suite cannot target the production token (YSIkb8PxOaNaozs2vwalOOcagkf); even with LARK_TEST_INTEGRATION=1, assert_buffer_base_token rejects it.
§8. Base đệm hard guard
Implemented at three layers, all of which must agree before any integration write:
- tests/conftest.py :: assert_buffer_base_token(token): AssertionError if token != Nf2bb1ExXaYnlksgoyQl72GNgAc.
- lark_client/registry.py (existing): Registry.get_by_key("88-phai-cu-base-dem") returns BaseEntry(role="staging", app_token="Nf2bb1ExXaYnlksgoyQl72GNgAc", …); integration tests prefer the role-based check (base.role == "staging") when iterating, but always also assert by literal token before any destructive op.
- Sprint 2 MCP adapter (NOT Sprint 1): adapter-boundary reject for delete/schema on non-Base-đệm tokens. Out of Sprint 1 scope; mentioned here so the design contract is visible.
For Sprint 1 (CLI only), the Base đệm guard fires in CLI handler:
- delete / batch-delete / future field/table-delete subcommands → CLI checks if base.role != "staging" and not args.confirm_production → reject. Sprint 1 ships no --confirm-production flag for delete (per req §11). Production delete must come via a separately-approved path in a later Sprint.
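The CLI-side check described above reduces to a small predicate. The sketch below is illustrative only: the function name destructive_op_allowed and the reduced BaseEntry shape (just role and app_token) are assumptions, and since Sprint 1 ships no --confirm-production flag, confirm_production would always be False in practice.

```python
from dataclasses import dataclass

# Subcommands treated as destructive in Sprint 1.
DESTRUCTIVE_SUBCOMMANDS = frozenset({"delete", "batch-delete"})


@dataclass(frozen=True)
class BaseEntry:
    """Reduced stand-in for the registry entry (only the fields used here)."""
    role: str
    app_token: str


def destructive_op_allowed(base: BaseEntry, subcommand: str,
                           confirm_production: bool = False) -> bool:
    """Allow destructive subcommands only on a staging base (Base đệm)."""
    if subcommand not in DESTRUCTIVE_SUBCOMMANDS:
        return True
    return base.role == "staging" or confirm_production
```

A handler would call this before building the service and return the rejection exit code when it yields False.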
§9. Exit code mapping
Per PATCH2 §P2-2 routing table (unchanged in Sprint 1). The CLI does NOT extend the existing enum. Mapping reproduced here for the implementer:
| Exit code (bucket) | Conditions | Examples |
|---|---|---|
| 0 OK | success; success-with-degraded-audit (T12) | dry_run, success |
| 1 USER_ERROR | bad flags; unknown base; PII egress blocked | SafetyViolation("confirm_required" / "agent_required" / "pii_egress_blocked"), UnknownBaseError |
| 2 NETWORK_API | Lark Open API failure after retries | LarkAPIError |
| 3 INTERNAL | partial failure; audit-pre fail; rate-limit lock exhausted; generic LarkClientError | PartialFailureError, AuditWriteError("pre"), RateLimitExceeded, TokenRefreshError, unmapped LarkClientError, unknown exception |
| 4 PERMISSION_CONFIG | approval refusal; endpoint not whitelisted | ApprovalError(*), EndpointNotAllowed |
| 5 CRED_LOST | credential / GSM access lost | CredentialPermissionLost |
Reserved invariants:
- code 5 NEVER returned for partial failure or audit failure.
- audit-post degraded (success + audit-post fail) → 0 with WriteOutcome.error="audit_post_degraded"; script consumers must parse the JSON to detect this.
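One way to picture the routing table is a single lookup function from exception to exit code. This is a sketch under assumptions: the exception classes are local stand-ins for those in lark_client/exceptions.py, the function name exit_code_for is hypothetical, and every SafetyViolation is routed to 1 as a simplification (the table only lists three reasons explicitly; PATCH2 §P2-2 remains authoritative).

```python
class LarkClientError(Exception):
    """Stand-in base class."""


class LarkAPIError(LarkClientError): pass
class EndpointNotAllowed(LarkClientError): pass
class CredentialPermissionLost(LarkClientError): pass
class ApprovalError(LarkClientError): pass
class SafetyViolation(LarkClientError): pass
class PartialFailureError(LarkClientError): pass
class AuditWriteError(LarkClientError): pass
class UnknownBaseError(LarkClientError): pass


# Ordered routes: first match wins; anything unmatched is INTERNAL (3).
ROUTES = (
    ((SafetyViolation, UnknownBaseError), 1),    # USER_ERROR
    ((LarkAPIError,), 2),                        # NETWORK_API
    ((ApprovalError, EndpointNotAllowed), 4),    # PERMISSION_CONFIG
    ((CredentialPermissionLost,), 5),            # CRED_LOST
)


def exit_code_for(exc: BaseException) -> int:
    """Map an exception to the CLI exit code; unmapped errors return 3."""
    for classes, code in ROUTES:
        if isinstance(exc, classes):
            return code
    return 3
```

Note that PartialFailureError and AuditWriteError deliberately fall through to 3, which keeps code 5 reserved for credential loss as the invariants require.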
§10. Exception mapping (net-new classes Sprint 1)
All subclass LarkClientError (existing base, confirmed in source). Added to lark_client/exceptions.py — five classes:
class ApprovalError(LarkClientError):
    """Approval registry refused. .code ∈ {missing, expired, scope_mismatch,
    wildcard_forbidden, already_consumed, approval_locked}."""
class SafetyViolation(LarkClientError):
    """SafetyLayer policy violation pre-API. .reason ∈ {dry_run_required,
    confirm_required, agent_required, audit_pre_failed, lock_held,
    pii_egress_blocked, pii_scanner_error, over_ceiling}."""
class PartialFailureError(LarkClientError):
    """Batch partially committed. .committed[], .failed[], .rollback_command."""
class AuditWriteError(LarkClientError):
    """Audit sink write failed. .phase ∈ {pre, post, emergency, orphan}."""
class UnknownBaseError(LarkClientError):
    """Wraps Registry KeyError at service boundary. .base_key."""
These five are the complete Sprint 1 net-new exception set. Do not invent additional exception names.
§11. Audit 2-phase implementation plan
Implementation order inside SafetyLayer.guard(ctx, payload, api_call):
1. dry-run gate
2. approval.check_and_consume(ctx)        # raises ApprovalError if refused
3. if op ∈ {update, delete}: backup = gpg.encrypt_and_store(...)
                                          # writes file + sidecar + fsync;
                                          # returns (backup_ref, key_fingerprint)
4. audit_pre_id = audit.log_write_planned(ctx, backup_ref=backup_ref or None,
                                          audit_pre_id=str(uuid4()))
                                          # APPEND + FSYNC; raises AuditWriteError("pre")
                                          # on any I/O failure; if a backup was created at
                                          # step 3, on raise we MUST call
                                          # audit.log_orphan_backup(...) BEFORE propagating
5. acquire per-record lock
6. (rate limit is acquired inside LarkCore.write; do NOT re-acquire here)
7. pii_meta = pii.scan(ctx, payload)      # SafetyViolation if egress path detected
8. try:
       response = api_call()              # calls LarkCore.write(...)
   except LarkAPIError as e:
       outcome = WriteOutcome(status="failed", ...)
       audit.log_write_result(ctx, audit_pre_id=audit_pre_id, outcome=outcome)
       raise
9. outcome = WriteOutcome(status="success", request_id=..., pii=pii_meta, ...)
   try:
       audit.log_write_result(ctx, audit_pre_id=audit_pre_id, outcome=outcome)
   except AuditWriteError:
       # API already succeeded; never re-trigger
       audit.log_write_emergency(ctx, audit_pre_id=audit_pre_id, outcome=outcome,
                                 error="audit_post_degraded")
       outcome = outcome._replace(error="audit_post_degraded")
10. finally: release locks
return outcome
Invariants restated for the implementer:
- audit.log_write_planned returns its audit_pre_id only on a fully fsynced write. Any raise from it skips the API call.
- audit.log_write_result must NOT swallow a failure after API success; the emergency path is mandatory.
- Both log_write_planned and log_write_result carry the same idempotency_key and audit_pre_id (correlation).
- The transport-level log_call inside _request continues to fire and adds a third audit entry per write; it is NOT suppressed; it shares the request_id UUID. Document this overlap in the integration evidence file (Phase 8).
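The two-phase contract, and in particular the degraded post-audit path, can be exercised with an in-memory sink. The sketch below is an illustration under stated assumptions. `InMemoryAudit` and `guarded_call` are hypothetical names, the method signatures are simplified (no ctx, no backup, no locks, no PII scan), and WriteOutcome is modeled as a NamedTuple only because the plan above calls `outcome._replace`.

```python
from typing import Callable, NamedTuple, Optional
from uuid import uuid4


class AuditWriteError(Exception):
    def __init__(self, phase: str):
        super().__init__(phase)
        self.phase = phase


class WriteOutcome(NamedTuple):
    status: str
    audit_pre_id: str
    error: Optional[str] = None


class InMemoryAudit:
    """Hypothetical sink; fail_post=True simulates a broken post-write append."""

    def __init__(self, fail_post: bool = False):
        self.fail_post = fail_post
        self.rows = []  # list of (kind, audit_pre_id)

    def log_write_planned(self, audit_pre_id: str) -> str:
        self.rows.append(("planned", audit_pre_id))
        return audit_pre_id

    def log_write_result(self, audit_pre_id: str, outcome: WriteOutcome) -> None:
        if self.fail_post:
            raise AuditWriteError("post")
        self.rows.append(("result", audit_pre_id))

    def log_write_emergency(self, audit_pre_id: str, outcome: WriteOutcome) -> None:
        self.rows.append(("emergency", audit_pre_id))


def guarded_call(audit: InMemoryAudit, api_call: Callable[[], dict]) -> WriteOutcome:
    # Phase 1: if this raises, api_call is never reached.
    audit_pre_id = audit.log_write_planned(str(uuid4()))
    try:
        api_call()
    except Exception:
        audit.log_write_result(audit_pre_id, WriteOutcome("failed", audit_pre_id))
        raise
    outcome = WriteOutcome("success", audit_pre_id)
    try:
        # Phase 2: the same audit_pre_id correlates the planned/result pair.
        audit.log_write_result(audit_pre_id, outcome)
    except AuditWriteError:
        # The API already succeeded: record out-of-band, never repeat the call.
        outcome = outcome._replace(error="audit_post_degraded")
        audit.log_write_emergency(audit_pre_id, outcome)
    return outcome
```

With `InMemoryAudit(fail_post=True)` the call still returns a success outcome whose error field is "audit_post_degraded", and the API callable runs exactly once.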
§12. GPG backup implementation plan
lark_client/gpg_backup.py:
class GPGBackup:
    def __init__(self, pubkey_pem: bytes, writes_dir: Path):
        self._pubkey_pem = pubkey_pem
        self._fingerprint = _fingerprint(pubkey_pem)
        self._writes_dir = writes_dir

    def encrypt_and_store(self, *, ctx: WriteContext,
                          plaintext: bytes) -> tuple[str, str]:
        """
        - Build path: writes_dir/<YYYYMMDD>/<base>__<table>__<record>__<idempo>__pre.json.gpg
        - Encrypt plaintext with self._pubkey_pem (ASCII-armored).
        - Write file; fsync.
        - Write sidecar .meta.json with {fingerprint, ts, op, base_key, table_id,
          record_id, idempotency_key}; NO raw PII.
        - Fsync sidecar.
        - Returns (backup_ref_path_str, key_fingerprint).
        """
GPG implementation choice (Sprint 1): prefer python-gnupg library if it is already on the VPS (pip show python-gnupg). If not present, shell out to gpg --encrypt --recipient <fpr> --armor --output <out> via subprocess.run with check=True. The decision lands in the Sprint 1 implementation commit and is recorded in s177-sprint1-implementation-checklist-2026-05-19.md. Either path passes the encryption integration test (the file produced has the GPG ASCII-armored header -----BEGIN PGP MESSAGE-----).
Decryption is NEVER performed on the VPS in Sprint 1 (OQ-2). The unit test only verifies the encrypted file's header byte sequence. Integration test does the same — checks that the file exists and has the GPG header, not that it round-trips.
Public-key fetch: done once at composition root (factory.py), then passed by bytes to GPGBackup.__init__. gpg_backup.py itself never touches GSM.
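The storage half of encrypt_and_store (write, fsync, sidecar, header check) does not depend on which GPG path is chosen, so it can be sketched separately. The helper names below (`write_fsynced`, `store_backup`, `has_pgp_armor_header`) are hypothetical, and the sidecar naming `<blob>.meta.json` is an assumption; the plan only says the sidecar is a .meta.json file. The ciphertext is passed in already encrypted, so no GPG call appears here.

```python
import json
import os
from pathlib import Path

PGP_ARMOR_HEADER = b"-----BEGIN PGP MESSAGE-----"


def write_fsynced(path: Path, data: bytes) -> None:
    """Write data to path and fsync the file before returning."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())


def store_backup(writes_dir: Path, day: str, stem: str,
                 ciphertext: bytes, meta: dict) -> Path:
    """Persist an already-encrypted blob plus its metadata-only sidecar."""
    blob = writes_dir / day / f"{stem}__pre.json.gpg"
    write_fsynced(blob, ciphertext)
    # Assumed sidecar name; it must carry metadata only, never record bodies.
    sidecar = blob.with_name(blob.name + ".meta.json")
    write_fsynced(sidecar, json.dumps(meta, sort_keys=True).encode("utf-8"))
    return blob


def has_pgp_armor_header(path: Path) -> bool:
    """The only check Sprint 1 tests make: the header bytes, no decryption."""
    with open(path, "rb") as fh:
        return fh.read(len(PGP_ARMOR_HEADER)) == PGP_ARMOR_HEADER
```

A test in this style can assert `has_pgp_armor_header(blob)` without ever holding a private key, which is consistent with the rule that decryption never happens on the VPS.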
§13. YAML approval atomic check-and-consume plan
config/write-approvals.yaml schema (Sprint 1):
version: 1
approval_exempt_bases:
  - 88-phai-cu-base-dem        # bypasses approval check only; all other 7 layers still apply
approvals:
  # one entry per approval
  - id: APR-001
    operation: record.update   # ∈ {record.create, record.update, record.delete,
                               #    record.batch_create, record.batch_update,
                               #    record.batch_delete}
    scope:
      base_key: 88-phai-cu-base-dem   # required; no wildcard for delete/update
      table_id: tblPQ6N79EeOmnTm      # required for first-write; per PATCH1 §C.4
      # record_id: rec_…              # optional narrower scope
    one_time_use: true
    used: false                # flips to true after consume
    used_by: null
    used_at: null
    idempotency_key: null      # set on consume
    expires_at: 2026-06-01T00:00:00Z
    reason: "S177 Sprint 1 T2 integration write"
    created_by: huyen@incomex.com
    created_at: 2026-05-19T10:00:00Z
YamlApprovalProvider.check_and_consume(ctx) algorithm (file-locked atomic):
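1. Open a dedicated lock file next to config/write-approvals.yaml with O_RDWR | O_CREAT. The lock file is never renamed or replaced, so every process contends on the same inode.
2. fcntl.flock(LOCK_EX) on the lock file, bounded 5s wait (poll with LOCK_NB, since flock has no timeout); on timeout → ApprovalError("approval_locked").
3. Open and read config/write-approvals.yaml by path inside the lock (yaml.safe_load). A descriptor opened before the lock could still point at an inode that the rename in step 6 has since replaced, and would return stale used=false content.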
4. Find approval where id == ctx.approval_id.
   - missing → ApprovalError("missing")
5. Validate scope:
   - expires_at < now → ApprovalError("expired")
   - scope.base_key != ctx.base_key → ApprovalError("scope_mismatch")
   - scope.table_id and != ctx.table_id → ApprovalError("scope_mismatch")
   - scope.record_id and != target record → ApprovalError("scope_mismatch")
   - operation != ctx.operation → ApprovalError("scope_mismatch")
   - wildcard policy:
     - record.update / record.delete / field.* / table.*: scope.table_id MUST be set,
       else → ApprovalError("wildcard_forbidden")
6. If one_time_use:
   - if used == true → ApprovalError("already_consumed")
   - set used=true, used_by=ctx.agent, used_at=now, idempotency_key=ctx.idempotency_key
   - serialize updated YAML to a tmp file (same directory), fsync, os.rename over original.
7. Release lock; return ApprovalDecision(approval_id, one_time, expires_at).
Atomicity proof (T5b): ≥ 2 concurrent processes call check_and_consume with the same approval_id. Process A acquires LOCK_EX, reads used=false, writes used=true, fsyncs, renames, releases lock. Process B (waiting on flock) acquires lock, re-reads file (now used=true), raises already_consumed. Single winner enforced by the OS file lock, not by Python.
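The locking and atomic-replace mechanics can be sketched with the standard library alone. Several things here are assumptions rather than the planned implementation. JSON stands in for YAML so the block needs no third-party package. Expiry and scope validation are omitted. The function name `consume_once` and the `.lock` sidecar name are hypothetical. The sketch locks a sidecar file instead of the data file because flock is tied to the inode: a process that opened the data file before a rename would otherwise lock and read the replaced inode.

```python
import errno
import fcntl
import json
import os
import time
from pathlib import Path


class ApprovalError(Exception):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def _lock_with_timeout(fd: int, timeout_s: float) -> None:
    """Poll a non-blocking flock, since flock itself takes no timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except OSError as exc:
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                raise
            if time.monotonic() >= deadline:
                raise ApprovalError("approval_locked")
            time.sleep(0.05)


def consume_once(path: Path, approval_id: str, agent: str,
                 timeout_s: float = 5.0) -> dict:
    # The lock file is never renamed, so all processes lock the same inode.
    lock_fd = os.open(str(path) + ".lock", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        _lock_with_timeout(lock_fd, timeout_s)
        # Read by path only after the lock is held.
        doc = json.loads(path.read_text(encoding="utf-8"))
        entry = next((a for a in doc["approvals"] if a["id"] == approval_id), None)
        if entry is None:
            raise ApprovalError("missing")
        if entry["used"]:
            raise ApprovalError("already_consumed")
        entry.update(used=True, used_by=agent)
        tmp = path.with_name(path.name + ".tmp")
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(doc, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when tmp and path share a directory
        return entry
    finally:
        os.close(lock_fd)  # closing the descriptor releases the flock
```

A second call with the same approval id raises ApprovalError with code "already_consumed", which is the single-winner behaviour T5b is meant to prove across processes.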
§14. LarkCore.write implementation plan
Net-new public method on existing LarkCore class:
def write(
    self,
    method: str,
    endpoint: str,
    *,
    json_data: dict | None = None,
    params: dict | None = None,
    timeout: int = _DEFAULT_TIMEOUT,
    idempotency_key: str | None = None,
    client_token_supported: bool = False,
    _audit_cmd: str = "",
) -> dict:
    """Public guarded write entrypoint; the only external caller is SafetyLayer."""
    if client_token_supported and idempotency_key is not None:
        # Lark accepts client_token via JSON body for the supported endpoints (per OQ-8 citation).
        json_data = {**(json_data or {}), "client_token": idempotency_key}
    # Delegates to existing private _request, which enforces whitelist + rate-limit + retry + audit.
    return self._request(
        method=method.upper(),
        endpoint=endpoint,
        json_data=json_data,
        params=params,
        timeout=timeout,
        _audit_cmd=_audit_cmd,
    )
Why this is safe:
- _request already enforces _check_whitelist(method, endpoint). If an endpoint isn't in write:, _request raises EndpointNotAllowed. PATCH2 already maps that to exit 4.
- _request already enforces rate limit, retry, token refresh, request_id generation.
- _request already calls self._audit.log_call(...) on every result, so the existing transport audit fires for guarded writes too (third audit entry per write; documented overlap).
- No change to _request, no change to _check_whitelist, no change to _acquire_rate_limit.
Lint test (tests/test_core_write.py :: test_no_request_or_requests_outside_core):
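import pathlib
import re

ROOT = pathlib.Path(__file__).resolve().parent.parent
# Forbidden outside core.py: any _request( call, and importing requests in either form.
banned = [r"\b_request\(", r"^\s*(?:import|from)\s+requests\b"]
allowed_files = {ROOT / "lark_client/core.py"}


def test_no_request_or_requests_outside_core():
    for p in ROOT.rglob("*.py"):
        # Skip the one allowed module and the test tree itself.
        if p in allowed_files or "tests" in p.parts:
            continue
        text = p.read_text(encoding="utf-8")
        for pat in banned:
            assert not re.search(pat, text, re.M), f"forbidden {pat!r} in {p}"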
The \b_request\( pattern already matches attribute calls such as core._request( and self._request(, because the dot is a non-word character and \b therefore holds before _request. No separate dotted pattern is needed.
§15. tests/conftest.py exact shape
"""S177 test harness conventions — env gate + Base đệm hard assert."""
import os

import pytest

BASE_BUFFER_KEY = "88-phai-cu-base-dem"
BASE_BUFFER_TOKEN = "Nf2bb1ExXaYnlksgoyQl72GNgAc"


def _integration_enabled() -> bool:
    return os.environ.get("LARK_TEST_INTEGRATION") == "1"


@pytest.fixture(autouse=False)
def buffer_base():
    return {"key": BASE_BUFFER_KEY, "app_token": BASE_BUFFER_TOKEN}


def assert_buffer_base_token(token: str) -> None:
    if token != BASE_BUFFER_TOKEN:
        raise AssertionError(
            f"integration test refused: token {token!r} is not Base đệm "
            f"({BASE_BUFFER_TOKEN!r}); production targets are forbidden."
        )


def pytest_collection_modifyitems(config, items):
    if _integration_enabled():
        return
    skip = pytest.mark.skip(reason="LARK_TEST_INTEGRATION!=1")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
pyproject.toml additive entry:
[tool.pytest.ini_options]
markers = [
"integration: live Lark API tests; require LARK_TEST_INTEGRATION=1 + Base đệm token",
]
§16. allowed_endpoints.yaml changes — schema confirmed this round
Observed schema (DISCOVER-FIRST this round):
version: 1
read:
  - method: POST
    path: /open-apis/auth/v3/tenant_access_token/internal
    description: Get tenant access token
  - ...
write: []
Entries are structured objects with method, path, description; path uses {app_token}/{table_id} brace templates. LarkCore._load_whitelist iterates both sections and feeds each entry into _check_whitelist, which regex-matches via _path_to_pattern.
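How a brace template can become an anchored regex is sketched below. The real _path_to_pattern lives in lark_client/core.py and is not reproduced in this package, so this is only a plausible shape. `path_to_pattern` and `is_allowed` are hypothetical names, and the choice of `[^/]+` for each placeholder is an assumption.

```python
import re


def path_to_pattern(template: str) -> re.Pattern:
    """Turn '/apps/{app_token}/tables' into an anchored regex."""
    # The capture group keeps the placeholders in the split result.
    parts = re.split(r"(\{[A-Za-z_]\w*\})", template)
    regex = "".join(
        # One placeholder matches exactly one path segment.
        "[^/]+" if part.startswith("{") and part.endswith("}") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{regex}$")


def is_allowed(entries: list, method: str, path: str) -> bool:
    """True when some entry matches both the method and the concrete path."""
    return any(
        entry["method"] == method.upper()
        and path_to_pattern(entry["path"]).match(path) is not None
        for entry in entries
    )
```

Anchoring with ^ and $ matters: without it, a whitelisted record path would also admit longer paths that merely start with the same prefix.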
Sprint 1 additive entries — append to write::
write:
  - method: POST
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records
    description: Create one record (S177)
  - method: POST
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_create
    description: Batch create records up to ceiling (S177)
  - method: PUT
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id}
    description: Update one record (S177)
  - method: POST
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_update
    description: Batch update records up to ceiling (S177)
  - method: DELETE
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id}
    description: Delete one record (S177)
  - method: POST
    path: /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_delete
    description: Batch delete records up to ceiling (S177)
No schema extension required. client_token support is declared in the sibling config/lark-api-limits.yaml :: write_endpoint_options: (§17) — allowed_endpoints.yaml stays at three fields.
§17. lark-api-limits.yaml handling
New file at config/lark-api-limits.yaml:
version: 1
rate:
  requests_per_sec: 10          # matches existing LarkCore._MAX_RPS
batch:
  record_create_max: 500
  record_update_max: 500        # PATCH2 conservative default; S178 indicates Lark may allow 1000
  record_delete_max: 100        # OQ-5 conservative; raise only after official-doc citation or
                                # Base đệm non-mutating probe
notes:
  record_delete_max: "PATCH2 conservative default until OQ-5 closure with citation/probe."
write_endpoint_options:
  # endpoint_path_template -> per-endpoint metadata. Filled by Sprint 1 Phase 1.2 (OQ-8 citation).
  # Defaults below are conservative until citation lands.
  /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records:
    methods: ["POST"]
    client_token: true          # widely documented; verified at Phase 1.2
  /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_create:
    methods: ["POST"]
    client_token: true          # widely documented; verified at Phase 1.2
  /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id}:
    methods: ["PUT", "DELETE"]
    client_token: false         # conservative; flip to true only with citation
  /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_update:
    methods: ["POST"]
    client_token: true          # widely documented; verified at Phase 1.2
  /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/batch_delete:
    methods: ["POST"]
    client_token: false         # conservative; flip to true only with citation
LarkCore reads write_endpoint_options optionally. Absence means all whitelisted writes are client_token: false — i.e. no behavior regression. Service layer reads the file at factory time and passes the per-endpoint flag through to LarkCore.write(..., client_token_supported=…).
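The "absence means false" rule can be sketched as a small lookup over the parsed write_endpoint_options mapping. The function name `client_token_supported_for` is hypothetical, and the dict literal stands in for what a YAML loader would return at factory time.

```python
from typing import Optional


def client_token_supported_for(options: Optional[dict], method: str,
                               path_template: str) -> bool:
    """Return the per-endpoint flag; anything missing resolves to False."""
    entry = (options or {}).get(path_template)
    if not entry:
        return False  # file absent, section absent, or endpoint not listed
    if method.upper() not in entry.get("methods", []):
        return False  # flag applies only to the listed methods
    return bool(entry.get("client_token", False))


# Stand-in for the parsed write_endpoint_options section (two entries only).
RECORDS = "/open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records"
OPTIONS = {
    RECORDS: {"methods": ["POST"], "client_token": True},
    RECORDS + "/{record_id}": {"methods": ["PUT", "DELETE"], "client_token": False},
}
```

The service layer would pass the result straight through as the client_token_supported argument of LarkCore.write, so a missing file degrades to today's behaviour instead of raising.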
OQ-8 closure step: Phase 1.2 doc-citation MAY flip the false entries to true — but only via a recorded citation under knowledge/dev/lark/s177-controlled-crud-gateway/s177-sprint1-oq8-client-token-citation-2026-05-XX.md. The citation file is itself part of the Sprint 1 commit.
§18. PII handling
config/pii-fields.yaml (Sprint 1 seed):
version: 1
field_registry:
  # one entry per (base_key, table_id, field_id) known to carry PII
  # Example seed, to be expanded from S176 snapshots in Phase 3.4
  - base_key: 88-phai-cu-base-dem
    table_id: tblPQ6N79EeOmnTm
    field_id: fldPLACEHOLDER
    pii_types: ["national_id_cccd"]
    note: "Seed entry; replace with real fld_ from snapshot review"
PatternPIIDetector regex set from PATCH1 §D.2 (national_id_cccd \b\d{12}\b, national_id_cmnd \b\d{9}\b, passport, phone_vn, bank_account, email). The detector returns types + counts only, never substrings, never positions.
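The "types + counts only" contract can be illustrated with the two patterns quoted above. This is a sketch, not the PatternPIIDetector itself. The function name `detect_pii` is hypothetical, and the passport, phone_vn, bank_account and email patterns are left out because their regexes are defined in PATCH1 §D.2 and not reproduced here.

```python
import re
from collections import Counter

# Only the two patterns quoted in this section; the others live in PATCH1 §D.2.
PATTERNS = {
    "national_id_cccd": re.compile(r"\b\d{12}\b"),
    "national_id_cmnd": re.compile(r"\b\d{9}\b"),
}


def detect_pii(values: list) -> dict:
    """Return {pii_type: match_count}; never the matched text or its position."""
    counts = Counter()
    for value in values:
        for pii_type, pattern in PATTERNS.items():
            counts[pii_type] += len(pattern.findall(value))
    # Drop zero counts so the audit metadata lists only detected types.
    return {pii_type: n for pii_type, n in counts.items() if n > 0}
```

Because the return value holds only type names and integers, it can be written into the audit record as redaction_types and redacted_fields_count without leaking any raw value.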
PII policy (OQ-3 folded):
- Guarded record write: PII detection → log metadata to audit (pii_redacted: true, redaction_types: […], redacted_fields_count: N, detector: [registry|pattern]). Write proceeds. No PII raw value in audit, ever.
- Egress path (any of): --export, --out, stdout dump, non-GPG file write, log line. PII detection → SafetyViolation("pii_egress_blocked") → CLI exit 1. T10b verifies.
- Rollback command: always references the encrypted backup path. Never includes record body.
- --pii-strict: NOT shipped in Sprint 1 (OQ-3 deferred).
§19. Orphan backup handling (OQ-7 folded)
No inline auto-delete. PATCH1 §C.6 step 1 ("delete-if-safe") is removed; only the "leave + log" path remains.
Sprint 1 implementation:
- SafetyLayer layer-4 (audit-pre) failure path:

      except AuditWriteError as e:
          if backup_ref is not None:
              audit.log_orphan_backup(
                  ctx,
                  backup_path=backup_ref,
                  key_fingerprint=key_fingerprint,
                  reason="audit_pre_failed",
              )
          raise SafetyViolation("audit_pre_failed") from e

- The .gpg file STAYS in writes/<YYYYMMDD>/. Sprint 1 ships zero os.remove / unlink / shutil.rmtree calls against any path under writes/.
- The orphan log line is metadata-only (no raw PII).
- The sweep command (lark-tool audit orphan-backups list / sweep) is Sprint 4, not Sprint 1.
Operator runbook (Sprint 1 deliverable, KB-only — not a CLI command):
- An operator listing /var/log/lark-ops/orphan-backups.log can reconcile orphan blobs vs the audit-pre stream.
- Grace window 7 days minimum before any eligible deletion (Sprint 4 sweep enforces this).
- Sweep is itself audited via SafetyLayer in Sprint 4.
§20. Rollback / cleanup plan (Sprint 1 implementation rollback)
If Sprint 1 commit ships and a defect surfaces:
- Code-level rollback: git revert <sprint1-commit-sha>. All Sprint 1 changes are additive (new files + additive edits). Reverting restores the pre-Sprint-1 source. The 6 added entries in allowed_endpoints.yaml :: write: would be reverted too; LarkCore.write calls would then fail with EndpointNotAllowed, naturally halting all guarded writes.
- Config-only rollback: if only the YAML is faulty (e.g. wrong endpoint path template), revert the YAML alone; code stays put. LarkCore.write then refuses those endpoints.
- In-flight outstanding state:
  - Encrypted backups in writes/ stay (retained by design).
  - Audit JSONL stays (append-only, no rewrite).
  - Approvals in write-approvals.yaml with used=true stay; operator may re-issue a fresh approval for a re-run.
  - Orphan log entries stay.
- Re-issue of a one-time approval after rollback: operator manually adds a new entry to write-approvals.yaml. The previously consumed entry is not re-flipped; used=true is permanent state.
- No Lark-side rollback. Sprint 1 introduces no auto-rollback (req §12.10). If a guarded write succeeded but the operator wants to undo, the rollback command printed alongside the success outcome must be issued manually, and is itself a separate guarded write going through SafetyLayer.
No service restart, no container rebuild, no MCP reload is needed for Sprint 1 rollback. All changes are file-level; lark-tool re-reads configs at every invocation.
§21. PASS / FAIL / BLOCKED criteria for the implementer
PASS (Sprint 1 implementation complete):
- All 21 net-new files + 7 modified files in §3 land as specified.
- All eight phase gates §4 are green.
- Cold pytest (no env vars): 3 existing tests pass + 2 quarantined skipped + N new unit tests all pass.
- LARK_TEST_INTEGRATION=1 pytest -m integration on Base đệm: 2 existing live tests pass + T1–T12 + b-variants all pass.
- assert_buffer_base_token rejects production token in a forced test (negative coverage).
- No import requests outside lark_client/core.py; no _request( call outside core.py. Lint test green.
- No os.remove / unlink against writes/ paths anywhere in Sprint 1 code.
- OQ-8 citation file uploaded to KB.
- lark-tool records --help, lark-tool records create --help etc. show the expected argparse output.
- lark-tool (cold, no records arg) behavior unchanged from pre-Sprint-1: existing registry / schema / audit paths still pass cold pytest.
FAIL (any of):
- A unit test marked T1–T12 is red.
- A live API call fires during a cold pytest.
- LarkCore._request is called from outside core.py.
- An os.remove / unlink against writes/ exists in Sprint 1 code.
- A guarded write path bypasses any of the 8 SafetyLayer layers.
- The write: whitelist accepts non-record-class endpoints in Sprint 1.
BLOCKED (Sprint 1 actor reports up-the-chain):
- LARK_BACKUP_GPG_PUBKEY GSM secret absent: operator must seed before Sprint 1 proceeds.
- OQ-8 citation step cannot produce authoritative source: escalate; conservative defaults stand.
- The actor lacks shell + repo write to /opt/incomex/lark-client/ (OQ-6 still open in actor's environment).
- Base đệm integration test hits a token assertion (means the live bases.yaml drifted from R0; investigate first).
- Lark Open API rejects a client_token on an endpoint the citation said supported it: flip the flag to false and re-test.
§22. Forbidden actions during Sprint 1
The implementer must not:
- Run any Lark write API call against any base other than 88-phai-cu-base-dem (the staging Base đệm).
- Create a new Lark bot, rotate the existing cli_a785d634437a502f credential, or create a new GSM secret beyond LARK_BACKUP_GPG_PUBKEY (which is already a required Sprint 1 prerequisite, not net-new in implementation).
- Enable any MCP write tool, register a new MCP server, or modify the existing @larksuiteoapi/lark-mcp topology. Sprint 2 owns MCP changes.
- Deploy, restart, or rebuild any container or service.
- git push to a shared branch, git merge to main, git tag a release. The Sprint 1 commit lands on a feature branch and stays there pending PR review.
- Bypass the --no-dry-run + --confirm requirement for update/delete on non-Base-đệm targets.
- Add a --confirm-production flag (or any equivalent) to Sprint 1 CLI. Production delete is a separate later decision.
- Hardcode app_token anywhere outside tests/ and config/bases.yaml.
- Skip the existing 5 tests in test_core.py; only the 2 live-API tests get the @pytest.mark.integration decorator.
- Touch lark_client/__init__.py to re-export the new exceptions unless an existing convention requires it (R0 evidence: __init__.py exports core/reader/registry/audit; exceptions are imported from lark_client.exceptions directly elsewhere; keep this convention).
- Create any S177 KB document outside knowledge/dev/lark/s177-controlled-crud-gateway/.
- Self-advance to Sprint 2 (MCP adapter) without separate authorization.
§23. Expected post-implementation report (shape)
After Sprint 1 lands, the implementer uploads a single KB report at:
knowledge/dev/lark/s177-controlled-crud-gateway/s177-sprint1-implementation-evidence-2026-05-XX.md
Required sections:
- Verdict: PASS / FAIL / BLOCKED.
- Phase gate matrix: 8 rows × {gate, outcome, evidence-link}.
- File diff summary: git diff --stat against the pre-Sprint-1 HEAD; cross-checked against §3 inventory.
- Test matrix: cold pytest results; LARK_TEST_INTEGRATION=1 results; lint test result; coverage figures for new modules.
- OQ-8 closure: link to citation file; per-endpoint client_token final values; any conservative default that stayed false.
- OQ-9 closure: confirms the schema observed this round matches what landed.
- Audit evidence sample: one redacted JSONL planned/result pair + the matching transport-level log_call row (same request_id), plus a sample emergency JSON and a sample orphan-log line.
- Backup evidence sample: a single .gpg header byte sequence (no decrypted content); sidecar .meta.json for the same entry.
- Forbidden-action confirmation: explicit "did not do" list mirroring §22.
- Next-step recommendation: Sprint 2 command-review package authorization request (separate task).
End of S177 Sprint 1 Command Review Package. PASS — package ready for GPT/User review. No code, no source mutation, no Lark write, no deploy, no push/merge/tag, no self-advance.