KB-3A09

S177 — Sprint 1 Round B Implementation Evidence (2026-05-20)

Status: PASS (full B1+B2+B3+B4 completed in one session)
Date: 2026-05-20
Authored by: Claude Code (Opus 4.7, 1M context)
Pairs with:

  • s177-sprint1-round-b-addendum-2026-05-20.md (binding addendum)
  • s177-sprint1-round-a-implementation-evidence-2026-05-19.md (Round A baseline)
  • s177-sprint1-command-review-package-2026-05-19.md (base spec)
  • s177-architecture-design-2026-05-19-patch2.md
  • s177-sprint1-implementation-checklist-2026-05-19.md
  • s177-oq-decision-record-2026-05-19.md

1. Top-line result

  • PASS for all four addendum subset boundaries: B1, B2, B3, B4.
  • 15 files changed Round-B-only, +2,530 / −12 lines on the feature branch.
  • 4 commits stacked on top of Round A: 60e3d80 (B1) → 8527154 (B2) → 4b73375 (B3) → 1cf7901 (B4).
  • Cold pytest: 82 passed, 5 skipped, 0 failed. All 5 skipped are the live-API tests, still gated by LARK_TEST_INTEGRATION=1. Zero failures — including the 2 pre-existing CLI smoke --json failures that survived Round A (root-caused and fixed in B1: argparse subparser dest collision).
  • No live Lark API call during cold pytest.
  • No production data touched. No deploy. No push/merge/tag. No MCP write enablement. No new bot. No credential rotation. No secret printed.
  • config/bases.yaml SHA256 unchanged: c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d before and after Round B (addendum §6).

2. Repo / branch / HEAD

| Item | Value |
| --- | --- |
| Branch | feat/s177-sprint1-round-a (continued from Round A; NOT pushed) |
| Round A base | 06b11c0 |
| Round B B1 commit | 60e3d80: baseline fixes + request-leak cleanup |
| Round B B2 commit | 8527154: ApprovalProvider + tests |
| Round B B3 commit | 4b73375: SafetyLayer skeleton + tests |
| Round B B4 commit | 1cf7901: LarkWriteService + CLI records surface |
| Remote ops | none (no push, no merge, no tag) |

3. Subset boundary B1 — baseline fixes + request-leak cleanup

3.1 Scope delivered

  • LarkCore.read(...) public read entrypoint added to core.py (mirrors LarkCore.write minus client_token injection; one additive method).
  • scripts/s179_probe.py refactored to call core.read(...) instead of core._request(...). Behavior unchanged — only the call surface moved to the sanctioned public path. Docstring reworded to avoid lint-regex false-positive.
  • Dropped scripts/ exclusion in test_no_request_or_requests_outside_core. The lint now applies to scripts/ too.
  • cli/lark_tool.py argparse fix. The pre-existing CLI smoke failures (test_registry_list_json, test_registry_show_json) were not caused by binary path precedence — they were a subparser dest collision. The shared-args helper redefined --json / --agent on each subparser, so argparse overwrote the top-level value with the subparser default (False/None) when the flag was given pre-subcommand (e.g. lark-tool --json registry list). Fix: pass default=argparse.SUPPRESS on the subparser-level definitions; both pre- and post-subcommand positions now work.
  • tests/test_cli_smoke.py path-pinned to .venv/bin/lark-tool (defense-in-depth so the stale system-wide /usr/local/bin/lark-tool cannot influence test outcomes).
  • WriteContext.source plumbing in lark_client/audit.py.
    • _MASK_SKIP_KEYS extended from 19 → 20 keys (added "source").
    • New WRITE_CONTEXT_SOURCES = ("cli", "mcp", "cron", "api") enum exposed.
    • New _coerce_source(ctx) helper — unknown values fall back to "cli" per addendum §5.
    • All four entry builders (_build_planned_entry, _build_result_entry, _build_emergency_entry, _build_orphan_entry) include source.
  • tests/test_round_b1.py — 12 new tests covering all of the above.
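
The dest collision behind the two CLI smoke failures can be reproduced in isolation. The sketch below is a minimal stand-in, not the actual cli/lark_tool.py code: build_parser and its suppress switch are hypothetical names, and only the --json flag is modeled. It relies on standard argparse behavior, where a subparser parses into its own namespace and then copies every attribute it set, defaults included, over the top-level namespace.

```python
import argparse


def build_parser(suppress: bool) -> argparse.ArgumentParser:
    """Toy parser; `suppress` toggles the fix (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="lark-tool")
    parser.add_argument("--json", action="store_true")
    subparsers = parser.add_subparsers(dest="command")
    registry = subparsers.add_parser("registry")
    registry.add_argument("action", choices=["list", "show"])
    # Without SUPPRESS, the subparser default (False) is copied over the
    # value the top-level parser already stored under the same dest.
    extra = {"default": argparse.SUPPRESS} if suppress else {}
    registry.add_argument("--json", action="store_true", **extra)
    return parser


pre = ["--json", "registry", "list"]    # flag before the subcommand
post = ["registry", "list", "--json"]   # flag after the subcommand

broken_pre = build_parser(suppress=False).parse_args(pre).json
fixed_pre = build_parser(suppress=True).parse_args(pre).json
fixed_post = build_parser(suppress=True).parse_args(post).json
print(broken_pre, fixed_pre, fixed_post)
```

With the plain subparser default the pre-subcommand flag is lost; with default=argparse.SUPPRESS the subparser only writes the attribute when the flag is actually given, so both positions keep working.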

3.2 B1 cold pytest result

44 passed, 5 skipped in 2.77s

(Previously: 31 passed, 5 skipped, 2 failed — i.e., B1 closed the 2 pre-existing failures and added 12 new green tests.)


4. Subset boundary B2 — ApprovalProvider + tests

4.1 Scope delivered

lark_client/approval.py adds:

  • Approval dataclass — full row shape from write-approvals.yaml :: approvals[] with exempt_synthetic flag distinguishing real vs. exempt-bypass approvals.
  • ApprovalProvider(yaml_path):
    • is_base_exempt(base_key) — O(n) lookup against approval_exempt_bases.
    • get(approval_id) — read-only lookup; no mutation.
    • check_and_consume(approval_id, ctx, idempotency_key):
      • Exempt-base bypass: if ctx.base_key is in approval_exempt_bases, returns a synthetic Approval(id="EXEMPT", exempt_synthetic=True) without touching the YAML.
      • Validation: operation match, scope match (base_key/table_id/record_id), wildcard-forbidden on update/delete without explicit base_key, expiry check.
      • One-time-use atomic consume: takes an exclusive fcntl.flock on a sidecar .lock file (the YAML itself gets atomically swapped via os.replace, so the sidecar survives the swap); rewrites YAML with tmp + fsync + rename.
      • Raises ApprovalError(code=…) with one of the codes from the package §10 enum: missing, expired, scope_mismatch, wildcard_forbidden, already_consumed.
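
The one-time-use consume described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of lark_client/approval.py: it stores approvals as JSON rather than YAML to stay dependency-free, it is POSIX-only because of fcntl, and the names sidecar_lock, atomic_write, consume and AlreadyConsumed are hypothetical.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


class AlreadyConsumed(Exception):
    """Stand-in for ApprovalError(code="already_consumed")."""


@contextmanager
def sidecar_lock(path):
    # Lock a separate file: the data file is swapped by os.replace, so a
    # lock held on the data file itself would not survive the swap.
    with open(path + ".lock", "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def atomic_write(path, data):
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())    # data on disk before the rename
        os.replace(tmp_path, path)    # atomic swap on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)       # never leave a .tmp file behind
        raise


def consume(path, approval_id):
    with sidecar_lock(path):
        with open(path) as f:
            store = json.load(f)
        row = store[approval_id]
        if row["used"]:
            raise AlreadyConsumed(approval_id)
        row["used"] = True
        atomic_write(path, store)
```

Reading, checking and rewriting all happen while the exclusive lock is held, which is what lets exactly one of several racing callers win.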

tests/test_approval.py — 17 tests, all green:

  • Exempt-base passthrough (3): list lookup, production base rejection, exempt-base consume that does NOT touch YAML.
  • Refusal cases (5): missing id, missing on non-exempt, expired, scope_mismatch on base_key, scope_mismatch on operation, wildcard_forbidden on delete without scope.
  • Happy path (3): consume marks used + persists; second consume raises already_consumed; get() does not mutate; get() returns None for unknown.
  • Concurrency (1): 4 threads race the same approval id; exactly one wins, the other three raise already_consumed.
  • YAML hygiene (3): no .tmp leftover, sidecar lockfile separate from YAML, missing YAML → empty default.
  • Smoke (2): missing-yaml default, lockfile-separate.

4.2 B2 cold pytest result (file-scoped)

17 passed in 0.57s

5. Subset boundary B3 — SafetyLayer skeleton + tests

5.1 Scope delivered

lark_client/safety.py adds:

  • WriteContext dataclass with __post_init__ validation:
    • Rejects source not in WRITE_CONTEXT_SOURCES enum with SafetyViolation(reason="agent_required").
    • Auto-fills idempotency_key (UUID v4) and operation_id if not set.
    • Auto-computes target_count from targets.
  • WriteOutcome dataclass — full evidence trail: status, request_id, lark_response_meta, duration_ms, pii, error, idempotency_key, audit_pre_id, approval_id, backup_ref, would_write, safety_checks_passed.
  • SafetyLayer class with injectable dependencies (approval_provider, audit, gpg_backup, pii_scanner, lock_acquirer, api_caller). Default backends ship as stubs so the orchestration is unit-testable without network / real GPG / real PII scanner.
  • SafetyLayer.guard(ctx) runs the layers in PATCH2 §P2-3 order:
    • Layer 0 (preflight): agent + op enum check.
    • Layer 1 (dry-run / confirm gate): raises SafetyViolation(reason="confirm_required") if neither flag set.
    • Layer 2 (approval): approval.check_and_consume(ctx) — exempt-base path returns synthetic.
    • Layer 3 (backup): update/delete only; calls gpg_backup(ctx, payload) → backup_ref. Create/batch_create skip.
    • Layer 4 (audit planned): audit.log_write_planned(...). If it raises AuditWriteError, attempt the orphan-backup log first, then re-raise.
    • Layer 5 (lock): lock_acquirer(ctx) context manager.
    • Layer 7 (PII scan): pii_scanner(ctx, payload).
    • Dry-run short-circuit: if ctx.dry_run, emit WriteOutcome(status="dry_run", would_write=…) and call log_write_result for trail completeness; API never called.
    • Layer 8 (API call): api_caller(ctx). LarkAPIError/PartialFailureError → log_write_result(failed) → re-raise. Success → log_write_result; if that fails, log_write_emergency + outcome.error = "audit_post_degraded".
    • Layer 10 (cleanup): lock release via context-manager __exit__.
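
The layer ordering above, including the dry-run short-circuit, can be sketched as a single function with injected callables. This is a heavily reduced illustration, not SafetyLayer.guard itself: Ctx and guard are hypothetical, only the exempt-base approval path is modeled, and the preflight, audit and PII steps are recorded without doing any real work.

```python
from contextlib import nullcontext
from dataclasses import dataclass, field


class SafetyViolation(Exception):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass
class Ctx:  # hypothetical, much smaller than the real WriteContext
    op: str
    dry_run: bool = False
    confirm: bool = False
    payload: dict = field(default_factory=dict)


def guard(ctx, api_caller, backup=lambda ctx: "stub-backup-ref",
          lock=nullcontext):
    passed = ["preflight"]  # agent + op enum checks elided in this sketch
    if not (ctx.dry_run or ctx.confirm):
        raise SafetyViolation("confirm_required")
    passed.append("dry_run_or_confirm")
    approval_id = "EXEMPT"  # exempt-base path only; real consume elided
    passed.append("approval")
    backup_ref = None
    if ctx.op in ("record.update", "record.delete"):
        backup_ref = backup(ctx)
        passed.append("backup")
    passed.append("audit_planned")  # audit write elided
    with lock():
        passed.append("lock")
        passed.append("pii_scan")  # scanner elided
        if ctx.dry_run:
            # Short-circuit: the API caller is never invoked.
            return {"status": "dry_run", "would_write": ctx.payload,
                    "safety_checks_passed": passed}
        api_caller(ctx)
        passed.append("api_call")
    return {"status": "success", "approval_id": approval_id,
            "backup_ref": backup_ref, "safety_checks_passed": passed}
```

Because every backend is passed in, a test can hand over a list's append method or a function that fails on call, and then read the ordering back from safety_checks_passed.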

tests/test_safety.py — 13 tests, all green. Key cases:

  • The addendum-§4 critical test (test_exempt_base_bypasses_approval_but_runs_all_other_layers): asserts via mock-call counters that approval_id == "EXEMPT" AND backup ran AND PII scanned AND lock acquired AND API called AND audit planned/success both logged. safety_checks_passed trail shows: ["preflight", "dry_run_or_confirm", "approval", "backup", "audit_planned", "lock", "pii_scan", "api_call"].
  • Preflight errors (agent_required, unknown op).
  • Confirm-required gate.
  • WriteContext.__post_init__ rejects invalid source.
  • Exempt-base dry-run preview shape (no API call asserted via pytest.fail in fake api_caller).
  • Non-exempt base consumes real approval and persists.
  • Non-exempt without approval → ApprovalError(missing).
  • Audit-pre failure → orphan-backup log captures reason="audit_pre_failed".
  • Audit-post failure → emergency file written, outcome marked audit_post_degraded.
  • LarkAPIError propagates while still emitting phase=failed audit entry.
  • Source field propagates into all audit entries (source="mcp" end-to-end).
  • Create op skips GPG backup (only update/delete trigger it).

5.2 B3 cold pytest result (file-scoped)

13 passed in 0.54s

Full suite after B3: 74 passed, 5 skipped, 0 failed.


6. Subset boundary B4 — LarkWriteService + CLI records surface

6.1 Scope delivered

lark_client/service.py

  • LarkWriteService façade holding (core, reader, safety_layer, registry).
  • resolve_base(base_key) returns dict from registry or raises UnknownBaseError.
  • get_record(base_key, table_id, record_id): fully implemented read path. Resolves base via registry, builds endpoint, calls core.read("GET", …, _audit_cmd="records.get"). No SafetyLayer is involved: reads pass through the existing whitelist + audit.
  • create / update / delete / batch_create / batch_update / batch_delete — each accepts a WriteContext, sets ctx.op to the operation, sets ctx.is_buffer_base from registry role, and dispatches to SafetyLayer.guard(ctx).
  • NotAuthorizedInRoundB(SafetyViolation) — sentinel raised by the Round B default api_caller. Subclasses SafetyViolation so error mapping stays consistent. Message clearly states "real write execution not authorized in Round B" per addendum §1.

cli/records.py — 7 subcommands

  • records get <base-key> <table-id> <record-id> — fully implemented per addendum §3. Emits structured success JSON per addendum §1.
  • records create / update / delete / batch_create / batch_update / batch_delete — stubs that enter SafetyLayer:
    • Without --dry-run AND without --confirm: SafetyLayer raises SafetyViolation(reason="confirm_required") → addendum-§2 error JSON, exit 4.
    • With --dry-run: SafetyLayer returns WriteOutcome(status="dry_run", would_write=…) → addendum-§1 dry-run JSON, exit 0.
    • With --confirm: SafetyLayer reaches api_caller which raises NotAuthorizedInRoundB → addendum-§2 error JSON with error_type="NotAuthorizedInRoundB", exit 4.

Exception mapping in the CLI:

| Exception | exit_code | error_type | reason | retry_after_seconds |
| --- | --- | --- | --- | --- |
| UnknownBaseError | 1 | UnknownBaseError | | |
| ApprovalError | 1 | ApprovalError | .code enum value | |
| SafetyViolation | 4 | class name | .reason enum value | |
| NotAuthorizedInRoundB | 4 | NotAuthorizedInRoundB | confirm_required | |
| LarkAPIError | 2 | LarkAPIError | | 5 if 429, else None |
| PartialFailureError | 2 | PartialFailureError | | |
| EndpointNotAllowed | 4 | EndpointNotAllowed | | |
| RateLimitExceeded | 2 | RateLimitExceeded | | 1 |
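
A mapping like the one above can be implemented as an ordered isinstance walk that produces the structured error shape shown in §7. The sketch below covers four of the eight rows. The exception classes are toy re-declarations, and to_error_json is a hypothetical name rather than the function in cli/records.py.

```python
class UnknownBaseError(Exception):
    pass


class ApprovalError(Exception):
    def __init__(self, code, message=""):
        super().__init__(message or code)
        self.code = code


class SafetyViolation(Exception):
    def __init__(self, reason, message=""):
        super().__init__(message or reason)
        self.reason = reason


class NotAuthorizedInRoundB(SafetyViolation):
    """Subclass, so it inherits the SafetyViolation exit code."""


class LarkAPIError(Exception):
    def __init__(self, status_code, message=""):
        super().__init__(message)
        self.status_code = status_code


EXIT_CODES = [(UnknownBaseError, 1), (ApprovalError, 1),
              (SafetyViolation, 4), (LarkAPIError, 2)]


def to_error_json(exc):
    for cls, exit_code in EXIT_CODES:
        if isinstance(exc, cls):
            break
    else:
        raise exc  # unmapped exceptions are not handled in this sketch
    if isinstance(exc, SafetyViolation):
        reason = exc.reason
    elif isinstance(exc, ApprovalError):
        reason = exc.code
    else:
        reason = None
    is_api = isinstance(exc, LarkAPIError)
    return {
        "ok": False, "error": True, "exit_code": exit_code,
        "error_type": type(exc).__name__, "message": str(exc),
        "reason": reason,
        "retry_after_seconds": 5 if is_api and exc.status_code == 429 else None,
        "operation_id": None, "audit_ref": None,
        "details": {"status_code": exc.status_code} if is_api else {},
    }
```

Using type(exc).__name__ for error_type is what lets the sentinel report itself as NotAuthorizedInRoundB while still taking exit code 4 from its parent class.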

cli/lark_tool.py

  • Imports cli.records and calls records_cli.register(subparsers, _add_common) after the existing subparsers.
  • Dispatches args.command == "records" → records_cli.dispatch(args).

config/allowed_endpoints.yaml

  • New read: entry — GET /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id} — single-record fetch (addendum §3 explicit allowance). Schema preserved per OQ-9: {method, path, description}.
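
How the whitelist matches a concrete request path against a templated entry is not described in this document. One plausible approach, shown purely as an assumption, is to compile each {placeholder} segment into a single-segment regex. ALLOWED_READ, template_to_regex and is_allowed are hypothetical names.

```python
import re

# Mirrors the {method, path, description} schema of the new read entry.
ALLOWED_READ = [
    {
        "method": "GET",
        "path": "/open-apis/bitable/v1/apps/{app_token}"
                "/tables/{table_id}/records/{record_id}",
        "description": "single-record fetch",
    },
]


def template_to_regex(template: str) -> "re.Pattern[str]":
    # Split on {placeholders}, keeping them thanks to the capture group.
    parts = re.split(r"(\{[^{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{pattern}$")


def is_allowed(method: str, path: str) -> bool:
    return any(
        entry["method"] == method.upper()
        and template_to_regex(entry["path"]).match(path) is not None
        for entry in ALLOWED_READ
    )
```

Anchoring the pattern and restricting placeholders to one path segment keeps a longer or deeper URL from slipping through on a prefix match.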

tests/test_cli_records.py

8 in-process tests via cli.lark_tool.main(argv=…) — no subprocess, no live network:

  • records get success structured JSON shape.
  • records get unknown base → structured UnknownBaseError.
  • records get LarkAPIError → structured error with status_code in details.
  • records create without --dry-run/--confirm → structured SafetyViolation with reason="confirm_required", exit 4.
  • records create --dry-run → structured preview JSON (addendum §1 shape: ok, mode, operation, would_write, approval_status, safety_checks_passed, operation_id, audit_ref).
  • records update --confirm → structured NotAuthorizedInRoundB error.
  • records delete --dry-run on non-exempt base → ApprovalError(missing) (approval check happens before dry-run short-circuits).
  • bases.yaml protection acknowledgement.
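
The in-process style used by these tests can be sketched as follows. The main function here is a toy with one subcommand and one flag, standing in for cli.lark_tool.main; run_cli is a hypothetical helper showing how stdout and the exit code might be captured without a subprocess.

```python
import argparse
import io
import json
from contextlib import redirect_stdout


def main(argv=None) -> int:
    """Toy CLI: prints one JSON document and returns the exit code."""
    parser = argparse.ArgumentParser(prog="lark-tool")
    sub = parser.add_subparsers(dest="command", required=True)
    create = sub.add_parser("create")
    create.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)
    if not args.dry_run:
        print(json.dumps({"ok": False, "exit_code": 4,
                          "reason": "confirm_required"}))
        return 4
    print(json.dumps({"ok": True, "mode": "dry_run"}))
    return 0


def run_cli(argv):
    # Capture stdout in memory instead of spawning a subprocess.
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        exit_code = main(argv)
    return exit_code, json.loads(buffer.getvalue())
```

Calling main with an explicit argv keeps the test independent of whichever lark-tool binary is on PATH, which complements the path-pinning done for the subprocess-based smoke tests.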

6.2 B4 cold pytest result (file-scoped)

8 passed in 0.47s

Full suite after B4: 82 passed, 5 skipped, 0 failed.


7. Sample structured output

7.1 records get — success

{
  "ok": true,
  "mode": "live",
  "operation": "record.get",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "record_id": "recABCDEFGHIJKLMN",
  "data": { "code": 0, "data": { "record": { "fields": { "x": 1 } } } },
  "warnings": []
}

7.2 records create --dry-run — preview (matches addendum §1)

{
  "ok": true,
  "mode": "dry_run",
  "operation": "record.create",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "record_id": null,
  "would_write": { "fields": { "name": "x" } },
  "diff": [],
  "approval_status": "exempt",
  "safety_checks_passed": [
    "preflight", "dry_run_or_confirm", "approval",
    "audit_planned", "lock", "pii_scan"
  ],
  "warnings": [],
  "operation_id": "<uuid-v4>",
  "audit_ref": "<uuid-v4>"
}

7.3 records update --confirm — refusal (matches addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 4,
  "error_type": "NotAuthorizedInRoundB",
  "message": "confirm_required: real write execution not authorized in Round B for op='record.update'; use --dry-run for a structured preview",
  "reason": "confirm_required",
  "retry_after_seconds": null,
  "operation_id": "<uuid-v4>",
  "audit_ref": "<uuid-v4>",
  "details": {}
}

7.4 records create (no flag) — confirm-required (matches addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 4,
  "error_type": "SafetyViolation",
  "message": "confirm_required: mutation requires either --dry-run preview or --confirm",
  "reason": "confirm_required",
  "retry_after_seconds": null,
  "operation_id": null,
  "audit_ref": null,
  "details": {}
}

7.5 records get — LarkAPIError with retry hint (addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 2,
  "error_type": "LarkAPIError",
  "message": "Lark API error 99991663: Too Many Requests (HTTP 429)",
  "reason": null,
  "retry_after_seconds": 5,
  "operation_id": null,
  "audit_ref": null,
  "details": { "status_code": 429, "code": 99991663, "msg": "Too Many Requests" }
}

8. Sample audit entry showing source field (addendum §5)

End-to-end run of LarkWriteService.create(ctx) with ctx.source="mcp" produced the following audit entry (timestamps + pid stripped for readability; UUIDs truncated):

{
  "phase": "success",
  "audit_pre_id": "d77227d1…",
  "operation_id": "533c99b3…",
  "idempotency_key": "533c99b3…",
  "agent": "sample-evidence",
  "source": "mcp",
  "cmd": "",
  "op": "record.create",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "request_id": "",
  "lark_response_meta": {},
  "duration_ms": 0,
  "pii": {
    "pii_redacted": false,
    "redaction_types": [],
    "redacted_fields_count": 0,
    "detector": ["stub"]
  },
  "outcome_status": "dry_run",
  "error": null
}

The source field appears in the entry, is not masked (the _MASK_SKIP_KEYS carve-out is honored — see B1 test test_source_value_is_not_masked), and unknown values fall back to "cli" (B1 test test_audit_entries_reject_invalid_source_value).
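
The fallback and the masking carve-out can be illustrated together. This sketch is not the code in lark_client/audit.py: the real _coerce_source takes a ctx, the real skip set has 20 keys, and the masking rule used here (replace every string value with "***") is a placeholder chosen only to make the carve-out visible.

```python
WRITE_CONTEXT_SOURCES = ("cli", "mcp", "cron", "api")

# Assumption: only two of the skip keys are modeled here.
MASK_SKIP_KEYS = {"phase", "source"}


def coerce_source(value):
    """Unknown or missing values fall back to "cli"."""
    return value if value in WRITE_CONTEXT_SOURCES else "cli"


def mask(value):
    # Placeholder masking rule, not the real one.
    return "***" if isinstance(value, str) else value


def build_entry(phase, source, **fields):
    entry = {"phase": phase, "source": coerce_source(source), **fields}
    return {
        key: value if key in MASK_SKIP_KEYS else mask(value)
        for key, value in entry.items()
    }
```

Coercing before masking means the audit log always carries one of the four known source values in clear text, whatever the caller supplied.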


9. Approval-exempt-base test transcript (addendum §4)

tests/test_safety.py::test_exempt_base_bypasses_approval_but_runs_all_other_layers proves the addendum §4 invariant. Mock-call counters in the test confirm:

| Layer | Was it invoked? |
| --- | --- |
| preflight | ✅ recorded in safety_checks_passed |
| dry_run / confirm gate | confirmed=True path |
| approval check_and_consume | ✅ returned synthetic Approval(id="EXEMPT") |
| gpg_backup | len(backup_calls) == 1 |
| audit log_write_planned | ✅ planned-phase entry in audit log |
| lock_acquirer | len(lock_calls) == 1 |
| pii_scanner | len(pii_calls) == 1 |
| api_caller | len(api_calls) == 1 |
| audit log_write_result | ✅ success-phase entry in audit log |

outcome.approval_id == "EXEMPT" and outcome.status == "success". The test passes.

The orthogonal test test_exempt_base_dry_run_returns_structured_preview additionally asserts that the dry-run preview path on an exempt base STILL runs preflight + approval check (synthetic) + audit_planned + lock + PII scan — only the API call is skipped. The pytest.fail callback wired into api_caller proves the API was never reached.


10. config/bases.yaml integrity (addendum §6)

| When | SHA256 |
| --- | --- |
| Before Round B (after Round A) | c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d |
| After all four Round B commits | c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d |

Unchanged. Every Round B test either uses MagicMock(Registry) or writes a synthetic bases.yaml into tmp_path. The production registry was not touched, not normalized, not edited.
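
A before/after digest comparison of this kind needs nothing beyond hashlib. The helper below is a generic sketch (sha256_of is a hypothetical name). The document does not say which tool produced the digests above.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording sha256_of("config/bases.yaml") before the first commit and again after the last one, then comparing the two strings, is enough to show the file was not modified in between.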


11. git diff --stat (Round B only)

 lark-client/cli/lark_tool.py              |  20 +-
 lark-client/cli/records.py                | 443 +++++++++++++++++++++++++++++
 lark-client/config/allowed_endpoints.yaml |   3 +
 lark-client/lark_client/approval.py       | 308 +++++++++++++++++++++
 lark-client/lark_client/audit.py          |  21 ++
 lark-client/lark_client/core.py           |  27 ++
 lark-client/lark_client/safety.py         | 387 +++++++++++++++++++++++++
 lark-client/lark_client/service.py        | 137 +++++++++
 lark-client/scripts/s179_probe.py         |   8 +-
 lark-client/tests/test_approval.py        | 283 ++++++++++++++++++
 lark-client/tests/test_cli_records.py     | 326 ++++++++++++++++++++++
 lark-client/tests/test_cli_smoke.py       |  13 +-
 lark-client/tests/test_core_write.py      |  10 +-
 lark-client/tests/test_round_b1.py        | 173 ++++++++++++
 lark-client/tests/test_safety.py          | 383 ++++++++++++++++++++++++++
 15 files changed, 2530 insertions(+), 12 deletions(-)

Commits stacked on the same feature branch: 60e3d80 (B1) → 8527154 (B2) → 4b73375 (B3) → 1cf7901 (B4). No force-push, no rebase, no merge.


12. Test totals

| Subset | New tests | Cold-pytest delta |
| --- | --- | --- |
| Round A baseline | | 31 pass / 5 skip / 2 fail |
| B1 | +12 (test_round_b1) | 44 pass / 5 skip / 0 fail (closed pre-existing CLI smoke) |
| B2 | +17 (test_approval) | 61 pass / 5 skip / 0 fail |
| B3 | +13 (test_safety) | 74 pass / 5 skip / 0 fail |
| B4 | +8 (test_cli_records) | 82 pass / 5 skip / 0 fail |

Full-suite reproducibility:

ssh contabo 'cd /opt/incomex/lark-client && env -u LARK_TEST_INTEGRATION .venv/bin/pytest -q'

13. Confirmations (addendum §8 inheritance + base-package §22)

  • No live Lark API call: all tests use mocks; dry-run / refuse paths block any path to LarkCore._request.
  • No production touched: production base 88-phai-cu (YSIkb8PxOaNaozs2vwalOOcagkf) was never contacted; the buffer base (Base đệm) is mentioned only in test fixtures.
  • No deploy: no service restart, no Docker action, no nginx reload.
  • No push / merge / tag: feat/s177-sprint1-round-a remains local.
  • No MCP write enablement: _round_b_refusing_api_caller in cli/records.py ensures every mutating CLI invocation raises before reaching Lark, even with --confirm. The MCP adapter is Sprint 2 scope.
  • No new bot, no credential rotation, no secret printed: gcloud secrets list --filter='name~LARK' was re-run only to re-confirm LARK_BACKUP_GPG_PUBKEY is still absent (status unchanged from Round A); no secret was read or accessed.
  • No KB docs created outside the s177 folder.
  • No self-advance to Sprint 2.
  • config/bases.yaml SHA256 unchanged: see §10.

14. Outstanding items / recommendation for follow-on rounds

The four subsets shipped, but the following are still gated and should land before any Sprint 2 (MCP adapter) authorization:

  1. GSM provisioning of LARK_BACKUP_GPG_PUBKEY. Still absent. SafetyLayer ships with a stub _default_gpg_backup that returns a synthetic backup_ref; the real GPG backend cannot ship until the public key lands in GSM (github-chatgpt-ggcloud project).
  2. OQ-8 documentation citation. lark-api-limits.yaml :: write_endpoint_options still carries conservative defaults. A doc citation pass should flip client_token: false → true for endpoints where Lark official docs confirm support.
  3. Real PII scanner. Round B ships a stub _default_pii_scanner returning empty metadata. The real scanner (S176 snapshot replay + regex registry) should land before any real write, even on Base đệm.
  4. Real lock acquirer. Round B uses _no_lock. The real per-record lock (file-based or Redis) is necessary before concurrent CLI invocations against a single record can be safe.
  5. Sprint 2 authorization. MCP adapter, once the above are in.
  6. Optional polish. LarkWriteService.create and its sibling methods could be refactored to take named arguments rather than mutating ctx.op. The current shape works and is testable, so it was left as-is.

15. STOP

Round B execution complete. Round B is now the baseline for any further Sprint 1 work and for a Sprint 2 authorization decision. No further self-advance.
