S177 — Sprint 1 Round B Implementation Evidence

Status: PASS. Full B1+B2+B3+B4 completed in one session.
Date: 2026-05-20
Authored by: Claude Code (Opus 4.7, 1M context)
Pairs with:

s177-sprint1-round-b-addendum-2026-05-20.md (binding addendum)
s177-sprint1-round-a-implementation-evidence-2026-05-19.md (Round A baseline)
s177-sprint1-command-review-package-2026-05-19.md (base spec)
s177-architecture-design-2026-05-19-patch2.md
s177-sprint1-implementation-checklist-2026-05-19.md
s177-oq-decision-record-2026-05-19.md

1. Top-line result

PASS for all four addendum subset boundaries: B1, B2, B3, B4.
15 files changed Round-B-only, +2,530 / −12 lines on the feature branch.
4 commits stacked on top of Round A: 60e3d80 (B1) → 8527154 (B2) → 4b73375 (B3) → 1cf7901 (B4).
Cold pytest: 82 passed, 5 skipped, 0 failed. All 5 skipped tests are the live-API tests, still gated by LARK_TEST_INTEGRATION=1. The 2 pre-existing CLI smoke --json failures that survived Round A now pass: B1 root-caused them to an argparse subparser dest collision and fixed it.
No live Lark API call during cold pytest.
No production data touched. No deploy. No push/merge/tag. No MCP write enablement. No new bot. No credential rotation. No secret printed.
config/bases.yaml SHA256 unchanged — c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d before and after Round B (addendum §6).

2. Repo / branch / HEAD

Item	Value
Branch	`feat/s177-sprint1-round-a` (continued from Round A; NOT pushed)
Round A base	`06b11c0`
Round B B1 commit	`60e3d80` — baseline fixes + request-leak cleanup
Round B B2 commit	`8527154` — ApprovalProvider + tests
Round B B3 commit	`4b73375` — SafetyLayer skeleton + tests
Round B B4 commit	`1cf7901` — LarkWriteService + CLI records surface
Remote ops	none — no push, no merge, no tag

3. Subset boundary B1 — baseline fixes + request-leak cleanup

3.1 Scope delivered

LarkCore.read(...) public read entrypoint added to core.py (mirrors LarkCore.write minus client_token injection; one additive method).
scripts/s179_probe.py refactored to call core.read(...) instead of core._request(...). Behavior unchanged — only the call surface moved to the sanctioned public path. Docstring reworded to avoid lint-regex false-positive.
Dropped scripts/ exclusion in test_no_request_or_requests_outside_core. The lint now applies to scripts/ too.
cli/lark_tool.py argparse fix. The pre-existing CLI smoke failures (test_registry_list_json, test_registry_show_json) were not caused by binary path precedence — they were a subparser dest collision. The shared-args helper redefined --json / --agent on each subparser, so argparse overwrote the top-level value with the subparser default (False/None) when the flag was given pre-subcommand (e.g. lark-tool --json registry list). Fix: pass default=argparse.SUPPRESS on the subparser-level definitions; both pre- and post-subcommand positions now work.
tests/test_cli_smoke.py path-pinned to .venv/bin/lark-tool (defense-in-depth so the stale system-wide /usr/local/bin/lark-tool cannot influence test outcomes).
WriteContext.source plumbing in lark_client/audit.py.
- _MASK_SKIP_KEYS extended from 19 → 20 keys (added "source").
- New WRITE_CONTEXT_SOURCES = ("cli", "mcp", "cron", "api") enum exposed.
- New _coerce_source(ctx) helper — unknown values fall back to "cli" per addendum §5.
- All four entry builders (_build_planned_entry, _build_result_entry, _build_emergency_entry, _build_orphan_entry) include source.
tests/test_round_b1.py — 12 new tests covering all of the above.
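The argparse collision fixed in cli/lark_tool.py can be reproduced in a few lines. The sketch below is a minimal illustration, not the project's parser: the registry subcommand and the --json / --agent flags come from the description above, while build_parser and the positional verb are names assumed for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Top-level parser plus one subcommand that re-declares the shared flags."""
    parser = argparse.ArgumentParser(prog="lark-tool")
    parser.add_argument("--json", action="store_true", default=False)
    parser.add_argument("--agent", default=None)

    subparsers = parser.add_subparsers(dest="command")
    registry = subparsers.add_parser("registry")
    registry.add_argument("verb", choices=["list", "show"])

    # SUPPRESS means: set no attribute unless the flag is actually given at
    # this level, so the subparser cannot overwrite a value parsed earlier.
    registry.add_argument("--json", action="store_true", default=argparse.SUPPRESS)
    registry.add_argument("--agent", default=argparse.SUPPRESS)
    return parser


parser = build_parser()
pre = parser.parse_args(["--json", "registry", "list"])    # flag before subcommand
post = parser.parse_args(["registry", "list", "--json"])   # flag after subcommand
neither = parser.parse_args(["registry", "list"])          # flag absent
print(pre.json, post.json, neither.json)
```

Without default=argparse.SUPPRESS on the subparser copies, the subparser's own defaults are copied into the final namespace, which is how a flag given before the subcommand ends up reset.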

3.2 B1 cold pytest result

44 passed, 5 skipped in 2.77s

(Previously: 31 passed, 5 skipped, 2 failed — i.e., B1 closed the 2 pre-existing failures and added 12 new green tests.)

4. Subset boundary B2 — ApprovalProvider + tests

4.1 Scope delivered

lark_client/approval.py adds:

Approval dataclass — full row shape from write-approvals.yaml :: approvals[] with exempt_synthetic flag distinguishing real vs. exempt-bypass approvals.
ApprovalProvider(yaml_path):
- is_base_exempt(base_key) — O(n) lookup against approval_exempt_bases.
- get(approval_id) — read-only lookup; no mutation.
- check_and_consume(approval_id, ctx, idempotency_key):
  - Exempt-base bypass: if ctx.base_key is in approval_exempt_bases, returns a synthetic Approval(id="EXEMPT", exempt_synthetic=True) without touching the YAML.
  - Validation: operation match, scope match (base_key/table_id/record_id), wildcard-forbidden on update/delete without explicit base_key, expiry check.
  - One-time-use atomic consume: takes an exclusive fcntl.flock on a sidecar .lock file (the YAML itself gets atomically swapped via os.replace, so the sidecar survives the swap); rewrites YAML with tmp + fsync + rename.
  - Raises ApprovalError(code=…) with one of the codes from the package §10 enum: missing, expired, scope_mismatch, wildcard_forbidden, already_consumed.
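The one-time-use consume described above (sidecar lock, then tmp + fsync + rename) can be sketched as follows. This is an illustration under stated assumptions, not the code in lark_client/approval.py: it stores JSON instead of YAML to stay within the standard library, it is Unix-only because of fcntl, and AlreadyConsumed, consume_once, race and the id "APR-1" are hypothetical names.

```python
import fcntl
import json
import os
import tempfile
import threading


class AlreadyConsumed(Exception):
    """Hypothetical stand-in for ApprovalError(code="already_consumed")."""


def consume_once(store_path: str, approval_id: str) -> None:
    """Mark one approval as used; raise if another caller got there first."""
    # The lock lives on a sidecar file because os.replace swaps the store's
    # inode; a lock taken on the store itself would not survive the swap.
    with open(store_path + ".lock", "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until we own the lock
        with open(store_path) as fh:
            store = json.load(fh)
        row = store["approvals"][approval_id]
        if row["used"]:
            raise AlreadyConsumed(approval_id)
        row["used"] = True

        # Temp file in the same directory, fsync, then an atomic rename.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(store_path) or ".")
        with os.fdopen(fd, "w") as tmp:
            json.dump(store, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, store_path)
    # Leaving the with-block closes lock_file, which releases the flock.


def race(store_path: str, approval_id: str, workers: int = 4) -> list:
    """Let several threads try to consume the same approval at once."""
    results = []

    def attempt() -> None:
        try:
            consume_once(store_path, approval_id)
            results.append("won")
        except AlreadyConsumed:
            results.append("already_consumed")

    threads = [threading.Thread(target=attempt) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "approvals.json")
    with open(path, "w") as fh:
        json.dump({"approvals": {"APR-1": {"used": False}}}, fh)
    outcome = race(path, "APR-1")
    print(outcome)
```

Each thread opens the sidecar separately, so the exclusive flock serializes the read-check-write sequence and only the first thread through sees used == False.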

tests/test_approval.py — 17 tests, all green:

Exempt-base passthrough (3): list lookup, production base rejection, exempt-base consume that does NOT touch YAML.
Refusal cases (5): missing id, missing on non-exempt, expired, scope_mismatch on base_key, scope_mismatch on operation, wildcard_forbidden on delete without scope.
Happy path (3): consume marks used + persists; second consume raises already_consumed; get() does not mutate; get() returns None for unknown.
Concurrency (1): 4 threads race the same approval id; exactly one wins, the other three raise already_consumed.
YAML hygiene (3): no .tmp leftover, sidecar lockfile separate from YAML, missing YAML → empty default.
Smoke (2): missing-yaml default, lockfile-separate.

4.2 B2 cold pytest result (file-scoped)

17 passed in 0.57s

5. Subset boundary B3 — SafetyLayer skeleton + tests

5.1 Scope delivered

lark_client/safety.py adds:

WriteContext dataclass with __post_init__ validation:
- Rejects source not in WRITE_CONTEXT_SOURCES enum with SafetyViolation(reason="agent_required").
- Auto-fills idempotency_key (UUID v4) and operation_id if not set.
- Auto-computes target_count from targets.
WriteOutcome dataclass — full evidence trail: status, request_id, lark_response_meta, duration_ms, pii, error, idempotency_key, audit_pre_id, approval_id, backup_ref, would_write, safety_checks_passed.
SafetyLayer class with injectable dependencies (approval_provider, audit, gpg_backup, pii_scanner, lock_acquirer, api_caller). Default backends ship as stubs so the orchestration is unit-testable without network / real GPG / real PII scanner.
SafetyLayer.guard(ctx) runs the layers in PATCH2 §P2-3 order:
- Layer 0 (preflight): agent + op enum check.
- Layer 1 (dry-run / confirm gate): raises SafetyViolation(reason="confirm_required") if neither flag set.
- Layer 2 (approval): approval.check_and_consume(ctx) — exempt-base path returns synthetic.
- Layer 3 (backup): update/delete only; calls gpg_backup(ctx, payload) → backup_ref. Create/batch_create skip.
- Layer 4 (audit planned): audit.log_write_planned(...). If this raises AuditWriteError, the layer attempts an orphan-backup log first, then re-raises.
- Layer 5 (lock): lock_acquirer(ctx) context manager.
- Layer 7 (PII scan): pii_scanner(ctx, payload).
- Dry-run short-circuit: if ctx.dry_run, emit WriteOutcome(status="dry_run", would_write=…) and call log_write_result for trail completeness; API never called.
- Layer 8 (API call): api_caller(ctx). LarkAPIError/PartialFailureError → log_write_result(failed) → re-raise. Success → log_write_result; if that fails, log_write_emergency + outcome.error = "audit_post_degraded".
- Layer 10 (cleanup): lock release via context-manager __exit__.
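The layer ordering above can be sketched as a single function with injected callables. This is a simplified illustration, not the shipped SafetyLayer: Ctx and Outcome are trimmed, hypothetical stand-ins for WriteContext and WriteOutcome, and the result-phase audit logging and all error paths are left out so that only the ordering and the dry-run short-circuit remain.

```python
from contextlib import nullcontext
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class SafetyViolation(Exception):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass
class Ctx:
    """Trimmed, hypothetical stand-in for WriteContext."""
    op: str
    agent: str
    payload: dict
    dry_run: bool = False
    confirmed: bool = False


@dataclass
class Outcome:
    status: str
    would_write: Optional[dict] = None
    safety_checks_passed: List[str] = field(default_factory=list)


def guard(ctx: Ctx, approve: Callable, backup: Callable, audit_planned: Callable,
          pii_scan: Callable, api_call: Callable,
          lock: Callable = lambda ctx: nullcontext()) -> Outcome:
    passed: List[str] = []
    if not ctx.agent:
        raise SafetyViolation("agent_required")
    passed.append("preflight")
    if not (ctx.dry_run or ctx.confirmed):
        raise SafetyViolation("confirm_required")
    passed.append("dry_run_or_confirm")
    approve(ctx)
    passed.append("approval")
    if ctx.op in ("record.update", "record.delete"):  # create ops skip backup
        backup(ctx)
        passed.append("backup")
    audit_planned(ctx)
    passed.append("audit_planned")
    with lock(ctx):  # released on exit, which plays the role of the cleanup layer
        passed.append("lock")
        pii_scan(ctx)
        passed.append("pii_scan")
        if ctx.dry_run:  # short-circuit: the API caller is never invoked
            return Outcome("dry_run", would_write=ctx.payload, safety_checks_passed=passed)
        api_call(ctx)
        passed.append("api_call")
    return Outcome("success", safety_checks_passed=passed)


def noop(ctx: Ctx) -> None:
    return None


def must_not_run(ctx: Ctx) -> None:
    raise AssertionError("API reached during dry-run")


preview = guard(Ctx("record.create", "demo-agent", {"fields": {"name": "x"}}, dry_run=True),
                noop, noop, noop, noop, must_not_run)
live = guard(Ctx("record.update", "demo-agent", {"fields": {"name": "y"}}, confirmed=True),
             noop, noop, noop, noop, noop)
print(preview.safety_checks_passed)
print(live.safety_checks_passed)
```

Passing every backend as a callable is what lets a test substitute counters or a failing api_caller without touching the network.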

tests/test_safety.py — 13 tests, all green. Key cases:

The addendum-§4 critical test (test_exempt_base_bypasses_approval_but_runs_all_other_layers): asserts via mock-call counters that approval_id == "EXEMPT" AND backup ran AND PII scanned AND lock acquired AND API called AND audit planned/success both logged. safety_checks_passed trail shows: ["preflight", "dry_run_or_confirm", "approval", "backup", "audit_planned", "lock", "pii_scan", "api_call"].
Preflight errors (agent_required, unknown op).
Confirm-required gate.
WriteContext.__post_init__ rejects invalid source.
Exempt-base dry-run preview shape (no API call asserted via pytest.fail in fake api_caller).
Non-exempt base consumes real approval and persists.
Non-exempt without approval → ApprovalError(missing).
Audit-pre failure → orphan-backup log captures reason="audit_pre_failed".
Audit-post failure → emergency file written, outcome marked audit_post_degraded.
LarkAPIError propagates while still emitting phase=failed audit entry.
Source field propagates into all audit entries (source="mcp" end-to-end).
Create op skips GPG backup (only update/delete trigger it).

5.2 B3 cold pytest result (file-scoped)

13 passed in 0.54s

Full suite after B3: 74 passed, 5 skipped, 0 failed.

6. Subset boundary B4 — LarkWriteService + CLI records surface

6.1 Scope delivered

`lark_client/service.py`

LarkWriteService façade holding (core, reader, safety_layer, registry).
resolve_base(base_key) returns dict from registry or raises UnknownBaseError.
get_record(base_key, table_id, record_id) — fully implemented read path. Resolves base via registry, builds endpoint, calls core.read("GET", …, _audit_cmd="records.get"). No SafetyLayer — reads pass through existing whitelist + audit.
create / update / delete / batch_create / batch_update / batch_delete — each accepts a WriteContext, sets ctx.op to the operation, sets ctx.is_buffer_base from registry role, and dispatches to SafetyLayer.guard(ctx).
NotAuthorizedInRoundB(SafetyViolation) — sentinel raised by the Round B default api_caller. Subclasses SafetyViolation so error mapping stays consistent. Message clearly states "real write execution not authorized in Round B" per addendum §1.

`cli/records.py` — 7 subcommands

records get <base-key> <table-id> <record-id> — fully implemented per addendum §3. Emits structured success JSON per addendum §1.
records create / update / delete / batch_create / batch_update / batch_delete — stubs that enter SafetyLayer:
- Without --dry-run AND without --confirm: SafetyLayer raises SafetyViolation(reason="confirm_required") → addendum-§2 error JSON, exit 4.
- With --dry-run: SafetyLayer returns WriteOutcome(status="dry_run", would_write=…) → addendum-§1 dry-run JSON, exit 0.
- With --confirm: SafetyLayer reaches api_caller which raises NotAuthorizedInRoundB → addendum-§2 error JSON with error_type="NotAuthorizedInRoundB", exit 4.

Exception mapping in the CLI:

Exception	exit_code	error_type	reason	retry_after_seconds
`UnknownBaseError`	1	`UnknownBaseError`	—	—
`ApprovalError`	1	`ApprovalError`	`.code` enum value	—
`SafetyViolation`	4	class name	`.reason` enum value	—
`NotAuthorizedInRoundB`	4	`NotAuthorizedInRoundB`	`confirm_required`	—
`LarkAPIError`	2	`LarkAPIError`	—	5 if 429, else None
`PartialFailureError`	2	`PartialFailureError`	—	—
`EndpointNotAllowed`	4	`EndpointNotAllowed`	—	—
`RateLimitExceeded`	2	`RateLimitExceeded`	—	1
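A mapping like the table above could be implemented roughly as follows. The sketch covers only a subset of the rows, and the exception classes are local stand-ins defined for the example rather than the project's own; to_error_payload and EXIT_CODES are assumed names.

```python
import json
from typing import Optional


class UnknownBaseError(Exception):
    pass


class ApprovalError(Exception):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


class SafetyViolation(Exception):
    def __init__(self, reason: str, message: str = ""):
        super().__init__(f"{reason}: {message}" if message else reason)
        self.reason = reason


class NotAuthorizedInRoundB(SafetyViolation):
    pass


class LarkAPIError(Exception):
    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


# (exception class, exit code) pairs taken from the mapping table.
EXIT_CODES = [(UnknownBaseError, 1), (ApprovalError, 1),
              (SafetyViolation, 4), (LarkAPIError, 2)]


def to_error_payload(exc: Exception) -> dict:
    """Build the structured error JSON for one mapped exception."""
    exit_code = next((code for cls, code in EXIT_CODES if isinstance(exc, cls)), None)
    if exit_code is None:
        raise exc  # unmapped exceptions stay unhandled in this sketch

    reason: Optional[str] = None
    if isinstance(exc, SafetyViolation):  # also covers NotAuthorizedInRoundB
        reason = exc.reason
    elif isinstance(exc, ApprovalError):
        reason = exc.code

    is_api_error = isinstance(exc, LarkAPIError)
    return {
        "ok": False,
        "error": True,
        "exit_code": exit_code,
        "error_type": type(exc).__name__,
        "message": str(exc),
        "reason": reason,
        "retry_after_seconds": 5 if is_api_error and exc.status_code == 429 else None,
        "operation_id": None,
        "audit_ref": None,
        "details": {"status_code": exc.status_code} if is_api_error else {},
    }


refusal = to_error_payload(
    NotAuthorizedInRoundB("confirm_required", "real write execution not authorized in Round B"))
throttled = to_error_payload(LarkAPIError("Too Many Requests", 429))
print(json.dumps(refusal, indent=2))
```

Because NotAuthorizedInRoundB subclasses SafetyViolation, it inherits exit code 4 and the reason field without needing a row of its own in the lookup.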

`cli/lark_tool.py`

Imports cli.records and calls records_cli.register(subparsers, _add_common) after the existing subparsers.
Dispatches args.command == "records" → records_cli.dispatch(args).

`config/allowed_endpoints.yaml`

New read: entry — GET /open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id} — single-record fetch (addendum §3 explicit allowance). Schema preserved per OQ-9: {method, path, description}.
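For illustration, one way a {method, path, description} entry could be matched against a concrete request is to turn each {placeholder} into a single path segment. Whether the project's whitelist matches this way is an assumption; template_to_regex, is_allowed and the app token appDEMO are hypothetical.

```python
import re

READ_ENTRIES = [{
    "method": "GET",
    "path": "/open-apis/bitable/v1/apps/{app_token}/tables/{table_id}/records/{record_id}",
    "description": "single-record fetch",
}]


def template_to_regex(path_template: str) -> "re.Pattern[str]":
    """Compile a path template; each {name} matches exactly one path segment."""
    parts = re.split(r"(\{[^{}]+\})", path_template)
    pattern = "".join("[^/]+" if part.startswith("{") else re.escape(part)
                      for part in parts)
    return re.compile(pattern)


def is_allowed(method: str, path: str, entries: list) -> bool:
    """True if some entry has the same method and a fully matching path."""
    return any(
        entry["method"] == method.upper()
        and template_to_regex(entry["path"]).fullmatch(path) is not None
        for entry in entries
    )


record_path = "/open-apis/bitable/v1/apps/appDEMO/tables/tblPQ6N79EeOmnTm/records/recABCDEFGHIJKLMN"
print(is_allowed("GET", record_path, READ_ENTRIES))
print(is_allowed("DELETE", record_path, READ_ENTRIES))
```

Using fullmatch rather than match keeps a longer path with extra segments from being accepted through a prefix match.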

`tests/test_cli_records.py`

8 in-process tests via cli.lark_tool.main(argv=…) — no subprocess, no live network:

records get success structured JSON shape.
records get unknown base → structured UnknownBaseError.
records get LarkAPIError → structured error with status_code in details.
records create without --dry-run/--confirm → structured SafetyViolation with reason="confirm_required", exit 4.
records create --dry-run → structured preview JSON (addendum §1 shape: ok, mode, operation, would_write, approval_status, safety_checks_passed, operation_id, audit_ref).
records update --confirm → structured NotAuthorizedInRoundB error.
records delete --dry-run on non-exempt base → ApprovalError(missing) (approval check happens before dry-run short-circuits).
bases.yaml protection acknowledgement.

6.2 B4 cold pytest result (file-scoped)

8 passed in 0.47s

Full suite after B4: 82 passed, 5 skipped, 0 failed.

7. Sample structured output

7.1 `records get` — success

{
  "ok": true,
  "mode": "live",
  "operation": "record.get",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "record_id": "recABCDEFGHIJKLMN",
  "data": { "code": 0, "data": { "record": { "fields": { "x": 1 } } } },
  "warnings": []
}

7.2 `records create --dry-run` — preview (matches addendum §1)

{
  "ok": true,
  "mode": "dry_run",
  "operation": "record.create",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "record_id": null,
  "would_write": { "fields": { "name": "x" } },
  "diff": [],
  "approval_status": "exempt",
  "safety_checks_passed": [
    "preflight", "dry_run_or_confirm", "approval",
    "audit_planned", "lock", "pii_scan"
  ],
  "warnings": [],
  "operation_id": "<uuid-v4>",
  "audit_ref": "<uuid-v4>"
}

7.3 `records update --confirm` — refusal (matches addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 4,
  "error_type": "NotAuthorizedInRoundB",
  "message": "confirm_required: real write execution not authorized in Round B for op='record.update'; use --dry-run for a structured preview",
  "reason": "confirm_required",
  "retry_after_seconds": null,
  "operation_id": "<uuid-v4>",
  "audit_ref": "<uuid-v4>",
  "details": {}
}

7.4 `records create` (no flag) — confirm-required (matches addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 4,
  "error_type": "SafetyViolation",
  "message": "confirm_required: mutation requires either --dry-run preview or --confirm",
  "reason": "confirm_required",
  "retry_after_seconds": null,
  "operation_id": null,
  "audit_ref": null,
  "details": {}
}

7.5 `records get` — LarkAPIError with retry hint (addendum §2)

{
  "ok": false,
  "error": true,
  "exit_code": 2,
  "error_type": "LarkAPIError",
  "message": "Lark API error 99991663: Too Many Requests (HTTP 429)",
  "reason": null,
  "retry_after_seconds": 5,
  "operation_id": null,
  "audit_ref": null,
  "details": { "status_code": 429, "code": 99991663, "msg": "Too Many Requests" }
}

8. Sample audit entry showing `source` field (addendum §5)

An end-to-end dry run of LarkWriteService.create(ctx) with ctx.source="mcp" produced the following audit entry (timestamps and pid stripped for readability; UUIDs truncated):

{
  "phase": "success",
  "audit_pre_id": "d77227d1…",
  "operation_id": "533c99b3…",
  "idempotency_key": "533c99b3…",
  "agent": "sample-evidence",
  "source": "mcp",
  "cmd": "",
  "op": "record.create",
  "base_key": "88-phai-cu-base-dem",
  "table_id": "tblPQ6N79EeOmnTm",
  "request_id": "",
  "lark_response_meta": {},
  "duration_ms": 0,
  "pii": {
    "pii_redacted": false,
    "redaction_types": [],
    "redacted_fields_count": 0,
    "detector": ["stub"]
  },
  "outcome_status": "dry_run",
  "error": null
}

The source field appears in the entry, is not masked (the _MASK_SKIP_KEYS carve-out is honored — see B1 test test_source_value_is_not_masked), and unknown values fall back to "cli" (B1 test test_audit_entries_reject_invalid_source_value).
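The fallback behavior can be sketched as below. Only WRITE_CONTEXT_SOURCES and the fall-back-to-"cli" rule come from the addendum text; coerce_source, build_entry and the unknown value "webhook" are assumptions made for the example, and masking is not modeled.

```python
from types import SimpleNamespace

WRITE_CONTEXT_SOURCES = ("cli", "mcp", "cron", "api")


def coerce_source(ctx) -> str:
    """Return ctx.source if it is a known source, otherwise fall back to "cli"."""
    source = getattr(ctx, "source", None)
    return source if source in WRITE_CONTEXT_SOURCES else "cli"


def build_entry(ctx, phase: str) -> dict:
    """Hypothetical entry builder: every phase carries the coerced source."""
    return {"phase": phase, "op": ctx.op, "source": coerce_source(ctx)}


mcp_ctx = SimpleNamespace(op="record.create", source="mcp")
entries = [build_entry(mcp_ctx, phase) for phase in ("planned", "success")]

unknown_ctx = SimpleNamespace(op="record.create", source="webhook")
fallback = build_entry(unknown_ctx, "planned")
print(entries, fallback)
```

Routing every builder through one coercion helper keeps the audit log limited to the four known source values even when a caller supplies something else.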

9. Approval-exempt-base test transcript (addendum §4)

tests/test_safety.py::test_exempt_base_bypasses_approval_but_runs_all_other_layers proves the addendum §4 invariant. Mock-call counters in the test confirm:

Layer	Was it invoked?
preflight	✅ recorded in `safety_checks_passed`
dry_run / confirm gate	✅ `confirmed=True` path
approval `check_and_consume`	✅ returned synthetic `Approval(id="EXEMPT")`
gpg_backup	✅ `len(backup_calls) == 1`
audit `log_write_planned`	✅ planned-phase entry in audit log
lock_acquirer	✅ `len(lock_calls) == 1`
pii_scanner	✅ `len(pii_calls) == 1`
api_caller	✅ `len(api_calls) == 1`
audit `log_write_result`	✅ success-phase entry in audit log

outcome.approval_id == "EXEMPT" and outcome.status == "success". The test passes.

The companion test test_exempt_base_dry_run_returns_structured_preview additionally asserts that the dry-run preview path on an exempt base still runs preflight + approval check (synthetic) + audit_planned + lock + PII scan; only the API call is skipped. The pytest.fail callback wired into api_caller proves the API was never reached.

10. `config/bases.yaml` integrity (addendum §6)

When	SHA256
Before Round B (after Round A)	`c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d`
After all four Round B commits	`c063dc00c6b71a56d06435baba3019fcdd33d92999c770f8bc6d784866bdae5d`

Unchanged. Every Round B test either uses MagicMock(Registry) or writes a synthetic bases.yaml into tmp_path. The production registry was not touched, not normalized, not edited.
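A before/after digest comparison of this kind needs only hashlib. The sketch below runs against a throwaway file in a temporary directory, not the production config/bases.yaml; sha256_of is an assumed helper name.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bases.yaml")
    with open(path, "w") as fh:
        fh.write("bases: []\n")

    before = sha256_of(path)
    unchanged = sha256_of(path) == before  # nothing wrote to the file

    with open(path, "a") as fh:
        fh.write("# edited\n")
    changed = sha256_of(path) != before    # any edit changes the digest

print(before, unchanged, changed)
```

Recording the digest before the first commit and comparing it after the last one is enough to show the file was not modified in between.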

11. git diff --stat (Round B only)

 lark-client/cli/lark_tool.py              |  20 +-
 lark-client/cli/records.py                | 443 +++++++++++++++++++++++++++++
 lark-client/config/allowed_endpoints.yaml |   3 +
 lark-client/lark_client/approval.py       | 308 +++++++++++++++++++++
 lark-client/lark_client/audit.py          |  21 ++
 lark-client/lark_client/core.py           |  27 ++
 lark-client/lark_client/safety.py         | 387 +++++++++++++++++++++++++
 lark-client/lark_client/service.py        | 137 +++++++++
 lark-client/scripts/s179_probe.py         |   8 +-
 lark-client/tests/test_approval.py        | 283 ++++++++++++++++++
 lark-client/tests/test_cli_records.py     | 326 ++++++++++++++++++++++
 lark-client/tests/test_cli_smoke.py       |  13 +-
 lark-client/tests/test_core_write.py      |  10 +-
 lark-client/tests/test_round_b1.py        | 173 ++++++++++++
 lark-client/tests/test_safety.py          | 383 ++++++++++++++++++++++++++
 15 files changed, 2530 insertions(+), 12 deletions(-)

Commits stacked on the same feature branch: 60e3d80 (B1) → 8527154 (B2) → 4b73375 (B3) → 1cf7901 (B4). No force-push, no rebase, no merge.

12. Test totals

Subset	New tests	Cold-pytest delta
Round A baseline	—	31 pass / 5 skip / 2 fail
B1	+12 (test_round_b1)	44 pass / 5 skip / 0 fail (closed pre-existing CLI smoke)
B2	+17 (test_approval)	61 pass / 5 skip / 0 fail
B3	+13 (test_safety)	74 pass / 5 skip / 0 fail
B4	+8 (test_cli_records)	82 pass / 5 skip / 0 fail

Full-suite reproducibility:

ssh contabo 'cd /opt/incomex/lark-client && env -u LARK_TEST_INTEGRATION .venv/bin/pytest -q'

13. Confirmations (addendum §8 inheritance + base-package §22)

No live Lark API call — all tests use mocks; dry-run / refuse paths block any path to LarkCore._request.
No production touched: production base 88-phai-cu (YSIkb8PxOaNaozs2vwalOOcagkf) was never contacted; the buffer base (Base đệm) appears only in test fixtures.
No deploy — no service restart, no Docker action, no nginx reload.
No push / merge / tag — feat/s177-sprint1-round-a remains local.
No MCP write enablement: the _round_b_refusing_api_caller in cli/records.py ensures every mutating CLI invocation raises before reaching Lark, even with --confirm. The MCP adapter is Sprint 2 scope.
No new bot, no credential rotation, no secret printed — gcloud secrets list --filter='name~LARK' was re-run only to re-confirm LARK_BACKUP_GPG_PUBKEY is still absent (status unchanged from Round A); no secret was read or accessed.
No KB docs created outside the s177 folder.
No self-advance to Sprint 2.
config/bases.yaml SHA256 unchanged — see §10.

14. Outstanding items / recommendation for follow-on rounds

The four subsets shipped, but the following are still gated and should land before any Sprint 2 (MCP adapter) authorization:

GSM provisioning of LARK_BACKUP_GPG_PUBKEY. Still absent. SafetyLayer ships with a stub _default_gpg_backup that returns a synthetic backup_ref; the real GPG backend cannot ship until the public key lands in GSM (github-chatgpt-ggcloud project).
OQ-8 documentation citation. lark-api-limits.yaml :: write_endpoint_options still carries conservative defaults. A doc citation pass should flip client_token: false → true for endpoints where Lark official docs confirm support.
Real PII scanner. Round B ships a stub _default_pii_scanner returning empty metadata. The real scanner (S176 snapshot replay + regex registry) should land before any real write, even on the buffer base (Base đệm).
Real lock acquirer. Round B uses _no_lock. The real per-record lock (file-based or Redis) is necessary before concurrent CLI invocations against a single record can be safe.
Sprint 2 authorization. MCP adapter, once the above are in.
Optional polish. LarkWriteService.create and its sibling methods could take the operation as a named argument rather than mutating ctx.op. The current shape works and is testable, so it was left as-is.

15. STOP

Round B execution complete. Round B is now the baseline for any further Sprint 1 work and for the Sprint 2 authorization decision. No further self-advance.