KB-A8E6

dot-iu-cutter v0.1 Gate 3 Review — Semantic Thread + Retrieval

Revision 1

Tags: dot-iu-cutter, review, gate-3, semantic-threading, thread-retrieval, segmentation-health, rev5d, design-review

Date: 2026-05-15
Reviewer: GPT
Review Gate: Gate 3 — Semantic Thread + Retrieval
Files reviewed: D9 Cross-Temporal Semantic Threading, D11 Thread Retrieval and User Interaction, D3 Segmentation Health and Usage Feedback
Scope: Review only. No implementation, no migration, no PG mutation.

1. Verdict

gate_3_status: PASS_WITH_NOTES
semantic_thread_blocker_found: false
retrieval_access_control_blocker_found: false
health_feedback_blocker_found: false
parallel_graph_or_taxonomy_found: false
ready_for_gate_4: true
ready_for_implementation: false

Gate 3 passes for design review purposes. The Semantic Thread model preserves the key constraints: Thread ≠ Edge, universal_edges-first, user-directed intent is not automatic graph truth, auto-accept is risk-gated, retrieval is audience-scoped, and search/health feedback loops are present.

Several policy and schema closures remain mandatory before implementation planning.

2. What Was Checked

2.1 Semantic Thread model

Checked whether a thread is modeled as a living professional-domain axis rather than a simple edge.

Result: PASS.

2.2 universal_edges-first

Checked whether the design creates a parallel graph.

Result: PASS_WITH_SCHEMA_GAP.

2.3 User-directed threads

Checked whether user intent is first-class without becoming automatic graph truth.

Result: PASS.

2.4 Auto-accept risk control

Checked whether thread membership auto-accept is safely gated.

Result: PASS_WITH_POLICY_GAPS.

2.5 Thread-first retrieval and fallback

Checked whether thread-first retrieval is the default for AI/Agent, with vector fallback only when thread confidence is insufficient.

Result: PASS.

2.6 Audience-scoped search and access control

Checked whether retrieval treats audience scope as access control, not merely a filter.

Result: PASS_WITH_IMPLEMENTATION_CLOSURES.

2.7 Health feedback loop

Checked whether missing/wrong/stale/noisy signals feed into health report and decision backlog.

Result: PASS.

3. Findings

3.1 Thread ≠ Edge is preserved

D9 defines Semantic Thread as a higher-order construct containing memberships, evidence, signals, expected chains, and governance lifecycle. It explicitly states that a thread is not an edge.

This satisfies the core modeling requirement.

3.2 universal_edges-first is correctly enforced

D9 repeatedly requires evaluating universal_edges before introducing separate membership tables. This avoids a parallel graph authority.

Remaining closure: implementation planning must audit the actual universal_edges schema to decide whether it can represent thread membership with status, confidence, lifecycle, and evidence payload. If not, the gap must be routed through Article 39 / Article 44 / Articles 33–43 (Điều 39 / Điều 44 / Điều 33–43) governance.

3.3 User-directed thread is handled correctly

D9 treats user-directed threads as first-class input but not automatic graph truth. User-added membership has provenance such as accepted_by_user, and AI may flag inconsistency without silently overriding the user.

This is the right balance between human intent and graph correctness.

3.4 Auto-accept is safe at the design level

D9 allows auto-accept only if all conditions hold:

low risk;
at least two independent evidence signals;
no negative knowledge conflict;
policy explicitly allows.

This is acceptable for design review.

Remaining closure: thresholds and eligible domains must be decided by policy before implementation. High-risk/legal/governance/code-impact links must never auto-accept.
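The four conditions and the never-auto-accept rule above can be read as a single conjunctive gate. The following is a minimal sketch under stated assumptions, not D9's actual schema: the `MembershipCandidate` fields, the category labels in `NEVER_AUTO_ACCEPT`, and the function name are all hypothetical, and the real thresholds and eligible domains remain a policy decision.

```python
from dataclasses import dataclass

# ASSUMPTION: hypothetical category labels for links that must never auto-accept.
NEVER_AUTO_ACCEPT = {"high_risk", "legal", "governance", "code_impact"}


@dataclass(frozen=True)
class MembershipCandidate:
    """Hypothetical shape of a proposed thread membership."""
    risk_level: str                          # e.g. "low", "medium", "high"
    categories: frozenset                    # domain tags on the candidate link
    evidence_sources: frozenset              # distinct, independent evidence signals
    conflicts_with_negative_knowledge: bool  # True if a prior rejection applies
    policy_allows_auto_accept: bool          # explicit policy opt-in


def may_auto_accept(c: MembershipCandidate) -> bool:
    """Return True only if every condition holds; otherwise route to review."""
    # Hard exclusion first: these categories never auto-accept.
    if c.categories & NEVER_AUTO_ACCEPT:
        return False
    return (
        c.risk_level == "low"
        and len(c.evidence_sources) >= 2
        and not c.conflicts_with_negative_knowledge
        and c.policy_allows_auto_accept
    )
```

Any candidate for which `may_auto_accept` returns `False` would stay in a pending state for human review rather than being rejected outright.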

3.5 Negative knowledge is present and useful

D9 persists rejected links so that bad suggestions are not repeated. Re-proposal requires materially different evidence.

Remaining closure: decide negative-knowledge decay policy and what counts as “materially different” evidence.
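To make the open question concrete, here is one possible reading of the re-proposal rule, sketched in memory. The class and method names are hypothetical, and "materially different" is approximated here as "contains at least one evidence signal not seen at rejection time". That is an assumption for illustration only. The real definition and any decay policy are still to be decided.

```python
class NegativeKnowledge:
    """In-memory sketch of a rejected-link store (a real one would be persisted)."""

    def __init__(self):
        # (thread_id, unit_id) -> set of evidence signal ids seen at rejection
        self._rejected = {}

    def record_rejection(self, thread_id, unit_id, evidence):
        """Remember the rejected link together with the evidence that was offered."""
        key = (thread_id, unit_id)
        self._rejected.setdefault(key, set()).update(evidence)

    def may_repropose(self, thread_id, unit_id, evidence):
        """Allow re-proposal only if some evidence was not seen at rejection.

        ASSUMPTION: this set-difference test stands in for the undefined
        "materially different" policy.
        """
        seen = self._rejected.get((thread_id, unit_id))
        if seen is None:
            return True  # the link was never rejected
        return bool(set(evidence) - seen)
```

A stricter policy could require a minimum number of new signals or a new signal type. This sketch deliberately leaves decay out.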

3.6 Expected lifecycle chain is present

D9 defines an optional expected-chain JSONB value to detect missing artifacts along a chain such as law → design → code → test → report.

This is essential to cross-temporal semantic threading.

Remaining closure: define default expected chains for first production domains later, not in this design review.
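Gap detection against an expected chain can be illustrated in a few lines. This sketch assumes each thread member carries a simple artifact-kind label, and the function name is hypothetical. The chain is the example from the text, not a proposed production default.

```python
# Example chain taken from the review text; not a proposed default.
EXAMPLE_CHAIN = ["law", "design", "code", "test", "report"]


def missing_stages(expected_chain, member_kinds):
    """Return the expected stages with no member, in chain order."""
    present = set(member_kinds)
    return [stage for stage in expected_chain if stage not in present]


# A thread that has a law, a design, and a report, but no code or test yet.
gaps = missing_stages(EXAMPLE_CHAIN, ["law", "design", "report"])
print(gaps)  # ['code', 'test']
```

The returned gaps would be candidates for the "known gaps" part of a context pack, not automatic corrections.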

3.7 Retrieval design respects access control

D11 treats audience-scoped search as access control, not quality filtering. wrong_audience_result is explicitly a security/governance issue.

This satisfies the critical access-control guardrail.

Remaining closure: implementation planning must define how auth context, role, task scope, tool permission, data classification, visibility, readiness, and publication state are actually propagated into retrieval.
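The difference between access control and a quality filter is mostly about ordering: visibility is enforced before any ranking, so a restricted unit can never surface by scoring well. The sketch below is a simplified illustration with hypothetical types (`AuthContext`, `Unit`) and only two of the listed dimensions (audience and data classification). The actual propagation of role, task scope, tool permission, readiness, and publication state is the open closure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical caller context propagated into retrieval."""
    audience: str
    clearance: int  # higher means access to more sensitive data


@dataclass(frozen=True)
class Unit:
    """Hypothetical retrievable unit."""
    unit_id: str
    allowed_audiences: frozenset
    classification: int
    score: float  # relevance score from the ranker


def retrieve(units, ctx):
    """Enforce access first, then rank only what the caller may see."""
    visible = [
        u for u in units
        if ctx.audience in u.allowed_audiences
        and u.classification <= ctx.clearance
    ]
    return sorted(visible, key=lambda u: u.score, reverse=True)
```

Under this ordering, a wrong_audience_result can only arise from a defect in the visibility check itself, which is why it is treated as a security event and not a ranking-quality signal.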

3.8 Thread Context Pack is sufficient as a design primitive

D11 defines a useful pack structure with canonical units, supporting units, latest reports, related design/code links, known gaps, suspect links, health status, authority/risk, reading order, source revisions, and audience scope.

This is strong enough for AI/Agent consumption.
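For orientation, the listed pack contents map naturally onto a flat record. The field names and types below are a hypothetical rendering of that list, not the structure D11 specifies. Identifiers are shown as plain strings for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThreadContextPack:
    """Hypothetical rendering of the pack contents listed in D11."""
    thread_id: str
    audience_scope: str
    canonical_units: List[str] = field(default_factory=list)
    supporting_units: List[str] = field(default_factory=list)
    latest_reports: List[str] = field(default_factory=list)
    related_links: List[str] = field(default_factory=list)   # design/code links
    known_gaps: List[str] = field(default_factory=list)
    suspect_links: List[str] = field(default_factory=list)
    health_status: str = "unknown"
    authority_risk: str = "unspecified"
    reading_order: List[str] = field(default_factory=list)
    source_revisions: Dict[str, str] = field(default_factory=dict)


pack = ThreadContextPack(thread_id="thread-1", audience_scope="agent")
pack.canonical_units.append("unit-42")
pack.reading_order.append("unit-42")
```

Keeping `audience_scope` on the pack itself lets a consumer check that the pack was built for its own audience before using it.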

3.9 Fallback to vector search is properly bounded

D11 defaults to thread-first retrieval and uses vector fallback only when no reliable thread exists, thread health is weak, or the user explicitly requests raw search.

This matches the target product behavior.
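The three fallback triggers can be expressed as a small decision function. This is a sketch only: the function name, the numeric scales, and the two threshold defaults are assumptions, since the review does not state how thread confidence or health is measured.

```python
def choose_retrieval_mode(
    thread_confidence,
    thread_health,
    raw_search_requested,
    min_confidence=0.7,  # ASSUMPTION: illustrative threshold
    min_health=0.5,      # ASSUMPTION: illustrative threshold
):
    """Return "thread" by default, "vector" only for the bounded fallback cases.

    thread_confidence is None when no thread matched the query at all.
    """
    if raw_search_requested:
        return "vector"  # the user explicitly asked for raw search
    if thread_confidence is None or thread_confidence < min_confidence:
        return "vector"  # no reliable thread exists
    if thread_health < min_health:
        return "vector"  # a thread exists but its health is weak
    return "thread"
```

Logging which branch triggered the fallback would also feed the retrieval metrics listed among the required closures.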

3.10 Health/Correction loop is adequate

D3 consumes segmentation, thread, and retrieval signals and routes them into health reports and the decision backlog. It prohibits automatic structural change and requires an evidence bundle plus review.

This preserves governance and prevents over-correction.
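The routing described above can be pictured as aggregation that ends in a review item, never in a mutation. In this sketch the signal record shape, the `min_count` cutoff, and the `needs_review` status are hypothetical. The four signal kinds are the ones named in section 2.7.

```python
from collections import Counter

# Signal kinds named in the review: missing / wrong / stale / noisy.
SIGNAL_KINDS = {"missing", "wrong", "stale", "noisy"}


def build_backlog(signals, min_count=2):
    """Aggregate feedback signals into backlog items for human review.

    Each signal is a dict with "target" and "kind" keys (hypothetical shape).
    Nothing here changes segmentation or threads; it only proposes items.
    """
    counts = Counter(
        (s["target"], s["kind"]) for s in signals if s["kind"] in SIGNAL_KINDS
    )
    return [
        {"target": target, "kind": kind, "count": n, "status": "needs_review"}
        for (target, kind), n in sorted(counts.items())
        if n >= min_count
    ]
```

Each backlog item would still need its evidence bundle attached before a reviewer acts on it.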

4. Required Closures Before Implementation Planning

These do not block Gate 4 review, but they block implementation planning:

Audit actual universal_edges capability for thread membership: status, confidence, evidence, lifecycle, provenance.
Decide whether semantic_thread_membership is represented by universal_edges or a separate lifecycle table.
Define auto-accept thresholds and eligible low-risk domains.
Define negative knowledge decay / re-candidate policy.
Define initial expected lifecycle chains for priority domains.
Define auth-context propagation to retrieval layer.
Define formal audience classes and data classification mapping.
Define wrong_audience_result handling: block, log, escalate, rollback/response invalidation policy.
Define retrieval metrics instrumentation and event capture.
Define co-edit, co-citation, co-retrieval instrumentation plan.
Decide context pack caching policy: always-fresh vs cached, and invalidation rules.
Define thread centroid embedding scheme and update cadence.

5. Gate 3 Conclusion

gate_3_result: PASS_WITH_NOTES
next_gate: Gate 4 — Implementation Readiness Boundary
implementation_allowed: false

Proceed to Gate 4 review: D8 §6 schema gaps, D8 §8 missing instrumentation, D8 §9 open questions, and review-control priority grouping.