KB-6445

Phase 2F-D3 — Remove move_document from MCP Schema Implementation Report (2026-05-14)

Revision 1

Date: 2026-05-14
Status: PASS (deployed to VPS, committed)
Commit: 62653d0 on main of /opt/incomex/docker/agent-data-repo
Design reference: knowledge/current-state/reports/phase-2f-c-rev2-mcp-production-patch-design-2026-05-14.md S4
Prior phases: D1 19315ce, D2a 34ca724 (both regression-intact)


1. Code changes summary

Single-file change: agent_data/server.py (+12 / -21 lines, net -9).

Four edits

| # | Location | Change |
|---|---|---|
| 1 | MCP_TOOLS list (around line 2848) | Removed the move_document JSON object that advertised the tool's input schema. Replaced with a comment pointing to the D3 rationale. |
| 2 | MCP_GPT_FULL_TOOLS frozenset (around line 2709) | Removed the "move_document" entry from the allowlist. Replaced with an inline comment explaining the kill-switch background. |
| 3 | _dispatch_mcp_tool (around line 3194) | Deleted the if tool_name == "move_document": branch. Any MCP tools/call name=move_document now falls through to either the generic ValueError("Unknown tool: ...") (on /mcp, dispatcher path) or -32601 Tool not allowed (on /mcp-* profile paths, allowlist check). |
| 4 | CONNECTOR_SCHEMA_VERSION (line 51) | Bumped from gpt-agent-data-2026-05-14.1 to gpt-agent-data-2026-05-14.2 so that any client caching tools/list by version is forced to refresh. _mcp_schema_hash() updates automatically because it hashes MCP_TOOLS. |
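
The resulting control flow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in agent_data/server.py: the shortened allowlist, the placeholder handler results, and the helper names handle_mcp_call and handle_profile_call are hypothetical. Only the frozenset allowlist, the ValueError("Unknown tool: ...") fall-through, and the -32601 rejection are taken from this report.

```python
import json

# Hypothetical, shortened allowlist; the real MCP_GPT_FULL_TOOLS has 10 entries.
MCP_GPT_FULL_TOOLS = frozenset({"list_documents", "get_document"})


def _dispatch_mcp_tool(tool_name: str, arguments: dict) -> dict:
    """Route a tool call; names without a branch fall through to ValueError."""
    if tool_name == "list_documents":
        return {"items": []}  # placeholder result
    if tool_name == "get_document":
        return {"content": ""}  # placeholder result
    # There is no move_document branch any more, so it lands here
    # like any other unknown name.
    raise ValueError(f"Unknown tool: {tool_name}")


def handle_mcp_call(tool_name: str, arguments: dict) -> dict:
    """/mcp path: dispatcher errors are caught and returned as wrapped text."""
    try:
        payload = _dispatch_mcp_tool(tool_name, arguments)
    except ValueError as exc:
        payload = {"error": str(exc)}
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}


def handle_profile_call(tool_name: str, arguments: dict,
                        allowlist: frozenset, mode: str) -> dict:
    """/mcp-* profile path: the allowlist check rejects before dispatch."""
    if tool_name not in allowlist:
        message = f"Tool '{tool_name}' not allowed in {mode} MCP mode"
        return {"error": {"code": -32601, "message": message}}
    return handle_mcp_call(tool_name, arguments)
```

In this sketch, handle_mcp_call("move_document", {}) yields the wrapped "Unknown tool" text, while handle_profile_call("move_document", {}, MCP_GPT_FULL_TOOLS, "gpt-full") returns the -32601 error without ever reaching the dispatcher.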

What was NOT changed

  • POST /documents/{x}/move HTTP endpoint (line 1625) — left intact, still returns 501 NOT_IMPLEMENTED. Direct REST callers see no change.
  • move_document function definition — left intact; only the dispatcher branch is removed.
  • No other tool affected; D1 and D2a code paths untouched.

2. Tools counts: before / after

| Profile | Tools before D3 | Tools after D3 | move_document advertised after D3? |
|---|---|---|---|
| /mcp (internal) | 11 | 10 | no |
| /mcp-readonly | 5 | 5 (unchanged) | no |
| /mcp-gpt (public via nginx token) | 8 | 8 (unchanged; already excluded) | no |
| /mcp-gpt-full (internal admin only) | 11 | 10 | no |

Verified via:

  • Internal tools/list POST to each endpoint (V1-V4 in S4).
  • Public HTTPS /gpt-mcp/{token}/mcp returns 8 tools, move_document not present.

3. Forced tools/call move_document behaviour

| Endpoint | Request | Response |
|---|---|---|
| POST /mcp | tools/call name=move_document | Wrapped result with content[0].text = {"error": "Unknown tool: move_document"}; server logs status=500 failure_class=ValueError schema_version=... (expected: dispatcher raised, handler caught, structured error returned). |
| POST /mcp-gpt-full | tools/call name=move_document | JSON-RPC envelope with error.code: -32601, error.message: "Tool 'move_document' not allowed in gpt-full MCP mode". Allowlist check rejected before dispatch. |
| POST /mcp-gpt | tools/call name=move_document | Same -32601 rejection (also not in the MCP_GPT_TOOLS allowlist; this was already true pre-D3). |
| POST /mcp-readonly | tools/call name=move_document | Same -32601 rejection. |

In all cases the tool does not execute. The schema-runtime mismatch documented in Rev2 S3.1 is closed.

The dispatcher-level error log (Unknown tool: move_document) is a deliberate audit signal — any client still attempting the deprecated tool is now visible in container logs. Expected to taper off as cached client schemas refresh on the version bump.
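
For test tooling, the two rejection shapes described above can be told apart with a small helper. This is only a sketch: the function name classify_rejection is hypothetical, and it assumes the wrapped /mcp result sits under the JSON-RPC result key, which the report does not state explicitly.

```python
import json


def classify_rejection(response: dict) -> str:
    """Classify a tools/call response as 'not_allowed', 'unknown_tool', or 'other'."""
    error = response.get("error")
    if isinstance(error, dict) and error.get("code") == -32601:
        # Profile path: the allowlist rejected the call before dispatch.
        return "not_allowed"

    # Assumption: the wrapped dispatcher result lives under "result".
    content = response.get("result", {}).get("content", [])
    if content:
        try:
            body = json.loads(content[0].get("text", ""))
        except json.JSONDecodeError:
            return "other"
        if isinstance(body, dict) and str(body.get("error", "")).startswith("Unknown tool:"):
            # /mcp path: the dispatcher raised and the handler wrapped the error.
            return "unknown_tool"
    return "other"
```

Either of the first two labels confirms that the tool did not execute. A label of "other" on a forced move_document call would warrant a closer look.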

4. Regression tests

13 verifications run inside incomex-agent-data against http://127.0.0.1:8000. All PASS.

| # | Check | Result |
|---|---|---|
| V1 | /mcp tools/list count=10, no move_document | PASS |
| V2 | /mcp-readonly tools/list count=5, no move_document | PASS |
| V3 | /mcp-gpt tools/list count=8, no move_document | PASS |
| V4 | /mcp-gpt-full tools/list count=10, no move_document | PASS |
| V5 | connectorSchemaVersion bumped to gpt-agent-data-2026-05-14.2 | PASS |
| V6 | Forced tools/call move_document on /mcp rejected with "Unknown tool" | PASS |
| V7 | Forced tools/call move_document on /mcp-gpt-full returns -32601 not allowed | PASS |
| V8 | Regression: list_documents path=knowledge/test/ limit=3 returns items | PASS |
| V9 | Regression: get_document returns content without isError | PASS |
| V10 | Regression: patch_document with non-matching old_str returns {"error": "old_str not found in document content"} (verified against a real path) | PASS |
| V11 | D2a JSON Accept on /mcp-gpt tools/list returns Content-Type: application/json; charset=utf-8 | PASS |
| V12 | D2a SSE Accept on /mcp-gpt tools/list returns Content-Type: text/event-stream; charset=utf-8, valid event: message frame | PASS |
| V13 | connectorSchemaHash present and 12 characters long (auto-updated by the MCP_TOOLS change) | PASS |
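
To illustrate why V13 needs no manual step, here is one way a 12-character hash over the tool list could be derived. The report does not show the body of _mcp_schema_hash(), so the algorithm here (SHA-256 over canonical JSON, truncated to 12 hex characters) and the name schema_hash_sketch are assumptions made for illustration only.

```python
import hashlib
import json


def schema_hash_sketch(tools: list, length: int = 12) -> str:
    """Hash a tool list deterministically and keep the first `length` hex chars."""
    # Canonical JSON: sorted keys and fixed separators, so that dict key
    # order and whitespace cannot change the hash.
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]
```

With any scheme like this, removing one entry from the list changes the hash, so the D3 edit to MCP_TOOLS alone is enough to produce a new connectorSchemaHash.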

Public route sanity

https://vps.incomexsaigoncorp.vn/gpt-mcp/{token}/mcp (nginx -> upstream /mcp-gpt):

  • tools/list Accept JSON -> 200 application/json, 8 tools, move_document not in list.

5. Commit

  • Hash: 62653d0
  • Branch: main
  • Message: phase2f-d3: remove deprecated move_document from MCP schema
  • Diff: 1 file changed, 12 insertions(+), 21 deletions(-)

Co-author trailer: Claude Opus 4.7 (1M context).

6. Logs / resource snapshot

  • No new error classes since restart. The expected ERROR ... Unknown tool: move_document line fires exactly when the V6 test deliberately invokes the deprecated tool — this is the intended audit signal. Real clients should not hit it after schema cache refresh.
  • A Traceback line appeared during the rebuild's uvicorn worker startup (one-shot import path log) and resolved on its own; the running container is healthy.
  • Container Up (healthy). /health reports qdrant, postgres, openai all OK.

7. Rollback

ssh root@38.242.240.89
cd /opt/incomex/docker/agent-data-repo
git revert 62653d0
cd /opt/incomex/docker
docker compose build agent-data && docker compose up -d agent-data

No DB change. No nginx change.

8. Risks / notes

  • R1. Clients caching tools/list from a prior session will see "Unknown tool" or "not allowed" errors until they refresh. The version bump signals this. Risk is low because most clients fetch the list at session start.
  • R2. MCP_GPT_FULL_TOOLS count dropping from 11 to 10 means total internal tool surface shrank. No external impact (this profile is not nginx-exposed).
  • R3. The HTTP endpoint POST /documents/{x}/move is still wired but always 501. Acceptable per design; if it later becomes a footgun, remove the route definition in a follow-up.
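
The client-side behaviour that R1 relies on can be pictured as a cache keyed by schema version. The sketch below is hypothetical: the class ToolListCache and its fetch callback are not part of any client named in this report, and how a given client actually learns the server's version is not specified here.

```python
class ToolListCache:
    """Cache a tools/list result and refetch when the schema version changes."""

    def __init__(self):
        self._version = None
        self._tools = None

    def get(self, server_version: str, fetch):
        """Return cached tools, calling fetch() only on first use or version change."""
        if self._tools is None or server_version != self._version:
            self._tools = fetch()
            self._version = server_version
        return self._tools
```

A client holding the list under gpt-agent-data-2026-05-14.1 would refetch once on seeing gpt-agent-data-2026-05-14.2 and then stop offering move_document.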

9. Recommendation — proceed to fresh ChatGPT MCP App test?

Yes — fresh ChatGPT MCP App test is now in scope.

D1 (latency), D2a (content negotiation), and D3 (schema honesty) are all live and verified. The three top blockers identified in Phase 2F-A and refined in Rev2 are closed:

  1. list_documents p95 4.47 s -> 138 ms @ c=5 (D1).
  2. application/json-only transport -> JSON + SSE content negotiation (D2a).
  3. Schema advertised move_document while runtime returned 501 -> tool removed from advertisement (D3).

The remaining design item, D4 vector write double-work, is parked behind the canary gate (Rev2 S5). It does not affect ChatGPT's ability to handshake or read/write — it only affects write latency, which D4 will fix later.

Suggested ChatGPT retest protocol

  1. Start a fresh ChatGPT MCP App conversation (per memory project_gpt_action_gateway_stuck.md — old sessions can stick on the gateway side).
  2. Verify handshake (no "Unexpected content type" failure).
  3. Run a probe sequence: search_knowledge -> list_documents path=knowledge/test/ -> get_document -> patch_document on a test file under knowledge/test/mcp-stress/.
  4. Capture observations in a follow-up KB report.
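
For anyone scripting the same probe outside ChatGPT, the sequence in step 3 could be built as JSON-RPC tools/call payloads like this. The argument names path, limit, and old_str appear in the regression checks above. Everything else is an assumption and should be checked against the live tools/list schema: the query argument, the path argument for get_document and patch_document, the new_str argument, and the placeholder strings.

```python
import itertools

STRESS_PREFIX = "knowledge/test/mcp-stress/"


def build_probe_sequence(doc_path: str) -> list:
    """Build the four tools/call requests of the retest probe, in order."""
    if not doc_path.startswith(STRESS_PREFIX):
        # Guard: only patch test files under the stress directory.
        raise ValueError(f"doc_path must be under {STRESS_PREFIX}")

    ids = itertools.count(1)

    def call(name: str, arguments: dict) -> dict:
        return {
            "jsonrpc": "2.0",
            "id": next(ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }

    return [
        call("search_knowledge", {"query": "mcp stress probe"}),  # assumed argument name
        call("list_documents", {"path": "knowledge/test/", "limit": 3}),
        call("get_document", {"path": doc_path}),  # assumed argument name
        call("patch_document", {
            "path": doc_path,            # assumed argument name
            "old_str": "PLACEHOLDER_OLD",  # placeholder text
            "new_str": "PLACEHOLDER_NEW",  # assumed argument name, placeholder text
        }),
    ]
```

The prefix guard keeps the patch step away from production documents if the script is pointed at the wrong path.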

If the patch UI still hangs:

  • It is not D2a (D2a verified). Open a separate diagnosis ticket. Candidates: gateway sticky session, App UI bug, per-tool budget timeout, or a quirk of the ChatGPT-side Streamable HTTP client.

If it works:

  • Open D4 canary as a separate task; do not bundle.

End of report. No nginx change. No DB change. No restart of postgres/qdrant. No secret rotated. No ChatGPT connection initiated by this task.