Claude Code + Semvec Coding¶

Semvec's coding stack ships a FastMCP server (semvec.coding.mcp_server) that exposes the CodingEngine as MCP tools, plus two lifecycle hooks (SessionStart, PreCompact) that Claude Code fires automatically. The combination gives Claude Code the strongest persistence path Semvec offers — code memory loads on session start, conversation transcripts are ingested before compaction, and state is saved without you ever calling pss_save by hand.

If you're on Cursor instead, see the Cursor guide. Both editors share the same six MCP tools — Claude Code adds the two automatic hooks on top.

What you get¶

Persistent memory across sessions. The LiteralCache carries decisions, invariants, error patterns, and code-pointer metadata between sessions. Reopening the project on Monday picks up where Friday left off.
Automatic lifecycle hooks. Claude Code fires SessionStart when a chat opens and PreCompact before context compaction — Semvec hooks into both. You do not have to instruct the agent to call pss_save() the way you do in Cursor.
Anti-resonance. Before the agent proposes a non-trivial change, Claude Code can call pss_check_anti_resonance to ask whether a similar idea has already failed.
Compacted context as system-prompt input. A 150–350-token block summarising prior work is exposed via pss_get_context and additionally pre-computed by the SessionStart hook so the first turn of a new session already has it.
Six MCP tools, all stdio-transport. Same code path as the Cursor integration — no Claude-Code-specific glue beyond the two hooks.

Prerequisites¶

Claude Code CLI installed and authenticated (/login succeeded at least once).
Python 3.10+ on your PATH. Claude Code invokes the MCP server and the hooks as subprocesses.
A Python environment with semvec[coding] and a real embedder installed:

pip install "semvec[coding]" sentence-transformers

Semvec refuses to fall back to hash-based pseudo-embeddings; both the server and the hooks hard-fail at start-up if no real embedder is reachable. See Embedders for alternatives (OpenAI, ONNX int8, custom).

Use a project-local venv

A project-local virtualenv keeps Claude Code's subprocesses isolated from your system Python. The settings shown below point Claude Code at the venv's interpreter explicitly so the wrong Python never wins.

Quick alternative: register via the Claude Code CLI

Instead of hand-editing .claude/settings.json, you can let the Claude Code CLI write the entry for you. Run once from the project root:

claude mcp add semvec -- uv run python -m semvec.coding.mcp_server

Verify:

claude mcp get semvec
claude mcp list

On a successful connection, claude mcp list shows the server with status ✓ Connected. The CLI writes the same mcpServers block documented below; reach for the JSON path in Step 1 when you need finer control over env, command, or the lifecycle hooks.

Step 1 — Configure `.claude/settings.json`¶

Create or extend .claude/settings.json in the project root (or ~/.claude/settings.json for a global setup that applies to every project):

{
  "mcpServers": {
    "semvec": {
      "command": "python",
      "args": ["-m", "semvec.coding.mcp_server"],
      "env": {
        "SEMVEC_STATE_DIR": ".semvec",
        "SEMVEC_WORKSPACE": ".",
        "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
        "SEMVEC_EMBED_DEVICE": "cpu"
      }
    }
  },
  "hooks": {
    "SessionStart": [
      {"command": "python -m semvec.coding.hooks.session_start", "timeout": 10000}
    ],
    "PreCompact": [
      {"command": "python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
    ]
  }
}

Two distinct things are wired up here:

mcpServers.semvec — the FastMCP server. Claude Code spawns it on demand and routes the six pss_* tools through it.
hooks.SessionStart / hooks.PreCompact — the two lifecycle subprocesses. Claude Code invokes them automatically (see Step 2 for what they do).

If you use a project-local venv, swap command for the absolute interpreter path so Claude Code does not pick up a system Python. Apply the same change to both hook commands:

"command": "/abs/path/to/project/.venv/bin/python",
"hooks": {
  "SessionStart": [
    {"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.session_start", "timeout": 10000}
  ],
  "PreCompact": [
    {"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
  ]
}

On Windows the path becomes C:\\abs\\path\\to\\project\\.venv\\Scripts\\python.exe (escape the backslashes inside JSON).

Alternative: launch via uv

If the project is managed with uv, let uv run resolve the project's interpreter on the fly so the right venv is always picked — no absolute path needed:

{
  "mcpServers": {
    "semvec": {
      "command": "uv",
      "args": ["run", "python", "-m", "semvec.coding.mcp_server"],
      "env": {
        "SEMVEC_STATE_DIR": ".semvec",
        "SEMVEC_WORKSPACE": ".",
        "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
        "SEMVEC_EMBED_DEVICE": "cpu"
      }
    }
  },
  "hooks": {
    "SessionStart": [
      {"command": "uv run python -m semvec.coding.hooks.session_start", "timeout": 10000}
    ],
    "PreCompact": [
      {"command": "uv run python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
    ]
  }
}

uv has to be on Claude Code's PATH for this to resolve. The advantage over a hard-coded interpreter path: uv run auto-discovers the project's pyproject.toml and works cross-platform without escaping Windows backslashes.

Environment variables¶

Variable	Default	Purpose
`SEMVEC_STATE_DIR`	`.semvec`	Where the engine persists `LiteralCache`, code pointers, and snapshots. Commit `.semvec/` to git for cross-machine continuity, or add it to `.gitignore` for per-clone memory.
`SEMVEC_WORKSPACE`	(unset)	Repo root — used by `pss_register_code` to anchor file paths. Defaults to the directory Claude Code launched the subprocess in.
`SEMVEC_EMBED_MODEL`	`all-MiniLM-L6-v2`	Sentence-transformer model. For German / multilingual repos use `paraphrase-multilingual-mpnet-base-v2` (768 d).
`SEMVEC_EMBED_DEVICE`	`cpu`	`cpu`, `cuda`, or `mps`.
`SEMVEC_LICENSE_KEY`	(unset)	Pro / Enterprise JWT — bypasses the community-tier rate limits. See Licensing.

The hooks read the same prefix; one place to set them. semvec does not phone home — there are no telemetry-related environment variables.

Startup timeout (important on WSL2)¶

The MCP server loads a sentence-embedding model on every cold start. On a fresh install the weights are fetched from the network once; afterwards they sit in the local Hugging Face cache (~/.cache/huggingface/hub/). Even with a warm cache, the first connect of a new session can take 60–150 seconds on some systems — WSL2 with the project sitting under /mnt/c/... is the worst case, because every Python module loads through the slow NTFS bridge between WSL2 and Windows.

Claude Code uses two separate timeouts on the MCP path, and they have different defaults:

Env var	Default	What it covers
`MCP_TIMEOUT`	`30000` ms	Tool calls and the initial connect when Claude Code first spawns the server.
`MCP_CONNECT_TIMEOUT_MS`	`5000` ms	Reconnect operations only — used by `/mcp reconnect` and any automatic reconnect after a transport drop.

On slow filesystems the cold-start embedder load comfortably exceeds both defaults, so both variables have to be raised. Setting only MCP_TIMEOUT is the most common failure mode: the initial connect works, but /mcp reconnect still trips the 5-second default.

Set both in the shell that launches Claude Code:

# Add to ~/.bashrc or ~/.zshrc:
export MCP_TIMEOUT=180000              # 180 seconds, in milliseconds
export MCP_CONNECT_TIMEOUT_MS=180000   # same value — covers reconnects

# Apply, verify, and (re-)launch:
source ~/.bashrc
echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS"   # expect: 180000 / 180000
exit                                            # if a Claude Code session is open
claude

env in ~/.claude/settings.json is not enough

The env block under mcpServers.semvec only governs the environment of the server subprocess — it does not affect Claude Code's own timeouts. Both MCP_TIMEOUT and MCP_CONNECT_TIMEOUT_MS must be set in the parent shell from which Claude Code itself is launched.

Verifying the connection¶

Inside Claude Code:

/mcp

A working setup shows semvec ✓ Connected. The first connect of a new session can legitimately take 1–2 minutes on slow filesystems; subsequent connects in the same session are near-instant because the embedder is already loaded.

When the reconnect fails¶

If /mcp reconnect errors out with -32001 (Request Timeout) after ~5 seconds, the cause is almost always MCP_CONNECT_TIMEOUT_MS defaulting to 5000. Confirm both variables are set:

echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS"

If MCP_CONNECT_TIMEOUT_MS is empty, add the export to ~/.bashrc / ~/.zshrc (see the table above), source the file, exit Claude Code, and relaunch — MCP_CONNECT_TIMEOUT_MS is picked up on the next process start. A full restart also bypasses any transient transport state the panel may be holding on to.

Performance note for WSL2 users¶

When the project directory sits under /mnt/c/..., every Python import on startup goes through the slow NTFS bridge between WSL2 and Windows. Moving the project to the native Linux filesystem (~/dev/...) typically drops MCP-server startup time below 5 seconds and removes the need for the MCP_TIMEOUT / MCP_CONNECT_TIMEOUT_MS workaround altogether.

Step 2 — What the lifecycle hooks actually do¶

This is the headline feature compared to Cursor: Claude Code fires the hooks for you, so persistence is not a thing the agent has to remember.

`SessionStart` — fires when a chat opens¶

Claude Code writes a JSON payload to the hook's stdin:

{ "session_id": "abc123", "cwd": "/path/to/project" }

The hook (semvec.coding.hooks.session_start):

Resolves SEMVEC_STATE_DIR → opens the CodingEngine against it.
Builds an embedder from SEMVEC_EMBED_MODEL / SEMVEC_EMBED_DEVICE.
Calls engine.load_state() — restores LiteralCache, CodePointerIndex, NegativeAttractorSet from disk.
Calls engine.get_compacted_context(task="session start <session_id>") — produces a token-budgeted Markdown block of phase, top-K relevant memories, code pointers, anti-resonance warnings, invariants, latest test summary.
Writes the result dict ({status, state_loaded, code_pointers, negative_attractors, context, session_id}) to stderr.

Claude Code injects the stderr block into the system prompt of the new session. Concretely: the first user message in the new chat already sees Phase: stability · Turn 47 · 12 memories · 8 code pointers · 3 invariants and the relevant excerpts as context. No tool call needed.

`PreCompact` — fires before context compaction¶

Claude Code triggers compaction automatically when the conversation approaches its context budget; you can also fire it manually with /compact. Just before either path proceeds, Claude Code writes:

{
  "session_id": "abc123",
  "transcript_path": "/path/to/transcript.jsonl",
  "cwd": "/path/to/project",
  "trigger": "auto"
}

The hook (semvec.coding.hooks.pre_compact):

Opens the engine the same way as SessionStart.
Calls engine.ingest_transcript(transcript_path) — parses the Claude Code JSONL transcript via TranscriptParser and folds every CONVERSATION / CODE_CHANGE / TEST_FAILURE / ERROR chunk into PSS state.
Calls engine.save_state() — persists .semvec/.
Calls engine.get_compacted_context(task="session <session_id>") — same compaction surface as on session start.
Writes {status, ingested_chunks, context, session_id, trigger} to stderr.

Claude Code uses the stderr block as the carry-over for the post-compaction context window. Net effect: the long history collapses into a compact persistent block, and the session continues without losing the design decisions / error patterns / code pointers it just learned.

Hook output contract

Both hooks pass the original stdin through unchanged on stdout — Claude Code expects that. The result dict goes only to stderr. If you tail the hooks' subprocess output for debugging, watch stderr.

Step 3 — `CLAUDE.md` for project-level memory rules¶

Claude Code reads CLAUDE.md (project root) and ~/.claude/CLAUDE.md (global) as system-prompt-level instructions on every conversation in the project — Claude Code's equivalent of a Cursor Rule. Add a Semvec-aware block to CLAUDE.md so the agent uses the MCP tools naturally:

## Semvec persistent memory

This project uses Semvec for persistent coding memory across sessions.
The `SessionStart` and `PreCompact` hooks already load and persist
state for you. In addition:

- Before proposing a non-trivial change, call `pss_check_anti_resonance(proposal=<the change>)` and bail out if it matches a known-bad pattern.
- After a substantive code edit, call `pss_update(text=..., update_type="code_change", file_path=..., signature=...)`.
- After a runtime / test failure, call `pss_record_error(error_text=..., source="test_failure")` (or `"runtime_error"` / `"user_correction"`).
- Treat any invariant returned by `pss_get_context` as authoritative — surface conflicts rather than ignoring them.

Do not narrate these calls — invoke them silently and weave their output into your normal answer.

Drop the rule into version control. Every conversation in the project picks it up automatically.

Hook coverage vs CLAUDE.md

The hooks cover load and save automatically; CLAUDE.md covers the per-turn habits the agent should adopt (anti-resonance check, error recording, code-change capture). Cursor needs all of this in its rule because it has no hook surface.

Step 4 — Sanity-check¶

Open a fresh Claude Code chat in the project and run the slash command:

/mcp

You should see semvec listed with six tools (pss_get_context, pss_update, pss_check_anti_resonance, pss_register_code, pss_record_error, pss_save). Then ask the chat:

Use pss_get_context to summarise prior work in this repo.

On a fresh state directory the response is a header-only block (Phase: initialization · Turn 0 · 0 memories). After a few real coding turns the same call returns a populated context — and after you /compact once, the block survives the compaction.

To confirm the hooks are wired up correctly, watch the Claude Code log on the next session start:

[hook session_start] state_loaded=true pointers=12 attractors=3

Anything other than state_loaded=true after a session that actually wrote something means the state directory is wrong (see Troubleshooting).

Tool reference¶

The same six tools as the Cursor integration — Claude Code calls them via stdio MCP transport.

Tool	Purpose
`pss_get_context(task, invariants?, test_summary?)`	Token-budgeted summary of prior work — phase, top-K relevant memories, code pointers, anti-resonance warnings, plus your current invariants and the latest test summary.
`pss_update(text, update_type, file_path?, signature?)`	Fold one observation into PSS state. `update_type` ∈ `conversation`, `code_change`, `error`, `test_failure`.
`pss_check_anti_resonance(proposal)`	Look the proposal up against past error patterns; returns a warning string when it resembles something that already failed.
`pss_register_code(file_path, intent, signature)`	Record a code pointer (`{file_path, intent, signature}`) so future `pss_get_context` calls can surface it.
`pss_record_error(error_text, source)`	Write an error pattern. `source` ∈ `test_failure`, `runtime_error`, `user_correction`.
`pss_save()`	Persist the engine state to `SEMVEC_STATE_DIR`. The `PreCompact` hook also auto-persists; this is the explicit flush for end-of-session.

For the full Python API behind the tools see semvec.coding.

Differences from Cursor¶

Capability	Claude Code	Cursor
MCP tool registration	`.claude/settings.json` (per-project) or `~/.claude/settings.json` (global)	`.cursor/mcp.json` (per-project) or `~/.cursor/mcp.json` (global)
`SessionStart` hook	Auto-fires — `handle_session_start` loads state and pre-computes a context block	Not available — replicated through a Cursor Rule that asks the agent to call `pss_get_context` first
`PreCompact` hook	Auto-fires — `handle_pre_compact` ingests transcript, persists state, returns compacted context	Not available — Cursor manages its own context window opaquely; persistence depends on the rule asking the agent to call `pss_save()`
Project-level rules	`CLAUDE.md` in project root	`.cursor/rules/semvec.mdc`
`@semvec` mention	Direct MCP-tool call from chat	Same — Cursor lets you reference an MCP server by name
State location	`.semvec/` in project root (default)	`.semvec/` in project root (default)
Multi-machine team	Commit `.semvec/` to git, both editors read the same files	Same

The hook surface is the headline reason Claude Code is the smoother integration — it pulls persistence out of the agent's instruction set entirely. Cursor needs the agent to remember to call pss_save(); Claude Code does not.

Optional — git pre-push hook for guaranteed persistence¶

If your team relies on pss_save() running before every push (defence-in-depth, even though PreCompact already persists on every compaction), install a git pre-push hook:

#!/usr/bin/env bash
# .git/hooks/pre-push
python -m semvec.coding.mcp_server <<EOF
{"method":"tools/call","params":{"name":"pss_save","arguments":{}}}
EOF

This runs whenever anyone on the team pushes — a safety net for the rare case where Claude Code's own auto-save did not happen (e.g. a hard kill before PreCompact fired).

End-to-end example session¶

[SessionStart hook fires automatically]
  → loads .semvec/ state: 12 pointers, 3 attractors
  → context block injected into Claude Code's first system prompt:
    "Phase: stability · Turn 47 · 12 memories · 3 invariants
     Last decision (CP4): Use mpnet 768d for German content
     Known error: dropping JWT subject claim breaks cluster auth tests"

User: I'm picking up the auth-flow refactor. Where did we leave off?
Claude Code: <calls pss_get_context(task="auth-flow refactor")>
  Last session you renamed `authenticate()` to `verify_credentials()`
  in `src/auth/service.py`. Open invariant: never log plaintext
  passwords. Known error pattern: dropping the JWT subject claim
  breaks the cluster auth tests (CP3).

User: Add rate limiting to the login endpoint.
Claude Code: <calls pss_check_anti_resonance(proposal="rate limit login endpoint")>
  No prior error pattern matches. Proposing slowapi-based limiter…
  <writes src/auth/routes.py>
  <calls pss_update(text="added rate limit on POST /login (10/min/IP) via slowapi",
                    update_type="code_change",
                    file_path="src/auth/routes.py",
                    signature="rate_limited(10/min/IP)")>
  Done — 10 requests/min/IP via slowapi, returns 429 with Retry-After.

User: Run the auth tests.
Claude Code: <runs pytest, sees 1 failure>
  <calls pss_record_error(
      error_text="test_concurrent_logins fails: rate limit fires before
                  fixture's parallel runner finishes",
      source="test_failure")>
  Test failed because the limiter doesn't differentiate the fixture's
  parallel workers from real concurrent users…

[Several more turns. Conversation approaches the context budget.]

[PreCompact hook fires automatically]
  → ingests transcript (47 chunks: 31 conversation, 9 code_change, 5 errors, 2 tests)
  → engine.save_state() to .semvec/
  → returns compacted context block to Claude Code
  → Claude Code uses it as the carry-over for the post-compaction window

[Session continues without losing what was learned this turn cycle.]

The next session starts the loop again — SessionStart loads everything the PreCompact hook just persisted, and pss_check_anti_resonance will flag any future "rate limit the login endpoint" proposal that does not handle parallel workers.

Troubleshooting¶

/mcp shows semvec red. Open the panel's logs. Common causes:

ModuleNotFoundError: No module named 'semvec' — Claude Code is calling the wrong Python. Switch "command" in .claude/settings.json to the absolute interpreter path of the venv where you ran pip install "semvec[coding]", or use the uv run variant shown above.
ModuleNotFoundError: No module named 'fastmcp' / 'sentence_transformers' — you installed bare semvec without the [coding] extra. Re-run pip install "semvec[coding]"; both packages are part of that extra. If you launch through uv run, run uv pip install 'semvec[coding]' in the project's uv environment.
connection timed out after 30000ms — the initial connect timeout is too short for your filesystem (typical on WSL2 with the project under /mnt/c/). Export MCP_TIMEOUT=180000 in ~/.bashrc / ~/.zshrc and restart Claude Code from that shell. See Startup timeout above.
Failed to reconnect ... -32001 (Request Timeout) after ~5 seconds — that is the reconnect timeout, which uses a separate MCP_CONNECT_TIMEOUT_MS env var with a 5000 ms default. Setting only MCP_TIMEOUT is not enough; export MCP_CONNECT_TIMEOUT_MS=180000 as well in the same shell. Verify with echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS".
Failed to connect with no further details in the panel — the server probably crashed before the MCP handshake. Run it by hand to see the real traceback: uv run python -m semvec.coding.mcp_server (or your venv's python -m semvec.coding.mcp_server).
Startup consistently > 2 minutes on WSL2 — the project sits under /mnt/c/, every import crosses the NTFS bridge. Move the project to a native Linux filesystem (~/dev/...); startup drops below 5 seconds and both timeout workarounds become unnecessary.
OSError: [Errno 2] No such file or directory: '.semvec' — set SEMVEC_STATE_DIR to an absolute path or pre-create the folder once.

Hooks never fire. Confirm Claude Code's settings file is the one you edited (.claude/settings.json in project root vs ~/.claude/settings.json). Run a sanity check:

echo '{"session_id":"test","cwd":"."}' | python -m semvec.coding.hooks.session_start

A correctly-configured hook prints the JSON result to stderr and the input to stdout. Any traceback you see here will also appear in Claude Code's hook log.

Hook fires but state never grows. The pre-compact hook only ingests a transcript when transcript_path exists and is non-empty. On very short sessions (compaction fires before any meaningful turns) this is normal. After a real coding session you should see ingested_chunks > 0 in the hook log.

pss_save() errors with LicenseExpiredError or RateLimitError. Community-tier rate limits (5 QPS sustained / 50 burst) are usually irrelevant for the MCP server pattern but can fire under high pss_update rates from automation. Set SEMVEC_LICENSE_KEY in the mcpServers.semvec.env block (and the hook commands' env if you fork them with custom envs); see Licensing.

SessionStart block appears but Claude Code ignores it. The agent treats the injected block as system context, not a hard rule. Strengthen CLAUDE.md if you need stricter behaviour — e.g. "Treat any invariant from pss_get_context as authoritative; surface conflicts rather than ignore them."

Multi-machine team. Commit .semvec/ to git. Each developer pulls the same LiteralCache and CodingEngine state. Conflicts on .semvec/state.json resolve like any other JSON merge — Semvec's checksum check rejects tampered snapshots, so a bad merge surfaces as a ValueError on next load_state() rather than as silent state drift.

Where to next¶

semvec.coding API reference — full tool semantics and the underlying CodingEngine class.
Coding (overview) — the three coding-stack usage paths (MCP, in-process, REST API) and when to pick which.
Cursor guide — the same MCP server, no lifecycle hooks.
Embedders — pick the right model for your repo language and domain.
Licensing — Pro / Enterprise tier bypasses the community rate limits, useful for batch ingestion of large monorepos.