Claude Code + Semvec Coding¶
Semvec's coding stack ships a FastMCP server (semvec.coding.mcp_server) that exposes the CodingEngine as MCP tools, plus two lifecycle hooks (SessionStart, PreCompact) that Claude Code fires automatically. The combination gives Claude Code the strongest persistence path Semvec offers — code memory loads on session start, conversation transcripts are ingested before compaction, and state is saved without you ever calling pss_save by hand.
If you're on Cursor instead, see the Cursor guide. Both editors share the same six MCP tools — Claude Code adds the two automatic hooks on top.
What you get¶
- Persistent memory across sessions. The LiteralCache carries decisions, invariants, error patterns, and code-pointer metadata between sessions. Reopening the project on Monday picks up where Friday left off.
- Automatic lifecycle hooks. Claude Code fires SessionStart when a chat opens and PreCompact before context compaction, and Semvec hooks into both. You do not have to instruct the agent to call pss_save() the way you do in Cursor.
- Anti-resonance. Before the agent proposes a non-trivial change, Claude Code can call pss_check_anti_resonance to ask whether a similar idea has already failed.
- Compacted context as system-prompt input. A 150–350-token block summarising prior work is exposed via pss_get_context and additionally pre-computed by the SessionStart hook so the first turn of a new session already has it.
- Six MCP tools, all stdio-transport. Same code path as the Cursor integration, with no Claude-Code-specific glue beyond the two hooks.
Prerequisites¶
- Claude Code CLI installed and authenticated (/login succeeded at least once).
- Python 3.10+ on your PATH. Claude Code invokes the MCP server and the hooks as subprocesses.
- A Python environment with semvec[coding] and a real embedder installed; pip install "semvec[coding]" pulls in both fastmcp and sentence-transformers.
Semvec refuses to fall back to hash-based pseudo-embeddings; both the server and the hooks hard-fail at start-up if no real embedder is reachable. See Embedders for alternatives (OpenAI, ONNX int8, custom).
Use a project-local venv
A project-local virtualenv keeps Claude Code's subprocesses isolated from your system Python. The settings shown below point Claude Code at the venv's interpreter explicitly so the wrong Python never wins.
Quick alternative: register via the Claude Code CLI
Instead of hand-editing .claude/settings.json, you can let the
Claude Code CLI write the entry for you. Run once from the
project root:
Verify with claude mcp list. On a successful connection it shows the
server with status ✓ Connected. The CLI writes the same mcpServers
block documented below; reach for the JSON path in Step 1 when
you need finer control over env, command, or the lifecycle
hooks.
Step 1 — Configure .claude/settings.json¶
Create or extend .claude/settings.json in the project root (or ~/.claude/settings.json for a global setup that applies to every project):
{
"mcpServers": {
"semvec": {
"command": "python",
"args": ["-m", "semvec.coding.mcp_server"],
"env": {
"SEMVEC_STATE_DIR": ".semvec",
"SEMVEC_WORKSPACE": ".",
"SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
"SEMVEC_EMBED_DEVICE": "cpu"
}
}
},
"hooks": {
"SessionStart": [
{"command": "python -m semvec.coding.hooks.session_start", "timeout": 10000}
],
"PreCompact": [
{"command": "python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
]
}
}
Two distinct things are wired up here:
- mcpServers.semvec: the FastMCP server. Claude Code spawns it on demand and routes the six pss_* tools through it.
- hooks.SessionStart / hooks.PreCompact: the two lifecycle subprocesses. Claude Code invokes them automatically (see Step 2 for what they do).
If you use a project-local venv, swap command for the absolute interpreter path so Claude Code does not pick up a system Python. Apply the same change to both hook commands:
"command": "/abs/path/to/project/.venv/bin/python",
"hooks": {
"SessionStart": [
{"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.session_start", "timeout": 10000}
],
"PreCompact": [
{"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
]
}
On Windows the path becomes C:\\abs\\path\\to\\project\\.venv\\Scripts\\python.exe (escape the backslashes inside JSON).
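The interpreter substitution described above can also be scripted. The following is a minimal sketch that assumes the settings layout shown in Step 1; `build_settings` is a hypothetical helper for illustration, not part of Semvec or Claude Code. Run it with the venv's own interpreter so that `sys.executable` is the path you want pinned.

```python
import json
import sys


def build_settings(python_path: str) -> dict:
    """Return the Step 1 settings with every command pinned to python_path.

    Hypothetical helper: it only mirrors the JSON shown in this guide.
    """

    def hook(module: str, timeout_ms: int) -> dict:
        # Same "<interpreter> -m <module>" command shape as the hand-written JSON.
        return {"command": f"{python_path} -m {module}", "timeout": timeout_ms}

    return {
        "mcpServers": {
            "semvec": {
                "command": python_path,
                "args": ["-m", "semvec.coding.mcp_server"],
                "env": {
                    "SEMVEC_STATE_DIR": ".semvec",
                    "SEMVEC_WORKSPACE": ".",
                    "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
                    "SEMVEC_EMBED_DEVICE": "cpu",
                },
            }
        },
        "hooks": {
            "SessionStart": [hook("semvec.coding.hooks.session_start", 10000)],
            "PreCompact": [hook("semvec.coding.hooks.pre_compact", 30000)],
        },
    }


if __name__ == "__main__":
    # json.dumps escapes Windows backslashes, so the output is valid JSON as-is.
    print(json.dumps(build_settings(sys.executable), indent=2))
```

Review the printed JSON before merging it into an existing settings file; the sketch prints a complete document and does not merge with keys you already have.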
Alternative: launch via uv
If the project is managed with uv,
let uv run resolve the project's interpreter on the fly so the
right venv is always picked — no absolute path needed:
{
"mcpServers": {
"semvec": {
"command": "uv",
"args": ["run", "python", "-m", "semvec.coding.mcp_server"],
"env": {
"SEMVEC_STATE_DIR": ".semvec",
"SEMVEC_WORKSPACE": ".",
"SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
"SEMVEC_EMBED_DEVICE": "cpu"
}
}
},
"hooks": {
"SessionStart": [
{"command": "uv run python -m semvec.coding.hooks.session_start", "timeout": 10000}
],
"PreCompact": [
{"command": "uv run python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
]
}
}
uv has to be on Claude Code's PATH for this to resolve. The
advantage over a hard-coded interpreter path: uv run
auto-discovers the project's pyproject.toml and works
cross-platform without escaping Windows backslashes.
Environment variables¶
| Variable | Default | Purpose |
|---|---|---|
| SEMVEC_STATE_DIR | .semvec | Where the engine persists LiteralCache, code pointers, and snapshots. Commit .semvec/ to git for cross-machine continuity, or add it to .gitignore for per-clone memory. |
| SEMVEC_WORKSPACE | (unset) | Repo root, used by pss_register_code to anchor file paths. Defaults to the directory Claude Code launched the subprocess in. |
| SEMVEC_EMBED_MODEL | all-MiniLM-L6-v2 | Sentence-transformer model. For German / multilingual repos use paraphrase-multilingual-mpnet-base-v2 (768 d). |
| SEMVEC_EMBED_DEVICE | cpu | cpu, cuda, or mps. |
| SEMVEC_LICENSE_KEY | (unset) | Pro / Enterprise JWT that bypasses the community-tier rate limits. See Licensing. |
The hooks read the same SEMVEC_* variables as the server, so one set of names and defaults covers both. Semvec does not phone home; there are no telemetry-related environment variables.
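To make the defaults in the table concrete, here is a minimal sketch of how a process could resolve these variables. It is an illustration built only from the table above, not Semvec's actual loader; `resolve_semvec_env` and the device validation are assumptions.

```python
from typing import Mapping, Optional

# Defaults copied from the table above. SEMVEC_WORKSPACE and
# SEMVEC_LICENSE_KEY are listed as "(unset)" and so have no entry here.
DEFAULTS = {
    "SEMVEC_STATE_DIR": ".semvec",
    "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
    "SEMVEC_EMBED_DEVICE": "cpu",
}
ALLOWED_DEVICES = {"cpu", "cuda", "mps"}


def resolve_semvec_env(environ: Mapping[str, str], cwd: str) -> dict:
    """Hypothetical resolver: apply the documented defaults to an environment."""
    cfg: dict[str, Optional[str]] = {
        name: environ.get(name, default) for name, default in DEFAULTS.items()
    }
    # An unset workspace falls back to the directory the subprocess started in.
    cfg["SEMVEC_WORKSPACE"] = environ.get("SEMVEC_WORKSPACE", cwd)
    # None means no licence key was supplied (community tier).
    cfg["SEMVEC_LICENSE_KEY"] = environ.get("SEMVEC_LICENSE_KEY")
    if cfg["SEMVEC_EMBED_DEVICE"] not in ALLOWED_DEVICES:
        raise ValueError(
            f"SEMVEC_EMBED_DEVICE must be one of {sorted(ALLOWED_DEVICES)}"
        )
    return cfg


if __name__ == "__main__":
    # An empty environment yields exactly the documented defaults.
    print(resolve_semvec_env({}, cwd="/path/to/project"))
```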
Startup timeout (important on WSL2)¶
The MCP server loads a sentence-embedding model on every cold start.
On a fresh install the weights are fetched from the network once;
afterwards they sit in the local Hugging Face cache
(~/.cache/huggingface/hub/). Even with a warm cache, the first
connect of a new session can take 60–150 seconds on some systems —
WSL2 with the project sitting under /mnt/c/... is the worst case,
because every Python module loads through the slow NTFS bridge
between WSL2 and Windows.
Claude Code uses two separate timeouts on the MCP path, and they have different defaults:
| Env var | Default | What it covers |
|---|---|---|
| MCP_TIMEOUT | 30000 ms | Tool calls and the initial connect when Claude Code first spawns the server. |
| MCP_CONNECT_TIMEOUT_MS | 5000 ms | Reconnect operations only: used by /mcp reconnect and any automatic reconnect after a transport drop. |
On slow filesystems the cold-start embedder load comfortably exceeds
both defaults, so both variables have to be raised. Setting only
MCP_TIMEOUT is the most common failure mode: the initial connect
works, but /mcp reconnect still trips the 5-second default.
Set both in the shell that launches Claude Code:
# Add to ~/.bashrc or ~/.zshrc:
export MCP_TIMEOUT=180000 # 180 seconds, in milliseconds
export MCP_CONNECT_TIMEOUT_MS=180000 # same value — covers reconnects
# Apply, verify, and (re-)launch:
source ~/.bashrc
echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS" # expect: 180000 / 180000
exit # if a Claude Code session is open
claude
env in ~/.claude/settings.json is not enough
The env block under mcpServers.semvec only governs the
environment of the server subprocess — it does not affect
Claude Code's own timeouts. Both MCP_TIMEOUT and
MCP_CONNECT_TIMEOUT_MS must be set in the parent shell from
which Claude Code itself is launched.
Verifying the connection¶
Inside Claude Code, open the /mcp panel.
A working setup shows semvec ✓ Connected. The first connect of a
new session can legitimately take 1–2 minutes on slow filesystems;
subsequent connects in the same session are near-instant because
the embedder is already loaded.
When the reconnect fails¶
If /mcp reconnect errors out with -32001 (Request Timeout) after
~5 seconds, the cause is almost always
MCP_CONNECT_TIMEOUT_MS defaulting to 5000. Confirm both
variables are set by running
echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS" in the shell that
launches Claude Code.
If MCP_CONNECT_TIMEOUT_MS is empty, add the export to
~/.bashrc / ~/.zshrc (see the exports above), source the file,
exit Claude Code, and relaunch; MCP_CONNECT_TIMEOUT_MS is picked
up on the next process start. A full restart also clears any
transient transport state the panel may be holding on to.
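The same check can be scripted and run from the shell that launches Claude Code. This is a minimal sketch; `timeout_problems` is a hypothetical helper, and the 180000 ms threshold is simply the example value used in this guide, not a required minimum.

```python
import os
from typing import Mapping

REQUIRED = ("MCP_TIMEOUT", "MCP_CONNECT_TIMEOUT_MS")


def timeout_problems(environ: Mapping[str, str], minimum_ms: int = 180_000) -> list[str]:
    """Return one message per timeout variable that is missing or too low."""
    problems = []
    for name in REQUIRED:
        raw = environ.get(name, "")
        if not raw.isdigit():
            problems.append(f"{name} is unset or not a whole number of milliseconds")
        elif int(raw) < minimum_ms:
            problems.append(f"{name}={raw} is below {minimum_ms} ms")
    return problems


if __name__ == "__main__":
    issues = timeout_problems(os.environ)
    for issue in issues:
        print(issue)
    if not issues:
        print("both MCP timeouts are set")
```

Setting only MCP_TIMEOUT shows up here as a single message naming MCP_CONNECT_TIMEOUT_MS, which is the failure mode described above.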
Performance note for WSL2 users¶
When the project directory sits under /mnt/c/..., every Python
import on startup goes through the slow NTFS bridge between WSL2 and
Windows. Moving the project to the native Linux filesystem
(~/dev/...) typically drops MCP-server startup time below 5
seconds and removes the need for the MCP_TIMEOUT /
MCP_CONNECT_TIMEOUT_MS workaround altogether.
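A quick way to tell whether a project is affected is to test its path. The sketch below assumes the default WSL2 automount layout, where Windows drives appear as /mnt/<drive letter>/; `on_windows_mount` is a hypothetical helper and does not detect custom mount points.

```python
import os
from pathlib import PurePosixPath


def on_windows_mount(path: str) -> bool:
    """True when path sits under /mnt/<single drive letter>/ (default WSL2 automount)."""
    parts = PurePosixPath(path).parts
    return (
        len(parts) >= 3
        and parts[0] == "/"
        and parts[1] == "mnt"
        and len(parts[2]) == 1
        and parts[2].isalpha()
    )


if __name__ == "__main__":
    cwd = os.getcwd()
    if on_windows_mount(cwd):
        print(f"{cwd} is on a Windows drive; expect slow MCP-server startup")
    else:
        print(f"{cwd} is not under /mnt/<drive>/")
```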
Step 2 — What the lifecycle hooks actually do¶
This is the headline feature compared to Cursor: Claude Code fires the hooks for you, so persistence is not a thing the agent has to remember.
SessionStart — fires when a chat opens¶
Claude Code writes a JSON payload to the hook's stdin:
The hook (semvec.coding.hooks.session_start):
- Resolves SEMVEC_STATE_DIR → opens the CodingEngine against it.
- Builds an embedder from SEMVEC_EMBED_MODEL / SEMVEC_EMBED_DEVICE.
- Calls engine.load_state(), which restores LiteralCache, CodePointerIndex, NegativeAttractorSet from disk.
- Calls engine.get_compacted_context(task="session start <session_id>"), which produces a token-budgeted Markdown block of phase, top-K relevant memories, code pointers, anti-resonance warnings, invariants, latest test summary.
- Writes the result dict ({status, state_loaded, code_pointers, negative_attractors, context, session_id}) to stderr.
Claude Code injects the stderr block into the system prompt of the new session. Concretely: the first user message in the new chat already sees Phase: stability · Turn 47 · 12 memories · 8 code pointers · 3 invariants and the relevant excerpts as context. No tool call needed.
PreCompact — fires before context compaction¶
Claude Code triggers compaction automatically when the conversation approaches its context budget; you can also fire it manually with /compact. Just before either path proceeds, Claude Code writes:
{
"session_id": "abc123",
"transcript_path": "/path/to/transcript.jsonl",
"cwd": "/path/to/project",
"trigger": "auto"
}
The hook (semvec.coding.hooks.pre_compact):
- Opens the engine the same way as SessionStart.
- Calls engine.ingest_transcript(transcript_path), which parses the Claude Code JSONL transcript via TranscriptParser and folds every CONVERSATION / CODE_CHANGE / TEST_FAILURE / ERROR chunk into PSS state.
- Calls engine.save_state(), which persists .semvec/.
- Calls engine.get_compacted_context(task="session <session_id>"), the same compaction surface as on session start.
- Writes {status, ingested_chunks, context, session_id, trigger} to stderr.
Claude Code uses the stderr block as the carry-over for the post-compaction context window. Net effect: the long history collapses into a compact persistent block, and the session continues without losing the design decisions / error patterns / code pointers it just learned.
Hook output contract
Both hooks pass the original stdin through unchanged on stdout — Claude Code expects that. The result dict goes only to stderr. If you tail the hooks' subprocess output for debugging, watch stderr.
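The contract described above can be sketched as a small skeleton. This is an illustration of the stdin/stdout/stderr split only, not Semvec's hook code: `run_hook` and `fake_session_start` are hypothetical names, the handler does no engine work, and the "ok" status value is a placeholder.

```python
import io
import json
from typing import Callable, TextIO


def run_hook(handler: Callable[[dict], dict],
             stdin: TextIO, stdout: TextIO, stderr: TextIO) -> None:
    """Read the payload, echo it unchanged on stdout, write the result to stderr."""
    raw = stdin.read()
    payload = json.loads(raw) if raw.strip() else {}
    result = handler(payload)
    stdout.write(raw)                         # original stdin, byte for byte
    stderr.write(json.dumps(result) + "\n")   # result dict goes only to stderr


def fake_session_start(payload: dict) -> dict:
    # Stand-in for the real handler: fixed values, no engine, no embedder.
    return {
        "status": "ok",  # placeholder value, not taken from Semvec
        "state_loaded": False,
        "code_pointers": 0,
        "negative_attractors": 0,
        "context": "",
        "session_id": payload.get("session_id"),
    }


if __name__ == "__main__":
    out, err = io.StringIO(), io.StringIO()
    run_hook(fake_session_start, io.StringIO('{"session_id": "abc123"}'), out, err)
    print("stdout:", out.getvalue())
    print("stderr:", err.getvalue().strip())
```

Passing the streams in explicitly keeps the skeleton easy to exercise with in-memory buffers, as the usage at the bottom does.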
Step 3 — CLAUDE.md for project-level memory rules¶
Claude Code reads CLAUDE.md (project root) and ~/.claude/CLAUDE.md (global) as system-prompt-level instructions on every conversation in the project — Claude Code's equivalent of a Cursor Rule. Add a Semvec-aware block to CLAUDE.md so the agent uses the MCP tools naturally:
## Semvec persistent memory
This project uses Semvec for persistent coding memory across sessions.
The `SessionStart` and `PreCompact` hooks already load and persist
state for you. In addition:
- Before proposing a non-trivial change, call `pss_check_anti_resonance(proposal=<the change>)` and bail out if it matches a known-bad pattern.
- After a substantive code edit, call `pss_update(text=..., update_type="code_change", file_path=..., signature=...)`.
- After a runtime / test failure, call `pss_record_error(error_text=..., source="test_failure")` (or `"runtime_error"` / `"user_correction"`).
- Treat any invariant returned by `pss_get_context` as authoritative — surface conflicts rather than ignoring them.
Do not narrate these calls — invoke them silently and weave their output into your normal answer.
Drop the rule into version control. Every conversation in the project picks it up automatically.
Hook coverage vs CLAUDE.md
The hooks cover load and save automatically; CLAUDE.md covers the per-turn habits the agent should adopt (anti-resonance check, error recording, code-change capture). Cursor needs all of this in its rule because it has no hook surface.
Step 4 — Sanity-check¶
Open a fresh Claude Code chat in the project and run the /mcp slash command.
You should see semvec listed with six tools (pss_get_context, pss_update, pss_check_anti_resonance, pss_register_code, pss_record_error, pss_save). Then ask the chat to call pss_get_context.
On a fresh state directory the response is a header-only block (Phase: initialization · Turn 0 · 0 memories). After a few real coding turns the same call returns a populated context, and after you /compact once, the block survives the compaction.
To confirm the hooks are wired up correctly, watch the Claude Code log on the next session start:
Anything other than state_loaded=true after a session that actually wrote something means the state directory is wrong (see Troubleshooting).
Tool reference¶
The same six tools as the Cursor integration — Claude Code calls them via stdio MCP transport.
| Tool | Purpose |
|---|---|
| pss_get_context(task, invariants?, test_summary?) | Token-budgeted summary of prior work: phase, top-K relevant memories, code pointers, anti-resonance warnings, plus your current invariants and the latest test summary. |
| pss_update(text, update_type, file_path?, signature?) | Fold one observation into PSS state. update_type ∈ conversation, code_change, error, test_failure. |
| pss_check_anti_resonance(proposal) | Look the proposal up against past error patterns; returns a warning string when it resembles something that already failed. |
| pss_register_code(file_path, intent, signature) | Record a code pointer ({file_path, intent, signature}) so future pss_get_context calls can surface it. |
| pss_record_error(error_text, source) | Write an error pattern. source ∈ test_failure, runtime_error, user_correction. |
| pss_save() | Persist the engine state to SEMVEC_STATE_DIR. The PreCompact hook also auto-persists; this is the explicit flush for end-of-session. |
For the full Python API behind the tools see semvec.coding.
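For scripts that drive the tools directly, the enumerated values in the table are easy to get wrong. The sketch below validates them client-side and wraps the arguments in an MCP tools/call request (JSON-RPC 2.0). The helper names are hypothetical, and the server may apply further checks that are not shown here.

```python
import json
from typing import Optional

# Allowed values copied from the tool table above.
UPDATE_TYPES = {"conversation", "code_change", "error", "test_failure"}
ERROR_SOURCES = {"test_failure", "runtime_error", "user_correction"}


def pss_update_args(text: str, update_type: str,
                    file_path: Optional[str] = None,
                    signature: Optional[str] = None) -> dict:
    """Build pss_update arguments, leaving out optional fields that are unset."""
    if update_type not in UPDATE_TYPES:
        raise ValueError(f"update_type must be one of {sorted(UPDATE_TYPES)}")
    args = {"text": text, "update_type": update_type}
    if file_path is not None:
        args["file_path"] = file_path
    if signature is not None:
        args["signature"] = signature
    return args


def pss_record_error_args(error_text: str, source: str) -> dict:
    """Build pss_record_error arguments after checking the source value."""
    if source not in ERROR_SOURCES:
        raise ValueError(f"source must be one of {sorted(ERROR_SOURCES)}")
    return {"error_text": error_text, "source": source}


def tools_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialise one MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


if __name__ == "__main__":
    args = pss_update_args("added rate limit on POST /login",
                           "code_change", file_path="src/auth/routes.py")
    print(tools_call(1, "pss_update", args))
```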
Differences from Cursor¶
| Capability | Claude Code | Cursor |
|---|---|---|
| MCP tool registration | .claude/settings.json (per-project) or ~/.claude/settings.json (global) | .cursor/mcp.json (per-project) or ~/.cursor/mcp.json (global) |
| SessionStart hook | Auto-fires: handle_session_start loads state and pre-computes a context block | Not available; replicated through a Cursor Rule that asks the agent to call pss_get_context first |
| PreCompact hook | Auto-fires: handle_pre_compact ingests transcript, persists state, returns compacted context | Not available; Cursor manages its own context window opaquely, so persistence depends on the rule asking the agent to call pss_save() |
| Project-level rules | CLAUDE.md in project root | .cursor/rules/semvec.mdc |
| @semvec mention | Direct MCP-tool call from chat | Same; Cursor lets you reference an MCP server by name |
| State location | .semvec/ in project root (default) | .semvec/ in project root (default) |
| Multi-machine team | Commit .semvec/ to git, both editors read the same files | Same |
The hook surface is the headline reason Claude Code is the smoother integration — it pulls persistence out of the agent's instruction set entirely. Cursor needs the agent to remember to call pss_save(); Claude Code does not.
Optional — git pre-push hook for guaranteed persistence¶
If your team relies on pss_save() running before every push (defence-in-depth, even though PreCompact already persists on every compaction), install a git pre-push hook:
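#!/usr/bin/env bash
# .git/hooks/pre-push
# MCP over stdio is JSON-RPC 2.0: the server expects an initialize handshake
# before tools/call, and every request carries "jsonrpc" and "id" fields.
python -m semvec.coding.mcp_server <<'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"git-pre-push","version":"0"}}}
{"jsonrpc":"2.0","method":"notifications/initialized"}
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"pss_save","arguments":{}}}
EOF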
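This runs on every push from a clone where the hook is installed. Git does not version .git/hooks/, so each developer has to install the script locally and mark it executable. It is a safety net for the rare case where Claude Code's own auto-save did not happen (e.g. a hard kill before PreCompact fired).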
End-to-end example session¶
[SessionStart hook fires automatically]
→ loads .semvec/ state: 12 pointers, 3 attractors
→ context block injected into Claude Code's first system prompt:
"Phase: stability · Turn 47 · 12 memories · 3 invariants
Last decision (CP4): Use mpnet 768d for German content
Known error: dropping JWT subject claim breaks cluster auth tests"
User: I'm picking up the auth-flow refactor. Where did we leave off?
Claude Code: <calls pss_get_context(task="auth-flow refactor")>
Last session you renamed `authenticate()` to `verify_credentials()`
in `src/auth/service.py`. Open invariant: never log plaintext
passwords. Known error pattern: dropping the JWT subject claim
breaks the cluster auth tests (CP3).
User: Add rate limiting to the login endpoint.
Claude Code: <calls pss_check_anti_resonance(proposal="rate limit login endpoint")>
No prior error pattern matches. Proposing slowapi-based limiter…
<writes src/auth/routes.py>
<calls pss_update(text="added rate limit on POST /login (10/min/IP) via slowapi",
update_type="code_change",
file_path="src/auth/routes.py",
signature="rate_limited(10/min/IP)")>
Done — 10 requests/min/IP via slowapi, returns 429 with Retry-After.
User: Run the auth tests.
Claude Code: <runs pytest, sees 1 failure>
<calls pss_record_error(
error_text="test_concurrent_logins fails: rate limit fires before
fixture's parallel runner finishes",
source="test_failure")>
Test failed because the limiter doesn't differentiate the fixture's
parallel workers from real concurrent users…
[Several more turns. Conversation approaches the context budget.]
[PreCompact hook fires automatically]
→ ingests transcript (47 chunks: 31 conversation, 9 code_change, 5 errors, 2 tests)
→ engine.save_state() to .semvec/
→ returns compacted context block to Claude Code
→ Claude Code uses it as the carry-over for the post-compaction window
[Session continues without losing what was learned this turn cycle.]
The next session starts the loop again — SessionStart loads everything the PreCompact hook just persisted, and pss_check_anti_resonance will flag any future "rate limit the login endpoint" proposal that does not handle parallel workers.
Troubleshooting¶
/mcp shows semvec red.
Open the panel's logs. Common causes:
- ModuleNotFoundError: No module named 'semvec'. Claude Code is calling the wrong Python. Switch "command" in .claude/settings.json to the absolute interpreter path of the venv where you ran pip install "semvec[coding]", or use the uv run variant shown above.
- ModuleNotFoundError: No module named 'fastmcp' / 'sentence_transformers'. You installed bare semvec without the [coding] extra. Re-run pip install "semvec[coding]"; both packages are part of that extra. If you launch through uv run, run uv pip install 'semvec[coding]' in the project's uv environment.
- connection timed out after 30000ms. The initial connect timeout is too short for your filesystem (typical on WSL2 with the project under /mnt/c/). Export MCP_TIMEOUT=180000 in ~/.bashrc / ~/.zshrc and restart Claude Code from that shell. See Startup timeout above.
- Failed to reconnect ... -32001 (Request Timeout) after ~5 seconds. That is the reconnect timeout, which uses a separate MCP_CONNECT_TIMEOUT_MS env var with a 5000 ms default. Setting only MCP_TIMEOUT is not enough; export MCP_CONNECT_TIMEOUT_MS=180000 as well in the same shell. Verify with echo "$MCP_TIMEOUT / $MCP_CONNECT_TIMEOUT_MS".
- Failed to connect with no further details in the panel. The server probably crashed before the MCP handshake. Run it by hand to see the real traceback: uv run python -m semvec.coding.mcp_server (or your venv's python -m semvec.coding.mcp_server).
- Startup consistently > 2 minutes on WSL2. The project sits under /mnt/c/, so every import crosses the NTFS bridge. Move the project to a native Linux filesystem (~/dev/...); startup drops below 5 seconds and both timeout workarounds become unnecessary.
- OSError: [Errno 2] No such file or directory: '.semvec'. Set SEMVEC_STATE_DIR to an absolute path or pre-create the folder once.
Hooks never fire.
Confirm Claude Code's settings file is the one you edited (.claude/settings.json in project root vs ~/.claude/settings.json). Run a sanity check:
A correctly-configured hook prints the JSON result to stderr and the input to stdout. Any traceback you see here will also appear in Claude Code's hook log.
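If you prefer to run that check from Python, the sketch below feeds a payload to a hook command and verifies the contract described above. `check_hook` is a hypothetical helper. It assumes stderr carries nothing but the JSON result, so extra log lines from the embedder would make the final parse fail.

```python
import json
import subprocess
import sys


def check_hook(command: list[str], payload: dict, timeout_s: float = 30.0) -> dict:
    """Run a hook command, check the stdout passthrough, return the stderr result."""
    raw = json.dumps(payload)
    proc = subprocess.run(command, input=raw, capture_output=True,
                          text=True, timeout=timeout_s)
    if proc.returncode != 0:
        raise RuntimeError(f"hook exited with {proc.returncode}:\n{proc.stderr}")
    # Compare stripped text so a trailing newline does not count as a change.
    if proc.stdout.strip() != raw.strip():
        raise RuntimeError("hook did not pass stdin through on stdout")
    # Assumption: stderr holds only the JSON result dict.
    return json.loads(proc.stderr)


if __name__ == "__main__":
    # Stand-in hook so the sketch runs without Semvec installed. To test the
    # real hook, replace the command with
    # [sys.executable, "-m", "semvec.coding.hooks.session_start"].
    stand_in = [
        sys.executable, "-c",
        "import sys, json; raw = sys.stdin.read(); sys.stdout.write(raw); "
        "sys.stderr.write(json.dumps({'session_id': json.loads(raw)['session_id']}))",
    ]
    print(check_hook(stand_in, {"session_id": "manual-test"}))
```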
Hook fires but state never grows.
The pre-compact hook only ingests a transcript when transcript_path exists and is non-empty. On very short sessions (compaction fires before any meaningful turns) this is normal. After a real coding session you should see ingested_chunks > 0 in the hook log.
pss_save() errors with LicenseExpiredError or RateLimitError.
Community-tier rate limits (5 QPS sustained / 50 burst) are usually irrelevant for the MCP server pattern but can fire under high pss_update rates from automation. Set SEMVEC_LICENSE_KEY in the mcpServers.semvec.env block (and the hook commands' env if you fork them with custom envs); see Licensing.
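Automation that issues many pss_update calls can stay under the community-tier numbers by throttling on the client side. The sketch below is a standard token bucket parameterised with the 5 QPS / 50 burst figures quoted above. It is an assumption that a token bucket matches how Semvec enforces the limit; the class is illustrative and not part of Semvec.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side throttle: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate: float = 5.0, burst: int = 50,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)   # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket()
    sent = sum(bucket.try_acquire() for _ in range(60))
    print(f"{sent} of 60 back-to-back calls allowed")
```

Call try_acquire() before each tool call and sleep briefly when it returns False.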
SessionStart block appears but Claude Code ignores it.
The agent treats the injected block as system context, not a hard rule. Strengthen CLAUDE.md if you need stricter behaviour — e.g. "Treat any invariant from pss_get_context as authoritative; surface conflicts rather than ignore them."
Multi-machine team.
Commit .semvec/ to git. Each developer pulls the same LiteralCache and CodingEngine state. Conflicts on .semvec/state.json resolve like any other JSON merge — Semvec's checksum check rejects tampered snapshots, so a bad merge surfaces as a ValueError on next load_state() rather than as silent state drift.
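To illustrate why a bad merge fails loudly instead of drifting silently, here is a minimal sketch of checksum-guarded JSON. The on-disk layout, the SHA-256 choice, and the function names are all assumptions made for illustration; Semvec's actual snapshot format is not described in this guide.

```python
import hashlib
import json


def _digest(state: dict) -> str:
    # Canonical form (sorted keys, no whitespace) so the hash is stable.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dump_snapshot(state: dict) -> str:
    """Serialise state together with a checksum over its canonical form."""
    return json.dumps({"checksum": _digest(state), "state": state}, indent=2)


def load_snapshot(text: str) -> dict:
    """Parse a snapshot and raise ValueError if the content does not match."""
    doc = json.loads(text)  # json.JSONDecodeError is itself a ValueError
    if _digest(doc["state"]) != doc["checksum"]:
        raise ValueError("snapshot checksum mismatch")
    return doc["state"]


if __name__ == "__main__":
    text = dump_snapshot({"turn": 47, "invariants": ["never log plaintext passwords"]})
    print(load_snapshot(text))
    try:
        load_snapshot(text.replace("47", "48"))  # simulate a hand-edited merge
    except ValueError as exc:
        print("rejected:", exc)
```

In this sketch, leftover merge-conflict markers fail at the JSON parse step and a merged-but-inconsistent file fails at the checksum comparison; both surface as ValueError.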
Where to next¶
- semvec.coding API reference: full tool semantics and the underlying CodingEngine class.
- Coding (overview): the three coding-stack usage paths (MCP, in-process, REST API) and when to pick which.
- Cursor guide — the same MCP server, no lifecycle hooks.
- Embedders — pick the right model for your repo language and domain.
- Licensing — Pro / Enterprise tier bypasses the community rate limits, useful for batch ingestion of large monorepos.