Skip to content

Claude Code + Semvec Coding

Semvec's coding stack ships a FastMCP server (semvec.coding.mcp_server) that exposes the CodingEngine as MCP tools, plus two lifecycle hooks (SessionStart, PreCompact) that Claude Code fires automatically. The combination gives Claude Code the strongest persistence path Semvec offers — code memory loads on session start, conversation transcripts are ingested before compaction, and state is saved without you ever calling pss_save by hand.

If you're on Cursor instead, see the Cursor guide. Both editors share the same six MCP tools — Claude Code adds the two automatic hooks on top.

What you get

  • Persistent memory across sessions. The LiteralCache carries decisions, invariants, error patterns, and code-pointer metadata between sessions. Reopening the project on Monday picks up where Friday left off.
  • Automatic lifecycle hooks. Claude Code fires SessionStart when a chat opens and PreCompact before context compaction — Semvec hooks into both. You do not have to instruct the agent to call pss_save() the way you do in Cursor.
  • Anti-resonance. Before the agent proposes a non-trivial change, Claude Code can call pss_check_anti_resonance to ask whether a similar idea has already failed.
  • Compacted context as system-prompt input. A 150–350-token block summarising prior work is exposed via pss_get_context and additionally pre-computed by the SessionStart hook so the first turn of a new session already has it.
  • Six MCP tools, all stdio-transport. Same code path as the Cursor integration — no Claude-Code-specific glue beyond the two hooks.

Prerequisites

  • Claude Code CLI installed and authenticated (/login succeeded at least once).
  • Python 3.10+ on your PATH. Claude Code invokes the MCP server and the hooks as subprocesses.
  • A Python environment with semvec[coding] and a real embedder installed:
pip install "semvec[coding]" sentence-transformers

Semvec refuses to fall back to hash-based pseudo-embeddings; both the server and the hooks hard-fail at start-up if no real embedder is reachable. See Embedders for alternatives (OpenAI, ONNX int8, custom).

Use a project-local venv

A project-local virtualenv keeps Claude Code's subprocesses isolated from your system Python. The settings shown below point Claude Code at the venv's interpreter explicitly so the wrong Python never wins.

Step 1 — Configure .claude/settings.json

Create or extend .claude/settings.json in the project root (or ~/.claude/settings.json for a global setup that applies to every project):

{
  "mcpServers": {
    "semvec": {
      "command": "python",
      "args": ["-m", "semvec.coding.mcp_server"],
      "env": {
        "SEMVEC_STATE_DIR": ".semvec",
        "SEMVEC_WORKSPACE": ".",
        "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2",
        "SEMVEC_EMBED_DEVICE": "cpu",
        "SEMVEC_TELEMETRY_QUIET": "1"
      }
    }
  },
  "hooks": {
    "SessionStart": [
      {"command": "python -m semvec.coding.hooks.session_start", "timeout": 10000}
    ],
    "PreCompact": [
      {"command": "python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
    ]
  }
}

Two distinct things are wired up here:

  1. mcpServers.semvec — the FastMCP server. Claude Code spawns it on demand and routes the six pss_* tools through it.
  2. hooks.SessionStart / hooks.PreCompact — the two lifecycle subprocesses. Claude Code invokes them automatically (see Step 2 for what they do).

If you use a project-local venv, swap command for the absolute interpreter path so Claude Code does not pick up a system Python. Apply the same change to both hook commands:

"command": "/abs/path/to/project/.venv/bin/python",
"hooks": {
  "SessionStart": [
    {"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.session_start", "timeout": 10000}
  ],
  "PreCompact": [
    {"command": "/abs/path/to/project/.venv/bin/python -m semvec.coding.hooks.pre_compact", "timeout": 30000}
  ]
}

On Windows the path becomes C:\\abs\\path\\to\\project\\.venv\\Scripts\\python.exe (escape the backslashes inside JSON).

Environment variables

Variable Default Purpose
SEMVEC_STATE_DIR .semvec Where the engine persists LiteralCache, code pointers, and snapshots. Commit .semvec/ to git for cross-machine continuity, or add it to .gitignore for per-clone memory.
SEMVEC_WORKSPACE (unset) Repo root — used by pss_register_code to anchor file paths. Defaults to the directory Claude Code launched the subprocess in.
SEMVEC_EMBED_MODEL all-MiniLM-L6-v2 Sentence-transformer model. For German / multilingual repos use paraphrase-multilingual-mpnet-base-v2 (768 d).
SEMVEC_EMBED_DEVICE cpu cpu, cuda, or mps.
SEMVEC_LICENSE_KEY (unset) Pro / Enterprise JWT — bypasses the community-tier rate limits. See Licensing.
SEMVEC_TELEMETRY_QUIET (unset) Set to 1 to silence the one-off telemetry stderr notice on first import.
SEMVEC_TELEMETRY (unset) Set to 0 to disable the anonymous init ping entirely.

The hooks read the same prefix; one place to set them.

Step 2 — What the lifecycle hooks actually do

This is the headline feature compared to Cursor: Claude Code fires the hooks for you, so persistence is not a thing the agent has to remember.

SessionStart — fires when a chat opens

Claude Code writes a JSON payload to the hook's stdin:

{ "session_id": "abc123", "cwd": "/path/to/project" }

The hook (semvec.coding.hooks.session_start):

  1. Resolves SEMVEC_STATE_DIR → opens the CodingEngine against it.
  2. Builds an embedder from SEMVEC_EMBED_MODEL / SEMVEC_EMBED_DEVICE.
  3. Calls engine.load_state() — restores LiteralCache, CodePointerIndex, NegativeAttractorSet from disk.
  4. Calls engine.get_compacted_context(task="session start <session_id>") — produces a token-budgeted Markdown block of phase, top-K relevant memories, code pointers, anti-resonance warnings, invariants, latest test summary.
  5. Writes the result dict ({status, state_loaded, code_pointers, negative_attractors, context, session_id}) to stderr.

Claude Code injects the stderr block into the system prompt of the new session. Concretely: the first user message in the new chat already sees Phase: stability · Turn 47 · 12 memories · 8 code pointers · 3 invariants and the relevant excerpts as context. No tool call needed.

PreCompact — fires before context compaction

Claude Code triggers compaction automatically when the conversation approaches its context budget; you can also fire it manually with /compact. Just before either path proceeds, Claude Code writes:

{
  "session_id": "abc123",
  "transcript_path": "/path/to/transcript.jsonl",
  "cwd": "/path/to/project",
  "trigger": "auto"
}

The hook (semvec.coding.hooks.pre_compact):

  1. Opens the engine the same way as SessionStart.
  2. Calls engine.ingest_transcript(transcript_path) — parses the Claude Code JSONL transcript via TranscriptParser and folds every CONVERSATION / CODE_CHANGE / TEST_FAILURE / ERROR chunk into PSS state.
  3. Calls engine.save_state() — persists .semvec/.
  4. Calls engine.get_compacted_context(task="session <session_id>") — same compaction surface as on session start.
  5. Writes {status, ingested_chunks, context, session_id, trigger} to stderr.

Claude Code uses the stderr block as the carry-over for the post-compaction context window. Net effect: the long history collapses into a compact persistent block, and the session continues without losing the design decisions / error patterns / code pointers it just learned.

Hook output contract

Both hooks pass the original stdin through unchanged on stdout — Claude Code expects that. The result dict goes only to stderr. If you tail the hooks' subprocess output for debugging, watch stderr.

Step 3 — CLAUDE.md for project-level memory rules

Claude Code reads CLAUDE.md (project root) and ~/.claude/CLAUDE.md (global) as system-prompt-level instructions on every conversation in the project — Claude Code's equivalent of a Cursor Rule. Add a Semvec-aware block to CLAUDE.md so the agent uses the MCP tools naturally:

## Semvec persistent memory

This project uses Semvec for persistent coding memory across sessions.
The `SessionStart` and `PreCompact` hooks already load and persist
state for you. In addition:

- Before proposing a non-trivial change, call `pss_check_anti_resonance(proposal=<the change>)` and bail out if it matches a known-bad pattern.
- After a substantive code edit, call `pss_update(text=..., update_type="code_change", file_path=..., signature=...)`.
- After a runtime / test failure, call `pss_record_error(error_text=..., source="test_failure")` (or `"runtime_error"` / `"user_correction"`).
- Treat any invariant returned by `pss_get_context` as authoritative — surface conflicts rather than ignoring them.

Do not narrate these calls — invoke them silently and weave their output into your normal answer.

Drop the rule into version control. Every conversation in the project picks it up automatically.

Hook coverage vs CLAUDE.md

The hooks cover load and save automatically; CLAUDE.md covers the per-turn habits the agent should adopt (anti-resonance check, error recording, code-change capture). Cursor needs all of this in its rule because it has no hook surface.

Step 4 — Sanity-check

Open a fresh Claude Code chat in the project and run the slash command:

/mcp

You should see semvec listed with six tools (pss_get_context, pss_update, pss_check_anti_resonance, pss_register_code, pss_record_error, pss_save). Then ask the chat:

Use pss_get_context to summarise prior work in this repo.

On a fresh state directory the response is a header-only block (Phase: initialization · Turn 0 · 0 memories). After a few real coding turns the same call returns a populated context — and after you /compact once, the block survives the compaction.

To confirm the hooks are wired up correctly, watch the Claude Code log on the next session start:

[hook session_start] state_loaded=true pointers=12 attractors=3

Anything other than state_loaded=true after a session that actually wrote something means the state directory is wrong (see Troubleshooting).

Tool reference

The same six tools as the Cursor integration — Claude Code calls them via stdio MCP transport.

Tool Purpose
pss_get_context(task, invariants?, test_summary?) Token-budgeted summary of prior work — phase, top-K relevant memories, code pointers, anti-resonance warnings, plus your current invariants and the latest test summary.
pss_update(text, update_type, file_path?, signature?) Fold one observation into PSS state. update_typeconversation, code_change, error, test_failure.
pss_check_anti_resonance(proposal) Look the proposal up against past error patterns; returns a warning string when it resembles something that already failed.
pss_register_code(file_path, intent, signature) Record a code pointer ({file_path, intent, signature}) so future pss_get_context calls can surface it.
pss_record_error(error_text, source) Write an error pattern. sourcetest_failure, runtime_error, user_correction.
pss_save() Persist the engine state to SEMVEC_STATE_DIR. The PreCompact hook also auto-persists; this is the explicit flush for end-of-session.

For the full Python API behind the tools see semvec.coding.

Differences from Cursor

Capability Claude Code Cursor
MCP tool registration .claude/settings.json (per-project) or ~/.claude/settings.json (global) .cursor/mcp.json (per-project) or ~/.cursor/mcp.json (global)
SessionStart hook Auto-fireshandle_session_start loads state and pre-computes a context block Not available — replicated through a Cursor Rule that asks the agent to call pss_get_context first
PreCompact hook Auto-fireshandle_pre_compact ingests transcript, persists state, returns compacted context Not available — Cursor manages its own context window opaquely; persistence depends on the rule asking the agent to call pss_save()
Project-level rules CLAUDE.md in project root .cursor/rules/semvec.mdc
@semvec mention Direct MCP-tool call from chat Same — Cursor lets you reference an MCP server by name
State location .semvec/ in project root (default) .semvec/ in project root (default)
Multi-machine team Commit .semvec/ to git, both editors read the same files Same

The hook surface is the headline reason Claude Code is the smoother integration — it pulls persistence out of the agent's instruction set entirely. Cursor needs the agent to remember to call pss_save(); Claude Code does not.

Optional — git pre-push hook for guaranteed persistence

If your team relies on pss_save() running before every push (defence-in-depth, even though PreCompact already persists on every compaction), install a git pre-push hook:

#!/usr/bin/env bash
# .git/hooks/pre-push
python -m semvec.coding.mcp_server <<EOF
{"method":"tools/call","params":{"name":"pss_save","arguments":{}}}
EOF

This runs whenever anyone on the team pushes — a safety net for the rare case where Claude Code's own auto-save did not happen (e.g. a hard kill before PreCompact fired).

End-to-end example session

[SessionStart hook fires automatically]
  → loads .semvec/ state: 12 pointers, 3 attractors
  → context block injected into Claude Code's first system prompt:
    "Phase: stability · Turn 47 · 12 memories · 3 invariants
     Last decision (CP4): Use mpnet 768d for German content
     Known error: dropping JWT subject claim breaks cluster auth tests"

User: I'm picking up the auth-flow refactor. Where did we leave off?
Claude Code: <calls pss_get_context(task="auth-flow refactor")>
  Last session you renamed `authenticate()` to `verify_credentials()`
  in `src/auth/service.py`. Open invariant: never log plaintext
  passwords. Known error pattern: dropping the JWT subject claim
  breaks the cluster auth tests (CP3).

User: Add rate limiting to the login endpoint.
Claude Code: <calls pss_check_anti_resonance(proposal="rate limit login endpoint")>
  No prior error pattern matches. Proposing slowapi-based limiter…
  <writes src/auth/routes.py>
  <calls pss_update(text="added rate limit on POST /login (10/min/IP) via slowapi",
                    update_type="code_change",
                    file_path="src/auth/routes.py",
                    signature="rate_limited(10/min/IP)")>
  Done — 10 requests/min/IP via slowapi, returns 429 with Retry-After.

User: Run the auth tests.
Claude Code: <runs pytest, sees 1 failure>
  <calls pss_record_error(
      error_text="test_concurrent_logins fails: rate limit fires before
                  fixture's parallel runner finishes",
      source="test_failure")>
  Test failed because the limiter doesn't differentiate the fixture's
  parallel workers from real concurrent users…

[Several more turns. Conversation approaches the context budget.]

[PreCompact hook fires automatically]
  → ingests transcript (47 chunks: 31 conversation, 9 code_change, 5 errors, 2 tests)
  → engine.save_state() to .semvec/
  → returns compacted context block to Claude Code
  → Claude Code uses it as the carry-over for the post-compaction window

[Session continues without losing what was learned this turn cycle.]

The next session starts the loop again — SessionStart loads everything the PreCompact hook just persisted, and pss_check_anti_resonance will flag any future "rate limit the login endpoint" proposal that does not handle parallel workers.

Troubleshooting

/mcp shows semvec red. Open the panel's logs. Common causes:

  • ModuleNotFoundError: No module named 'semvec' — Claude Code is calling the wrong Python. Switch "command" in .claude/settings.json to the absolute interpreter path of the venv where you ran pip install "semvec[coding]".
  • RuntimeError: sentence-transformers is required — install it (pip install sentence-transformers). Semvec deliberately has no hash-fallback embedder.
  • OSError: [Errno 2] No such file or directory: '.semvec' — set SEMVEC_STATE_DIR to an absolute path or pre-create the folder once.

Hooks never fire. Confirm Claude Code's settings file is the one you edited (.claude/settings.json in project root vs ~/.claude/settings.json). Run a sanity check:

echo '{"session_id":"test","cwd":"."}' | python -m semvec.coding.hooks.session_start

A correctly-configured hook prints the JSON result to stderr and the input to stdout. Any traceback you see here will also appear in Claude Code's hook log.

Hook fires but state never grows. The pre-compact hook only ingests a transcript when transcript_path exists and is non-empty. On very short sessions (compaction fires before any meaningful turns) this is normal. After a real coding session you should see ingested_chunks > 0 in the hook log.

pss_save() errors with LicenseExpiredError or RateLimitError. Community-tier rate limits (5 QPS sustained / 50 burst) are usually irrelevant for the MCP server pattern but can fire under high pss_update rates from automation. Set SEMVEC_LICENSE_KEY in the mcpServers.semvec.env block (and the hook commands' env if you fork them with custom envs); see Licensing.

SessionStart block appears but Claude Code ignores it. The agent treats the injected block as system context, not a hard rule. Strengthen CLAUDE.md if you need stricter behaviour — e.g. "Treat any invariant from pss_get_context as authoritative; surface conflicts rather than ignore them."

Multi-machine team. Commit .semvec/ to git. Each developer pulls the same LiteralCache and CodingEngine state. Conflicts on .semvec/state.json resolve like any other JSON merge — Semvec's checksum check rejects tampered snapshots, so a bad merge surfaces as a ValueError on next load_state() rather than as silent state drift.

Where to next

  • semvec.coding API reference — full tool semantics and the underlying CodingEngine class.
  • Coding (overview) — the three coding-stack usage paths (MCP, in-process, REST API) and when to pick which.
  • Cursor guide — the same MCP server, no lifecycle hooks.
  • Embedders — pick the right model for your repo language and domain.
  • Licensing — Pro / Enterprise tier bypasses the community rate limits, useful for batch ingestion of large monorepos.