Quickstart

A one-page tour of the semvec public API. Every snippet runs in a bare environment after pip install "semvec[cortex,coding]" sentence-transformers.

1. Core PSS state

import numpy as np
from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(dimension=384))

# `conversation` is any iterable of (text, embedding) pairs, e.g. texts
# paired with 384-dimensional sentence-transformers embeddings.
for text, embedding in conversation:
    result = state.update(embedding, text)
    print(
        f"phase={result['phase']:14} "
        f"similarity={result['similarity']:.3f} "
        f"beta={result['beta']:.3f} "
        f"norm={result['norm']:.3f}"
    )

# Serialise a full checkpoint (SHA-256 integrity-checked, JSON-safe)
checkpoint = state.to_dict()
restored = SemvecState.from_dict(checkpoint)
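Because the checkpoint dict is JSON-safe, persisting it to disk is plain json round-tripping. A minimal sketch, with a stand-in dict in place of a real `state.to_dict()` result (the helper names are illustrative, not part of semvec):

```python
import json
import pathlib


def save_checkpoint(checkpoint: dict, path: str) -> None:
    """Write a JSON-safe checkpoint dict to disk."""
    pathlib.Path(path).write_text(json.dumps(checkpoint))


def load_checkpoint(path: str) -> dict:
    """Read a checkpoint dict back; pass the result to SemvecState.from_dict."""
    return json.loads(pathlib.Path(path).read_text())
```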

See the Core API reference for every available metric and method.

2. Token-reduced LLM context

from semvec.token_reduction import SemvecStateSerializer

serializer = SemvecStateSerializer()
context = serializer.serialize(state, query_text="what did we decide about auth?")
# 150–350-token compact string suitable for an LLM system prompt.
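The compact string slots into an ordinary chat-completions payload as system-prompt context. A sketch of that wiring, with a placeholder string standing in for the serializer output (`build_messages` is an illustrative helper, not part of semvec):

```python
def build_messages(context: str, user_question: str) -> list[dict]:
    """Assemble an OpenAI-style messages list with the PSS context up front."""
    return [
        {"role": "system", "content": f"Conversation state:\n{context}"},
        {"role": "user", "content": user_question},
    ]


messages = build_messages(
    "phase=consolidation similarity=0.81 ...",  # placeholder for `context`
    "what did we decide about auth?",
)
```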

3. Drop-in chat proxy

SemvecChatProxy wraps any callable LLM behind PSS-compressed context and tracks per-turn token accounting.

from semvec.token_reduction import SemvecChatProxy, create_llm_client

llm = create_llm_client("openai")  # reads OPENAI_BASE_URL/MODEL/API_KEY from env
proxy = SemvecChatProxy(llm_call=llm, system_prompt="You are a helpful assistant.")

for question in ["summarise Q3", "compare with Q2", "what was the biggest miss?"]:
    result = proxy.chat(question)
    print(f"turn {result.turn_number} ({result.phase}): {result.response}")

print(proxy.get_summary())

Explicit embedder required

SemvecChatProxy, CodingEngine, LongMemEvalRunner, and SemvecAgent.process_input all require an explicit embedder. There is no hash-based fallback. Pass embedding_service= / embedder= with any object that exposes get_embedding(text) -> np.ndarray and get_dimension() -> int, or install sentence-transformers and let the class build one from SemvecConfig.model_name + SemvecConfig.device.
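Any object satisfying that two-method protocol will do. A minimal sketch of a custom embedder using a fixed random projection as a deterministic stand-in for a real model (the class name ToyEmbedder is illustrative only; in practice you would wrap a sentence-transformers model):

```python
import numpy as np


class ToyEmbedder:
    """Deterministic stand-in satisfying the embedder protocol:
    get_embedding(text) -> np.ndarray and get_dimension() -> int.
    """

    def __init__(self, dimension: int = 384, seed: int = 0):
        self._dimension = dimension
        # Fixed projection matrix so the same text always maps to the same vector.
        rng = np.random.default_rng(seed)
        self._proj = rng.standard_normal((256, dimension))

    def get_dimension(self) -> int:
        return self._dimension

    def get_embedding(self, text: str) -> np.ndarray:
        # Bag-of-bytes counts projected into the target dimension, L2-normalised.
        counts = np.bincount(
            np.frombuffer(text.encode("utf-8"), dtype=np.uint8), minlength=256
        ).astype(np.float64)
        vec = counts @ self._proj
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec
```

Pass an instance as embedder=ToyEmbedder() for offline smoke tests; for real use, install sentence-transformers and let the class build an embedder from SemvecConfig.model_name + SemvecConfig.device as described above.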

4. Multi-agent coordination (Cortex)

from semvec.cortex import SemvecAgentNetwork, AttentionAggregation

network = SemvecAgentNetwork(aggregation_strategy=AttentionAggregation())
network.add_local_instance("analyst")
network.add_local_instance("planner")
network.process_input("analyst", "quarterly revenue is up 23%")
state = network.get_network_state()
print(f"active agents: {state['active_instances']}/{state['total_instances']}")

See Cortex API for the full set of aggregation strategies and the ConsensusEngine.

5. Coding-agent compaction

from semvec.coding import CodingEngine

engine = CodingEngine(state_dir="~/.semvec/project-x", embedder=my_embedder)
engine.ingest_transcript("path/to/claude_code_session.jsonl")
context = engine.get_compacted_context(
    "implement password reset flow",
    invariants=["never log plaintext passwords"],
)

6. Claude Code integration (MCP + hooks)

Add to .claude/settings.json:

{
  "mcpServers": {
    "semvec": {
      "command": "python",
      "args": ["-m", "semvec.coding.mcp_server"],
      "env": {
        "SEMVEC_STATE_DIR": ".semvec",
        "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2"
      }
    }
  },
  "hooks": {
    "PreCompact":  [{"command": "python -m semvec.coding.hooks.pre_compact",  "timeout": 30000}],
    "SessionStart":[{"command": "python -m semvec.coding.hooks.session_start", "timeout": 10000}]
  }
}

See Coding API for the six MCP tools and their argument shapes.

7. LongMemEval benchmark

.venv/bin/python -m semvec.benchmarks.longmemeval \
    --variant S --multi-pss --temperature 0.0 \
    --embed-device cuda \
    --per-type 10 --n-judges 3 \
    --output results/semvec_full.json

Flags mirror the pss reference. See Benchmarks overview.

Next steps