
REST API (semvec[api])

The optional semvec[api] extra ships a FastAPI-based HTTP service that exposes every feature delta of the Rust core plus a 4-layer multi-agent coordination stack. It is auth-gated by the bundled Ed25519 JWT licensing system, using the same JWT already used for in-process licensing. There is no password store and no separate API-key table.

pip install "semvec[api]"
semvec serve --host 0.0.0.0 --port 8080
# or run uvicorn directly against the app factory
python -m uvicorn semvec.api:create_app --factory --port 8080

Auth

Send the license JWT via either header:

Authorization: Bearer eyJhbGciOiJFZERTQSI...
# or
X-API-Key: eyJhbGciOiJFZERTQSI...

For local development, set SEMVEC_ALLOW_ANONYMOUS=1 to bypass auth entirely; every request is then treated as an anonymous community-tier call.
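
The two header forms above carry the same token, so a client only needs to pick one. A minimal sketch of a helper that builds either form; `auth_headers` is a hypothetical name used for illustration and is not part of semvec.

```python
def auth_headers(token: str, use_api_key: bool = False) -> dict:
    """Return the request headers that carry the license JWT.

    Hypothetical helper: semvec does not ship this function.
    """
    if not token:
        # An empty token only works if the server runs with SEMVEC_ALLOW_ANONYMOUS=1.
        raise ValueError("license token is empty")
    if use_api_key:
        return {"X-API-Key": token}
    return {"Authorization": f"Bearer {token}"}
```

Either dictionary can be passed as the `headers` argument of an HTTP client such as httpx.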

Persistence

DATABASE_URL controls the SQLAlchemy engine. Default: sqlite:///semvec.db. Postgres is supported by setting e.g. DATABASE_URL=postgresql://user:pw@host/db. The hot semantic state lives in memory (SessionManager); the database stores only session/cluster/member/region/audit metadata.
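
The environment variables mentioned so far can be assembled before launching the server from a script. The following is a sketch only: `server_env` and `SERVE_CMD` are hypothetical names, and the command simply mirrors the `semvec serve` invocation shown earlier.

```python
import os

# Mirrors the CLI invocation from the install section.
SERVE_CMD = ["semvec", "serve", "--host", "0.0.0.0", "--port", "8080"]

def server_env(database_url=None, allow_anonymous=False, base_env=None):
    """Build the environment for the server process (hypothetical helper).

    Leaving database_url as None keeps the documented default, sqlite:///semvec.db.
    """
    env = dict(os.environ if base_env is None else base_env)
    if database_url is not None:
        env["DATABASE_URL"] = database_url
    if allow_anonymous:
        # Local development only: bypasses auth entirely.
        env["SEMVEC_ALLOW_ANONYMOUS"] = "1"
    return env

# To launch (not executed here):
#   import subprocess
#   subprocess.run(SERVE_CMD, env=server_env("postgresql://user:pw@host/db"), check=True)
```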

Endpoint Overview

Layer 1 — Agent Sessions

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /v1/health | liveness + active-session count (no auth) |
| POST | /v1/run | single-turn run: retrieve context + optionally store previous answer |
| POST | /v1/store | learn from an LLM response |
| POST | /v1/session/create | explicit session creation (optional template + policy vectors) |
| DELETE | /v1/session/{id} | delete a session |
| GET | /v1/metrics/{id} | full metrics snapshot |
| GET | /v1/state/context?session_id=&top_k=&full_first= | retrieve relevant memories; each item carries a memory_hash and a truncated flag, with texts clipped to 500 chars. With full_first=true the top hit is returned untruncated. |
| GET | /v1/session/{id}/memories/{memory_hash} | expand a single memory to full text + importance + access_count + timestamp |
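
Because context texts are clipped, a client will often fetch the context first and then expand only the truncated items. A rough sketch of the URL construction, assuming the item fields `memory_hash` and `truncated` named above; the helper names, the default `top_k=5`, and the idea that the caller passes the list of items directly are assumptions, since the response envelope is not specified here.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080/v1"  # matches the serve command shown earlier

def context_url(session_id, top_k=5, full_first=False):
    """URL for GET /v1/state/context (top_k default is an assumption)."""
    query = urlencode({
        "session_id": session_id,
        "top_k": top_k,
        "full_first": str(full_first).lower(),  # "true" / "false"
    })
    return f"{BASE}/state/context?{query}"

def memory_url(session_id, memory_hash):
    """URL for GET /v1/session/{id}/memories/{memory_hash}."""
    return f"{BASE}/session/{quote(session_id, safe='')}/memories/{quote(memory_hash, safe='')}"

def urls_to_expand(session_id, items):
    """Expansion URLs for every context item flagged as truncated."""
    return [memory_url(session_id, item["memory_hash"])
            for item in items if item.get("truncated")]
```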

Layer 1b — Session Control (feature deltas 16–26)

| Method | Path | Feature |
| --- | --- | --- |
| POST/DELETE | /v1/session/{id}/trigger | resonance triggers (keyword + embedding) |
| POST | /v1/session/{id}/anchor | drift anchors |
| GET | /v1/session/{id}/anchor_score | anchor score + drift threshold |
| PUT | /v1/session/{id}/isolation | isolation filter (OPEN/FILTER/QUARANTINE/LOCKDOWN) |
| POST | /v1/session/{id}/isolation/release | release quarantine |
| POST | /v1/session/{id}/memory | synthetic memory injection |
| GET | /v1/session/{id}/export | serialize with SHA-256 checksum |
| POST | /v1/session/{id}/import | restore from exported dict |
| POST | /v1/session/{id}/verify | behavioral consistency check |

Layer 2 — Cluster

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/cluster/ | create cluster (201); aggregation_mode = weighted_average or attention; coupling_factor ∈ [0, 1] |
| GET | /v1/cluster/ | list owned clusters |
| GET | /v1/cluster/{id} | state + aggregate_vector |
| DELETE | /v1/cluster/{id} | tears down backing session too |
| POST | /v1/cluster/{id}/store | seed Q&A into shared session |
| POST | /v1/cluster/{id}/run | query cluster session (cluster_id == session_id) |
| POST | /v1/cluster/{id}/feedback | blend aggregate back into members |
| POST/DELETE | /v1/cluster/{id}/members / {sid} | membership CRUD |
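
To make the weighted_average mode and the feedback step more concrete, here is an illustrative sketch. The exact formulas used by the server are not documented on this page; the weighted mean and the linear interpolation by coupling_factor below are assumptions chosen for illustration, and both function names are hypothetical.

```python
def weighted_average(vectors, weights):
    """Weighted mean of member vectors (assumed form of weighted_average mode)."""
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]

def blend(member, aggregate, coupling_factor):
    """Pull a member vector toward the aggregate (assumed feedback rule).

    coupling_factor = 0 leaves the member unchanged; 1 replaces it with the aggregate.
    """
    if not 0.0 <= coupling_factor <= 1.0:
        raise ValueError("coupling_factor must lie in [0, 1]")
    c = coupling_factor
    return [(1 - c) * m + c * a for m, a in zip(member, aggregate)]
```

Under these assumptions, two members `[1, 0]` and `[0, 1]` with weights 1 and 3 aggregate to `[0.25, 0.75]`, and a coupling_factor of 0.5 moves each member halfway toward that aggregate.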

Layer 3 — Region (Consensus)

Method Path Purpose
POST /v1/region/ create region (201); consensus_threshold, vote_window_seconds
GET /v1/region/ list owned
GET /v1/region/{id} state + last_realignment + recent drift events
DELETE /v1/region/{id} delete region + meta-session
POST/DELETE /v1/region/{id}/clusters / {cid} attach/detach clusters
GET /v1/region/{id}/events?limit=20 recent drift events

Drift events are published internally when /run detects drift on a cluster-backing session. The DriftEventBus fans out to per-region callbacks; a realignment fires when the fraction of member clusters that have voted within the rolling window (vote_window_seconds) exceeds the region's consensus_threshold.
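
The rolling-window vote can be pictured with a small in-memory model. This is a sketch of the rule as described, not the server's implementation: the `RegionVotes` class is hypothetical, each drift event is treated as one vote from its cluster, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque

class RegionVotes:
    """Toy model of a region's drift-vote window (hypothetical class)."""

    def __init__(self, member_count, consensus_threshold, vote_window_seconds):
        if member_count <= 0:
            raise ValueError("member_count must be positive")
        self.member_count = member_count
        self.threshold = consensus_threshold
        self.window = vote_window_seconds
        self._votes = deque()  # (timestamp, cluster_id), oldest first

    def record_drift(self, cluster_id, now):
        """Record a drift vote; return True if a realignment should fire."""
        self._votes.append((now, cluster_id))
        # Drop votes that have fallen out of the rolling window.
        while self._votes and now - self._votes[0][0] > self.window:
            self._votes.popleft()
        voters = {cid for _, cid in self._votes}  # count each cluster once
        return len(voters) / self.member_count > self.threshold
```

With four member clusters, a threshold of 0.5 and a 60-second window, two votes (exactly 50%) do not trigger a realignment, while a third vote inside the window does.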

Layer 4 — Global Observer

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/observer/ | create or return existing (idempotent per license subject) |
| GET | /v1/observer/summary | observer state incl. anomaly_count |
| POST | /v1/observer/sample | trigger manual sample |
| GET | /v1/observer/anomalies | recent anomalies (newest first) |
| DELETE | /v1/observer/anomalies | clear anomaly log |
| POST/DELETE | /v1/observer/regions / {rid} | register / unregister region |

Anomaly types: cross_cluster_convergence (3+ clusters across ≥2 regions converged to the same non-initialization phase), systemic_drift (>50% of observed clusters show drift indicators), cluster_divergence (cluster interaction_count >3× region average).
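
The three anomaly rules can be written down as plain predicates over a snapshot of observed clusters. The sketch below only restates the thresholds given above; the `ClusterSnapshot` fields, the phase label "initialization", and the choice to include a cluster in its own region average are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ClusterSnapshot:  # hypothetical shape of one observed cluster
    cluster_id: str
    region_id: str
    phase: str
    drifting: bool
    interaction_count: int

def detect_anomalies(clusters, init_phase="initialization"):
    """Return (anomaly_type, subject) tuples for the three documented rules."""
    anomalies = []
    if not clusters:
        return anomalies
    # cross_cluster_convergence: 3+ clusters across >= 2 regions share a non-init phase.
    by_phase = defaultdict(list)
    for c in clusters:
        if c.phase != init_phase:
            by_phase[c.phase].append(c)
    for phase, group in by_phase.items():
        if len(group) >= 3 and len({c.region_id for c in group}) >= 2:
            anomalies.append(("cross_cluster_convergence", phase))
    # systemic_drift: more than 50% of observed clusters show drift.
    if sum(c.drifting for c in clusters) / len(clusters) > 0.5:
        anomalies.append(("systemic_drift", None))
    # cluster_divergence: interaction_count exceeds 3x the region average.
    by_region = defaultdict(list)
    for c in clusters:
        by_region[c.region_id].append(c)
    for members in by_region.values():
        average = sum(c.interaction_count for c in members) / len(members)
        for c in members:
            if c.interaction_count > 3 * average:
                anomalies.append(("cluster_divergence", c.cluster_id))
    return anomalies
```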

Layer 5 — Network (feature deltas 27, 29, 30)

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/network/transfer | semantic delta-vector transfer |
| POST | /v1/network/users/switch | switch user partition (saves current, activates target) |
| GET | /v1/network/users/active | currently active user |
| POST | /v1/network/users/{id}/serialize | serialize user partition |
| POST | /v1/network/consensus | propose consensus vector |
| GET | /v1/network/consensus/trust | current trust scores per instance |

Literal cache

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /v1/session/{id}/entities | store a verbatim code entity (201) |
| GET | /v1/session/{id}/entities?q=&max_results= | list / keyword-query |
| DELETE | /v1/session/{id}/entities/{key:path} | remove entity |
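
The `{key:path}` parameter on the DELETE route suggests that entity keys may contain slashes, so a client should percent-encode the key while leaving `/` intact. A small sketch of that URL construction; the helper names, the sample key format, and the default `max_results=10` are hypothetical.

```python
from urllib.parse import quote, urlencode

def entity_url(base, session_id, key):
    """URL for DELETE /v1/session/{id}/entities/{key:path}; slashes in key are kept."""
    return f"{base}/session/{quote(session_id, safe='')}/entities/{quote(key, safe='/')}"

def entity_query_url(base, session_id, q, max_results=10):
    """URL for GET /v1/session/{id}/entities with a keyword query."""
    query = urlencode({"q": q, "max_results": max_results})
    return f"{base}/session/{quote(session_id, safe='')}/entities?{query}"
```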

Observability

/metrics exposes Prometheus metrics behind Basic Auth (METRICS_USER / METRICS_PASSWORD env vars). A request middleware collects semvec_requests_total{method, endpoint, status} and semvec_request_duration_seconds{method, endpoint} automatically.
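
For a quick manual scrape outside Prometheus, the Basic Auth header can be built by hand. A minimal sketch using only the standard library; it assumes /metrics is served at the server root rather than under /v1, and `metrics_request` is a hypothetical helper name.

```python
import base64
import urllib.request

def metrics_request(base_url, user, password):
    """Build (but do not send) a Basic Auth request for the /metrics endpoint."""
    raw = f"{user}:{password}".encode("utf-8")
    credentials = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/metrics",
        headers={"Authorization": f"Basic {credentials}"},
    )

# To fetch against a running server (not executed here):
#   req = metrics_request("http://localhost:8080", "<METRICS_USER>", "<METRICS_PASSWORD>")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```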

Error handling

All error responses carry a JSON body with a single detail field:

{"detail": "Session not found"}
| Status | When it fires |
| --- | --- |
| 400 | Malformed state-import payload (/v1/session/{id}/import), unknown aggregation_mode on cluster creation, unknown entity kind on literal-cache store. |
| 401 | Missing or invalid license JWT on any route except /v1/health. Also /metrics without valid Basic Auth. |
| 402 | License JWT signature is valid but the token is expired. Includes a "renew at …" hint pointing at https://www.semvec.io. |
| 404 | Session / cluster / region / observer / entity / memory not found, or caller's license subject does not own the resource (the server does not leak resource existence across tenants). |
| 422 | Pydantic validation failure: missing or out-of-range request field. The body conforms to FastAPI's standard {"detail": [{"loc": [...], "msg": "...", "type": "..."}]} shape. |
| 429 | Rate-limit exceeded. Response carries Retry-After: 60. Community tier: 5 QPS sustained / 50 burst; Pro: 200 / 2000; Enterprise: unthrottled. |
| 500 | Unhandled server error, logged via uvicorn access log with request ID. Investigate server logs. |
| 503 | /metrics endpoint hit without METRICS_USER / METRICS_PASSWORD env vars configured. |

The detail string on 402 includes the upgrade URL; on 401 it distinguishes between "Missing license token" and "Invalid license: …"; on 404 it tells you whether the session or the specific sub-resource was missing.
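
On the client side, the status codes above map naturally onto a handful of reactions. The following sketch shows one way to do that mapping; the action labels and the `next_action` function are hypothetical, and `headers` is assumed to be a plain dict with the exact key "Retry-After".

```python
def next_action(status, headers=None, body=None):
    """Map a response to an (action, info) pair, following the status codes above."""
    headers = headers or {}
    detail = body.get("detail") if isinstance(body, dict) else None
    if status < 400:
        return ("ok", None)
    if status == 429:
        # The server sends Retry-After: 60; fall back to 60 s if it is absent.
        return ("retry_after", float(headers.get("Retry-After", 60)))
    if status == 401:
        return ("fix_token", detail)
    if status == 402:
        return ("renew_license", detail)
    if status == 404:
        return ("not_found", detail)
    if status == 422:
        # FastAPI validation errors: detail is a list of {"loc", "msg", "type"}.
        fields = [".".join(str(part) for part in err.get("loc", []))
                  for err in (detail or [])]
        return ("fix_request", fields)
    if status >= 500:
        return ("server_error", detail)
    return ("client_error", detail)
```

A caller would typically sleep for the returned number of seconds on "retry_after" and surface the detail string to the operator for the other actions.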

Minimal quickstart

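import httpx

# The license JWT goes in X-API-Key (or Authorization: Bearer); token truncated here.
client = httpx.Client(
    base_url="http://localhost:8080/v1",
    headers={"X-API-Key": "eyJhbGciOiJFZERTQSI..."},
)

# Retrieve context for the user message; the response carries the session ID.
resp = client.post("/run", json={"message": "What is Kubernetes?"})
resp.raise_for_status()  # surface 401/402/429 etc. instead of failing on a missing key
run = resp.json()
sid = run["session_id"]

# Call your LLM with run["context"] as the system prompt, then store its answer.
store = client.post("/store", json={"session_id": sid, "response": "Kubernetes..."})
store.raise_for_status()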