Licensing
Tier selection
| Use case | Tier | Notes |
|---|---|---|
| Evaluation, prototyping, open-source side projects | Community | No licence key required. Rate-limited per SemvecState instance. |
| Single product / single deployment, production traffic | Pro | Per-seat licence. Higher throughput; full feature surface. |
| Multi-tenant / multi-deployment, B2B redistribution | Enterprise | Per-deployment licence. SLA, indemnification, dedicated support. |
| Regulated workloads (audit, retention, signed deletion) | Pro or Enterprise | Compliance pack is unavailable on Community. |
Tiers at a glance
| Tier | Rate limit | Backends | Retrieval modes | Suitable for |
|---|---|---|---|---|
| Community (no key) | ~5 calls/sec sustained, 50 burst | In-memory only | Base | One human user; bursty test loads. |
| Pro | ~200 calls/sec sustained, 2000 burst | All | Extended | Production service, single team. |
| Enterprise | Unthrottled | All | All | Multi-tenant, regulated, distributed. |
How the rate limit applies — library, REST, Cortex
The tier numbers above are enforced per SemvecState instance by
an in-process token bucket inside the Rust core (see
How the rate limiter works
below). The bucket applies uniformly across every surface that
drives a SemvecState:
- Python library: every state.update(...) and state.calculate_*(...) call consumes one token.
- REST API (semvec serve): every POST /v1/run, POST /v1/store, POST /v1/session/{id}/*, POST /v1/cluster/.../store, POST /v1/network/peer-transfer (and any other endpoint that touches a session) consumes one token from that session's bucket. An exhausted bucket surfaces as HTTP 429 with Retry-After in seconds; this is the same RateLimitError the library would raise, mapped by the FastAPI exception handler.
- Cortex: each SemvecAgent owns a SemvecState, so per-agent buckets apply; aggregated network operations consume tokens from each participating agent's bucket.
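On the REST surface, a client can honour a 429 response by waiting for the advertised Retry-After interval before retrying. The sketch below is a minimal illustration and not part of semvec: `retry_after_seconds`, `call_with_retry`, and the `send` callable are hypothetical names, and `send` stands in for whatever HTTP client issues the request and returns a `(status, headers, body)` tuple. It assumes Retry-After carries a plain number of seconds, as described above.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, body) as returned by the caller's HTTP client.
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0,
                        max_delay: float = 60.0) -> float:
    """Read Retry-After (in seconds) from response headers, case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                # Clamp to [0, max_delay] so a bad header cannot stall the client.
                return min(max(0.0, float(value)), max_delay)
            except ValueError:
                break  # unparseable value: fall back to the default
    return default


def call_with_retry(send: Callable[[], Response], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1]))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the helper testable without real delays. The last response is returned even if it is still a 429, so the caller decides how to surface the failure.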
What the bucket does not do: it is not an HTTP-level cross-session
or cross-process throttle. A client opening N parallel sessions (or
running N worker processes) gets N × the per-state quota — each
SemvecState carries its own bucket. For multi-tenant DoS protection
or per-JWT-subject HTTP rate-limiting, terminate that at a reverse
proxy (nginx limit_req, Envoy local_ratelimit, or an API gateway
keyed on the JWT sub claim).
Tier-specific behaviour: Community uses the 5 QPS / 50 burst
bucket plus a sliding-window probe-defence layer (100/s on
update, 30/s on calculate_*) intended for adversarial workloads;
legitimate Community callers never reach the second layer because the
bucket caps first. Pro uses a 200 QPS / 2000 burst bucket without the
second layer. Enterprise is fully unthrottled — no bucket, no
sliding window. The compliance event-replay path bypasses both
layers regardless of tier.
Workload fit
| Workload | Typical calls/sec | Fits Community |
|---|---|---|
| Single conversational user (one turn per 5–30 s) | 0.05 – 0.2 | yes |
| Coding-agent MCP server (per file save) | ~0.1 | yes |
| 50-call quickstart smoke test | inside burst | yes |
| pytest suite (20 tests × 5 calls) | 50 burst, then ~5/s sustained | yes |
| Production service, concurrent users | 10 – 50 | no — Pro |
| LOCOMO benchmark replay (~25 k calls) | sustained > 5/s | no — batch, shard, or Pro |
For batch workloads use update_batch(), shard across multiple
SemvecState instances (each has its own bucket), or move to Pro /
Enterprise.
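To judge whether sharding is enough or a tier change is needed, it helps to estimate how long a batch takes under a given bucket. The helper below is a rough sketch under stated assumptions: `seconds_to_drain` is a hypothetical name, every bucket is assumed to start full, calls are assumed to split evenly across shards, and per-call latency is ignored. The tier rates are approximate in the table above, so the result is an estimate of the lower bound.

```python
import math


def seconds_to_drain(total_calls: int, rate_per_sec: float, burst: int,
                     shards: int = 1) -> float:
    """Estimate wall-clock seconds to issue `total_calls` through token buckets.

    Assumes each shard owns one bucket that starts full, so the first `burst`
    calls per shard are free and the rest are paced at `rate_per_sec`.
    """
    if total_calls < 0 or rate_per_sec <= 0 or burst < 0 or shards < 1:
        raise ValueError("invalid workload or bucket parameters")
    per_shard = math.ceil(total_calls / shards)
    return max(0, per_shard - burst) / rate_per_sec


# LOCOMO-sized replay (~25 k calls) using the tier numbers from the table above.
community = seconds_to_drain(25_000, rate_per_sec=5, burst=50)
community_sharded = seconds_to_drain(25_000, rate_per_sec=5, burst=50, shards=10)
pro = seconds_to_drain(25_000, rate_per_sec=200, burst=2000)
print(community, community_sharded, pro)
```

Under these assumptions a single Community state needs about 4,990 seconds (roughly 83 minutes), ten Community shards need about 490 seconds, and a single Pro state needs about 115 seconds.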
Activating a license
Set the environment variable before importing semvec:
Keys are Ed25519-signed JWTs with a 30-day TTL. The verifying public key is baked into the wheel at build time, so verification works fully offline.
How the rate limiter works (developers)
A single bucket per SemvecState covers both update() and the
on-demand calculate_* aggregate methods. The throughput budget is the
combined operations-per-second on that state:
- state.update(emb, text) consumes one token.
- state.calculate_fsm(...) / calculate_metrics(...) / calculate_advanced_metrics(...) each consume one token.
- The bucket refills at the tier's sustained rate up to the burst cap.
When the bucket is empty, the next call raises RateLimitError with a
retry_after hint. A second per-state safety layer applies on the
Community tier only and is intended for adversarial workloads; legitimate
Community callers never hit it because the bucket caps first. The
compliance event-replay path bypasses both layers (replay must not lock
itself out re-folding its own log).
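The bucket behaviour described above can be modelled in a few lines. The real limiter lives in the Rust core, so this Python class is only an illustrative sketch of the general token-bucket technique: `TokenBucket` and `try_acquire` are hypothetical names, and returning a wait time in seconds is an assumption standing in for the retry_after hint.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/sec up to `burst`."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be positive and burst at least 1")
        self.rate = float(rate)
        self.burst = float(burst)
        self._clock = clock
        self._tokens = float(burst)  # assumption: the bucket starts full
        self._last = clock()

    def _refill(self) -> None:
        # Credit tokens for the time elapsed since the last call, capped at burst.
        now = self._clock()
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self) -> float:
        """Take one token and return 0.0, or return seconds until one is available."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return 0.0
        return (1.0 - self._tokens) / self.rate


# Community-like parameters: 5 tokens/sec sustained, 50 burst.
bucket = TokenBucket(rate=5, burst=50)
wait = bucket.try_acquire()  # 0.0 while tokens remain
```

Because one bucket is shared by every operation on a state, a call to update() and a call to a calculate_* method would both go through the same `try_acquire` in this model.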
Claims schema
- products: array of strings naming the products this key unlocks.
- tier: "Community", "Pro", or "Enterprise".
- exp: Unix timestamp (seconds) when the key expires.
Missing product, wrong signature, and expired timestamps all produce descriptive errors.
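Since keys are JWTs, the claims can be inspected with the standard library alone, for example to alert before the 30-day TTL lapses. The sketch below is for inspection only and does not verify the signature; verification remains the job of semvec itself. `decode_claims` and `seconds_until_expiry` are hypothetical helper names, and the code assumes the claims sit in the standard base64url-encoded JWT payload segment.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def decode_claims(token: str) -> Dict[str, Any]:
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments drop base64 padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(claims: Dict[str, Any], now: Optional[float] = None) -> float:
    """Seconds until the `exp` claim; a negative value means the key has expired."""
    current = time.time() if now is None else now
    return claims["exp"] - current
```

A monitoring job could call `seconds_until_expiry(decode_claims(key))` and warn when the remaining time drops below a chosen threshold.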
Error handling
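```python
import logging
import time

from semvec import RateLimitError, LicenseExpiredError

logger = logging.getLogger(__name__)

# `state`, `embedding`, and `text` are assumed to be defined by the caller.
try:
    result = state.update(embedding, text)
except RateLimitError as e:
    # e.retry_after is a datetime.timedelta; wait it out, then retry once.
    time.sleep(e.retry_after.total_seconds())
    result = state.update(embedding, text)
except LicenseExpiredError as e:
    logger.warning("semvec license expired; renew at %s", e.upgrade_url)
    raise
```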
Both exceptions inherit from LicenseError, which inherits from the
base SemvecError. See Troubleshooting
for the full symptom table.
For regulated deployments
Need offline license validation or a custom public-key rotation
schedule? Contact vertrieb@versino.de for Enterprise options
including:
- Air-gapped license issuance
- Custom TTL policies
- Hardware-backed signing
- SBOM + provenance attestations