# Concepts & Glossary

## What update() returns
Every call to state.update(embedding, text) returns a metric dict and updates rolling histories on the state object (beta_history, similarity_history, …). You consume these to drive UI, dispatch, or logging.
| Key | What it means for your app | Typical range |
|---|---|---|
| similarity | How close the new input is to the current state, before the update. Low values mean the user just changed direction. | [-1, 1] |
| beta | How much of the previous state survives this turn. High = stable, low = absorbing aggressively. Treat as an opaque indicator. | (0, 1) |
| pattern_strength | How strongly retrieved memories pulled the new state. Higher = more memory influence. | [0, ~1.5] |
| norm | L2 norm of the state vector after the update. Stays bounded automatically. | [0, 1.2] |
| fsm | Stability score. Low = state is oscillating, high = converged. Useful for dispatching: gate expensive actions on fsm > 0.7. | [0, 1] |
| phase | One of six labels (see below). Use it to switch prompts, log session breakpoints, or skip retrieval when the state is still warming up. | enum |
| topic_switch | Magnitude of a detected topic switch. Non-zero = the user just pivoted. | [0, 1] |
| novelty_score | How surprising the new input was. High novelty boosts attention to the input on subsequent turns. | [0, 1] |
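These metrics lend themselves to simple dispatch rules. A minimal sketch of one such rule set; the thresholds, the action names, and the `dispatch` helper itself are illustrative, not part of the library:

```python
# Hypothetical dispatch helper driven by the metric dict from update().
# Thresholds and action names are illustrative, not library defaults.

def dispatch(metrics: dict) -> str:
    if metrics["topic_switch"] > 0:       # user just pivoted
        return "refresh-context"
    if metrics["novelty_score"] > 0.8:    # surprising input, widen the net
        return "widen-retrieval"
    if metrics["fsm"] > 0.7:              # converged: safe to run expensive actions
        return "expensive-action"
    return "cheap-turn"
```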
## The six conversation phases
Phase detection is fully automatic — you do not configure it, you just consume it. Read it from result["phase"] after every update.
| Phase | What you might do when you see it |
|---|---|
| initialization | Skip "summarise prior work" prompts — there is none yet. |
| exploration | Lean on the LLM's general knowledge; retrieval has little to add. |
| convergence | Start surfacing relevant prior context aggressively. |
| resonance | Cheap turn — short context block is fine. |
| stability | Promote checkpoint persistence; this is a good moment to save. |
| instability | Consider injecting drift anchors or letting the user clarify. |
The detector decides phases internally from interaction history. Treat the output as an opaque classification; do not assume any specific transition rule.
Phase changes are tracked in state.phase_history. The current phase is always state.phase_detector.current_phase.
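The table above translates directly into a lookup. A sketch of phase-driven prompt switching; the strategy names and the `strategy_for` helper are hypothetical, only the six phase labels come from the library:

```python
# Map each detected phase to a prompt strategy. Strategy names are
# hypothetical; the six phase labels are the documented ones.
PHASE_STRATEGY = {
    "initialization": "skip-summary",
    "exploration": "general-knowledge",
    "convergence": "aggressive-retrieval",
    "resonance": "short-context",
    "stability": "save-checkpoint",
    "instability": "clarify-or-anchor",
}

def strategy_for(phase: str) -> str:
    # Fall back to a neutral strategy for unknown labels.
    return PHASE_STRATEGY.get(phase, "general-knowledge")
```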
## Memory tiers
state.memory is a three-tier MultiResolutionMemory:
| Tier | Default capacity | Promotion rule |
|---|---|---|
| Short-term | 15 slots | every turn lands here |
| Medium-term | 50 slots | promoted on access + importance |
| Long-term | 200 slots | consolidated clusters; built up gradually |
Capacities are configurable via SemvecConfig(short_term_size=…, medium_term_size=…, long_term_size=…).
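For example, to set the documented defaults explicitly (configuration fragment; the `SemvecConfig` import path is assumed from the docs and the snippet is not meant to run standalone):

```python
from semvec import SemvecConfig  # import path assumed

config = SemvecConfig(
    short_term_size=15,    # defaults from the table above
    medium_term_size=50,
    long_term_size=200,
)
```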
## Selective forgetting
When a tier overflows, Semvec keeps memories with the higher retention score — a composite scoring function that takes importance, recency, and access count into account. A frequently-accessed older memory therefore survives over a never-touched newer one. The exact weighting is tuned for production workloads and is not user-configurable.
use_selective_forgetting=False falls back to FIFO if you genuinely want
pure recency.
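To make the idea concrete, here is a purely conceptual retention-score sketch. The function, its weights, and its decay curve are invented for illustration; Semvec's actual composite is internal and uses a different, non-configurable weighting:

```python
import time

def retention_score(importance, last_access_ts, access_count, now=None):
    # Conceptual sketch only: the real composite is internal to Semvec.
    now = time.time() if now is None else now
    recency = 1.0 / (1.0 + (now - last_access_ts) / 3600.0)  # decays per hour
    usage = min(access_count / 10.0, 1.0)                    # saturates at 10 accesses
    return 0.5 * importance + 0.15 * recency + 0.35 * usage
```

With these illustrative weights, a frequently accessed two-hour-old memory outscores a never-touched fresh one, matching the behaviour described above.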
## NegativeAttractor

NegativeAttractor is an internal stability safeguard against pathological state drift. It is configurable via negative_attractor_penalty (default 0.5); only change it after running benchmarks.
## Retrieval
state.memory.get_relevant_memories(query_vec, top_k=N) returns the most
relevant memories across all three tiers. Scoring takes cosine similarity
and per-tier weighting into account, with optional anchor / trigger boosts.
| Knob | Default | What it does |
|---|---|---|
| short_term_weight | 1.0 | Scoring weight for the most recent tier |
| medium_term_weight | 0.95 | Medium-term tier — almost flat with short-term |
| long_term_weight | 0.9 | Long-term tier — kept competitive so older domains stay reachable |
| cluster_fallback_threshold | 0.85 | Controls retrieval breadth for uncertain matches. Higher values keep older domains reachable; lower values stay narrow. |
| anchor_retrieval_boost (α) | 0.6 | Scoring boost applied when registered anchors align with the candidate; tune in [0.1, 0.6] |
| trigger_retrieval_boost (γ) | 0.3 | Scoring boost applied when a registered ResonanceTrigger matches; tune in [0.1, 0.6] |
Anchor and trigger boosts are combined so that redundant matches do not double-count. Exact composition is implementation-defined; you only see the user-visible effect through retrieval order.
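The base scoring can be pictured as cosine similarity scaled by the tier weight. A sketch under that assumption; the anchor/trigger boosts and the cluster fallback are deliberately left out, since their composition is implementation-defined:

```python
import numpy as np

TIER_WEIGHTS = {"short": 1.0, "medium": 0.95, "long": 0.9}  # table defaults

def tier_weighted_score(query, candidate, tier):
    # Sketch of the base score: cosine similarity scaled by the tier
    # weight. Anchor/trigger boosts are not reproduced here.
    cos = float(np.dot(query, candidate)
                / (np.linalg.norm(query) * np.linalg.norm(candidate)))
    return TIER_WEIGHTS[tier] * cos
```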
## Anchors and triggers

Two complementary tools for shaping retrieval.

### Drift anchors

Reference embeddings that pull retrieval toward known domains:

```python
state.add_anchor(embed("SAP Business One Service Layer OData REST API"))
state.add_anchor(embed("Italian cuisine cooking pasta pizza"))
```
After a few turns, candidate memories that align with one of your anchors win the tie-break against generic phrases. Register one anchor per domain you care about. The current alignment score is exposed as state.anchor_score (mean cosine of state vs all anchors); when the score falls below drift_threshold, realignment begins over the next few turns. Exact realignment dynamics are implementation-defined.
auto_anchor_on_topic_switch=True (opt-in) snapshots semantic_state as a fresh anchor whenever a topic switch fires, capped by max_auto_anchors (default 8). Useful when your domain has clean topic boundaries; off by default because it tends to capture per-turn noise on real-world embeddings.
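The documented definition of the alignment score (mean cosine of the state vector against all anchors) can be sketched directly; the helper function itself is illustrative, not the library's implementation:

```python
import numpy as np

def anchor_score(state_vec, anchors):
    # Sketch of the documented definition: mean cosine of the state
    # vector against all registered anchors.
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sum(cos(state_vec, a) for a in anchors) / len(anchors)
```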
### Resonance triggers

Boost memories on a specific keyword or vector match:

```python
from semvec import ResonanceTrigger

state.add_resonance_trigger(ResonanceTrigger(
    keyword="security review",
    embedding=embed("security audit threat model"),
    threshold=0.7,
))
```
A trigger fires when either:
- the trigger's keyword appears as a substring in the input text, OR
- the cosine similarity between the input embedding and the trigger's embedding is ≥ threshold.
When a trigger fires, matched memories receive the trigger retrieval boost and the input is treated as high-salience for the current turn. Exact effect on the state update is implementation-defined.
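The OR rule above can be sketched as a standalone predicate. Case folding for the substring match is an assumption here; the library's exact matching behaviour may differ:

```python
import numpy as np

def trigger_fires(text, input_vec, keyword, trigger_vec, threshold=0.7):
    # Sketch of the documented OR rule: substring keyword match, or
    # cosine(input, trigger embedding) >= threshold. Case folding is
    # an assumption.
    if keyword.lower() in text.lower():
        return True
    cos = float(np.dot(input_vec, trigger_vec)
                / (np.linalg.norm(input_vec) * np.linalg.norm(trigger_vec)))
    return cos >= threshold
```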
### Choosing between them
| Goal | Use |
|---|---|
| Bias retrieval toward known domains, prototype-style | Anchors (one per domain) |
| Boost memories on a specific keyword or hard-match phrase | Triggers (keyword) |
| Boost memories whose embeddings are near a reference point but the user has no specific keyword | Triggers (embedding + threshold) |
| Both anchor-style and keyword-style signals | Anchors + Triggers — safe to combine; composition is implementation-defined |
When in doubt: start with anchors only and add triggers later if you
have a clear keyword or embedding cue separate from your anchor
prototypes. Defaults for both boosts are user-tunable on
SemvecConfig.
## Topic-switch detection
When enable_topic_switch=True (default), Semvec watches for the user
pivoting to a different topic. Every detected switch lands on
state.topic_switch_history with {timestamp, magnitude, phase,
auto_anchored} — bounded list, useful for diagnostics and UI cues
regardless of whether you opt in to auto-anchoring.
The detector exposes two coarse tunables on SemvecConfig (topic_switch_threshold, topic_switch_window). Defaults are calibrated for production workloads; treat them as opaque knobs and only adjust them if you observe the detector being consistently too sensitive or too insensitive on your data.
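A small diagnostics sketch over topic_switch_history entries, using the documented {timestamp, magnitude, phase, auto_anchored} shape; the `significant_switches` helper and its cutoff are illustrative, not library API:

```python
def significant_switches(history, min_magnitude=0.5):
    # Filter topic_switch_history entries (documented shape:
    # {timestamp, magnitude, phase, auto_anchored}) for UI cues.
    # The min_magnitude cutoff is an arbitrary illustration.
    return [e for e in history if e["magnitude"] >= min_magnitude]
```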
## Persistence
Two persistence formats; both round-trip the full state including memories, anchors, topic-switch history, and the entire LiteralCache.
| Format | Use for | Pros | Cons |
|---|---|---|---|
| to_dict() / from_dict() | Systems that only speak JSON | Human-readable, JSON-safe | Largest |
| to_bytes(compress=True) | Cold-storage checkpoints | ~2.4× smaller than JSON | Slowest (gzip cost) |
| to_bytes(compress=False) | Hot-path persistence | Same size as JSON, only ~1.9× slower than json.dumps | Binary (still self-describing + corruption-checked) |
Both formats include an integrity checksum; tampered snapshots raise StateCorruptionError on restore rather than corrupting silently.
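The checksum-then-payload pattern can be illustrated with a self-contained sketch. This is purely conceptual: Semvec's real wire format, compression layout, and StateCorruptionError are not reproduced here (a plain ValueError stands in):

```python
import gzip
import hashlib
import json

def snapshot(state_dict, compress=True):
    # Conceptual sketch of checksum-guarded persistence, not the
    # library's actual format.
    payload = json.dumps(state_dict, sort_keys=True).encode()
    blob = hashlib.sha256(payload).hexdigest().encode() + b"\n" + payload
    return gzip.compress(blob) if compress else blob

def restore(raw, compressed=True):
    blob = gzip.decompress(raw) if compressed else raw
    checksum, payload = blob.split(b"\n", 1)
    if hashlib.sha256(payload).hexdigest().encode() != checksum:
        raise ValueError("snapshot corrupted")  # stands in for StateCorruptionError
    return json.loads(payload)
```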
## The LiteralCache
state.literal_cache is a structured-memory layer for things that should survive verbatim across sessions: design decisions, invariants, recurring error patterns with fixes, per-checkpoint test diffs, and parsed code structures. The full surface is documented in Coding API. The headline method is build_handoff_context(next_checkpoint) — produces a Markdown block ready to paste into the next session's system prompt.
## Embedding interface

Every API that takes an embedder= parameter expects an object exposing two methods:

```
embedder.get_embedding(text: str) -> np.ndarray  # shape (dimension,), preferably L2-normalised
embedder.get_dimension() -> int                  # must match SemvecConfig.dimension
```
See the embedders guide for ready-made wrappers (SentenceTransformers, OpenAI, ONNX int8).
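For wiring tests without a real model, a toy embedder that satisfies the two-method contract is easy to write. `HashEmbedder` is invented for illustration: deterministic but semantically meaningless, so never use it for actual retrieval quality:

```python
import hashlib
import numpy as np

class HashEmbedder:
    """Toy embedder satisfying the two-method contract; deterministic
    but semantically meaningless. For wiring tests only."""

    def __init__(self, dimension: int = 64):
        self._dim = dimension

    def get_dimension(self) -> int:
        return self._dim

    def get_embedding(self, text: str) -> np.ndarray:
        # Seed an RNG from a stable hash of the text so the same text
        # always maps to the same vector.
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
        v = np.random.default_rng(seed).standard_normal(self._dim)
        return v / np.linalg.norm(v)  # L2-normalised, as recommended
```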
## Further reading
- Quickstart — three-line examples for every surface
- Embedders — pick the right model
- REST API — every endpoint
- Coding API — the full LiteralCache surface and the MCP tools